Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scale invariability of pathway amplitudes and redundancy with V0 #76

Closed
luisfabib opened this issue Feb 8, 2021 · 7 comments · Fixed by #108
Closed

Scale invariability of pathway amplitudes and redundancy with V0 #76

luisfabib opened this issue Feb 8, 2021 · 7 comments · Fixed by #108
Labels
fundamental An issue with fundamental concepts or problems
Milestone

Comments

@luisfabib
Copy link
Collaborator

In the design of DeerLab seems to be some redundancy and invariability issues related to how the dipolar pathway amplitudes are defined and handled.

Scale invariability of dipolar pathways

In experiment models (except ex_4pdeer) where the unmodulated contribution is parametrized (Lam0), the absolute values of the pathway amplitudes result in a scale-invariant response in DeerLab. For example the simulations for V1 and V2 here

    V1= dl.dipolarkernel(t,r,dl.ex_5pdeer([0.8, 0.4, 0.2, 4]))@P
    V2= dl.dipolarkernel(t,r,dl.ex_5pdeer([0.4, 0.2, 0.1, 4]))@P
    np.array_equal(V1,V2)
    >> True

will result in exactly the same signal. The reason behind this is the default behavior of dipolarkernel, which normalizes the dipolar kernel after calculations (controlled by the option renormalize). BTW there is an additional option called renormpaths which has no effect on the output and will be deprecated in a future PR).

As expected by setting renormalize=False the scale-invariability of the dipolar pathways is lifted

    V1= dl.dipolarkernel(t,r,dl.ex_5pdeer([0.8, 0.4, 0.2, 4]),renormalize=False)@P
    V2= dl.dipolarkernel(t,r,dl.ex_5pdeer([0.4, 0.2, 0.1, 4]),renormalize=False)@P
    np.array_equal(V1,V2)
    >> False

Redundancy with V0

Setting renormalize=False reveals a second issue: the absolute values of the pathway amplitudes and the dipolar signals scale 'V0' become redundant. For example the simulations for V1 and V2 here

    V0 = 1
    V1= V0*dl.dipolarkernel(t,r,dl.ex_5pdeer([0.8, 0.4, 0.2, 4]),renormalize=False)@P
    V0 = 2
    V2= V0*dl.dipolarkernel(t,r,dl.ex_5pdeer([0.4, 0.2, 0.1, 4]),renormalize=False)@P
    np.array_equal(V1,V2)
    >> True

the signals are again equal. Only by reinstating the kernel normalization can these be distinguished again.

Summary

  • Normalization of the dipolar kernel (i.e. enforcing K(0,r)=1) is required for V0 in the separation V(t)=V0*Vintra(t)*Vinter(t) to be distinguishable.
  • Under normalization, pathway amplitudes in models of the form V0*(Lam0 + sum_i lam_i*Vintra_i) exhibit scale invariability.

The question now is: what is the optimal and correct way to treat this?
On one hand, pathway amplitudes are probabilistic quantities so it would make sense to think of them as being confined to a range [0,1]. On the other hand, we could imagine them relative contributions to V0, i.e. such that the sum over all dipolar pathways equals V0. This second interpretation might be more efficient due to the fact that the ex_4pdeer does not suffer from the issues described here, as the contributions are also define in a relative way.

This requires further discussion.

Here is a full working example with a graphical representation of these issues:

import deerlab as dl
import numpy as np 
import matplotlib.pyplot as plt 

t = np.linspace(-0.1,7,200)
r = np.linspace(2,6,200)
P = dl.dd_gauss(r,[3,0.2])

def V5pdeer_default(lam):
    K = dl.dipolarkernel(t,r,dl.ex_5pdeer([lam[0],lam[1],lam[2],4.23]),lambda lam:dl.bg_hom3d(t,50,lam),renormpaths=False)
    return K@P

def V5pdeer(lam):
    K = dl.dipolarkernel(t,r,dl.ex_5pdeer([lam[0],lam[1],lam[2],4.23]),lambda lam:dl.bg_hom3d(t,50,lam),renormalize=False,renormpaths=False)
    return K@P

plt.figure(figsize=[7,6])
plt.subplot(221)
plt.plot(t,V5pdeer_default([0.8,0.6,0.4]),'-',linewidth=2)
plt.plot(t,V5pdeer_default([0.4,0.3,0.2]),'--',linewidth=2)
plt.xlabel('t [us]')
plt.ylabel('V(t)')
plt.title('$V_0$=1, Kernel normalized')
plt.legend(['[$\Lambda_0$,$\lambda_1$,$\lambda_2$]','$\\frac{1}{2}$[$\Lambda_0$,$\lambda_1$,$\lambda_2$]'])

plt.subplot(222)
plt.plot(t,V5pdeer([0.8,0.6,0.4]),'-',linewidth=2)
plt.plot(t,V5pdeer([0.4,0.3,0.2]),'--',linewidth=2)
plt.xlabel('t [us]')
plt.ylabel('V(t)')
plt.title('$V_0$=1, Kernel not normalized')


plt.subplot(223)
V0 = 2
plt.plot(t,V5pdeer_default([0.8,0.6,0.4]),'-',linewidth=2)
plt.plot(t,V0*V5pdeer_default([0.4,0.3,0.2]),'--',linewidth=2)
plt.xlabel('t [us]')
plt.ylabel('V(t)')
plt.title('$V_0$=2, Kernel normalized')

plt.subplot(224)
V0 = 2
plt.plot(t,V5pdeer([0.8,0.6,0.4]),'-',linewidth=2)
plt.plot(t,V0*V5pdeer([0.4,0.3,0.2]),'--',linewidth=2)
plt.xlabel('t [us]')
plt.ylabel('V(t)')
plt.title('$V_0$=2, Kernel not normalized')

plt.tight_layout()
plt.show()

issue

@luisfabib luisfabib added the fundamental An issue with fundamental concepts or problems label Feb 8, 2021
@luisfabib luisfabib added this to the 0.13.0 milestone Feb 8, 2021
@luisfabib luisfabib modified the milestones: 0.13.0, 0.14.0 Mar 12, 2021
@stestoll
Copy link
Collaborator

We should take V0 to indicate the echo amplitude in the absence of pump pulses (or more generally in the absence of dipolar modulation ), and Vintra and Vinter to indicate the dipolar modulation functions.

@stestoll stestoll modified the milestones: 0.14.0, 0.13.0 Mar 17, 2021
@luisfabib
Copy link
Collaborator Author

I agree with that, of course. However, that is not really the issue.

Reformulating my OP in other words: the V0 and Lam0,lam1,... parameters are structurally non-identifiable. There is no numerical way to solve this redundancy in our current formulation of the problem.

Here is proof of structural non-identifiability with a working example:

import numpy as np 
import deerlab as dl 
import matplotlib.pyplot as plt 

t = np.linspace(-0.5,5,200)
r = np.linspace(2,6,400)
P = dl.dd_gauss(r,[3,0.2])
K = dl.dipolarkernel(t,r)
S = K@P 

# Parameters
V0 = 2
Lam0 = 0.6
lam1 = 0.4 

# Model
Vexp = V0*(Lam0 + lam1*S)

# Map the objective function
Lam0_vec = np.linspace(0,5,200)
V0_vec = np.linspace(0,5,250)
obj = np.zeros((len(Lam0_vec),len(V0_vec)))
for i,Lam0_ in enumerate(Lam0_vec):
    for j,V0_ in enumerate(V0_vec):
        V = V0_*(Lam0_ + lam1*S)
        obj[i,j] = np.linalg.norm(V - Vexp)**2

# Plot
plt.figure(figsize=[4,3])
cont = plt.contourf(V0_vec,Lam0_vec,np.log10(obj),29,vmin=np.min(obj),vmax=obj.min() + 2.5)
for i in range(9):
    cont.collections[-i-1].remove()
plt.xlabel(r'$\Lambda_0$')
plt.ylabel(r'$V_0$')
plt.plot(Lam0,V0,'or')
clb = plt.colorbar()
clb.ax.set_ylabel('log$_{10}$(Obj.Fcn.)')

issue76

where you can see that the objective function surface for both Lam0 and V0 shows an infinite number of solutions.

@luisfabib
Copy link
Collaborator Author

After some discussion and further digging this issue seems to start becoming clearer. There seems to be something fishy going on with all the normalization formalities in DeerLab.

Case 1: Background and pathway amplitudes are normalized

This is the (currently implemented) case where line 209 in dipolakernel.py is set to:

B = dipolarbackground(t,pathways,B,renormalize=True,renormpaths=True)

This is the case presented in the OP above. As shown above, executing the script leads to the following figure, where the redundancy problems between V0 and Lam0 are visible.

Case 2: Background and pathway amplitudes are not normalized

This is the case where line 209 in dipolakernel.py would be set to:

B = dipolarbackground(t,pathways,B,renormalize=False,renormpaths=False)

As seen from the figure obtained executing the same script again, the redundancy seems to lift, thanks to the difference in background induced by different pathway amplitudes.

Conclusion

The origin of all this redundancy might indeed be all the (unnecessary) normalization steps in the different functions. In view of this, we should proceed to remove all these normalization segments to ensure this problem does not persist.
Also, getting rid of all these (rather) confusing options (renormalize,renormpaths,etc.) will make everything clearer.

@stestoll
Copy link
Collaborator

Indeed, the renomalize option in dipolarkernel is not consistent with the notion that the kernel should represent the dipolar modulation function Vintra. For a multi-pathway signal, it is natural to have K(t=0,r) different from 1 - meaning that there the echo at t==0 is suppressed by dipolar modulations.

So, I think the renormalize option in dipolarkernel should be removed, because it is non-physical in the context of our V0*Vintra*Vinter model.

@stestoll
Copy link
Collaborator

It looks like #99 solves some aspects of this problem.

@luisfabib
Copy link
Collaborator Author

The issues are not closed, however. After the recent changes, when trying to fit a simple 5-pulse DEER signal via the following script

import numpy as np
import matplotlib.pyplot as plt
import deerlab as dl

# %%
# Generate data
t = np.linspace(-0.1,6.5,200)    # time axis, µs
r = np.linspace(1.5,6,100)          # distance axis, nm
param0 = [3, 0.1, 0.2, 3.5, 0.1, 0.65, 3.8, 0.05, 0.15] # parameters for three-Gaussian model
P = dl.dd_gauss3(r,param0)         # model distance distribution
B = lambda t,lam: dl.bg_hom3d(t,300,lam) # background decay
exparam = [0.6, 0.3, 0.1, 3.2]     # parameters for 5pDEER experiment
pathways = dl.ex_5pdeer(exparam)   # pathways information

K = dl.dipolarkernel(t,r,pathways,B)
Vexp = K@P + dl.whitegaussnoise(t,0.005,seed=1)


ex_lb   = [ 0,   0,   0,  max(t)/2-1] # lower bounds
ex_ub   = [1, 1, 1, max(t)/2+1] # upper bounds
ex_par0 = [0.5, 0.5, 0.5, max(t)/2  ] # start values

fit = dl.fitmodel(Vexp,t,r,'P',dl.bg_hom3d,dl.ex_5pdeer,ex_par0=ex_par0,ex_lb=ex_lb,ex_ub=ex_ub,verbose=True)
fit.plot()

these are the results we obtain

----------------------------------------------------------------------------
Goodness of fit
  Vexp[0]: 𝛘2 = 21.170621  RMSD  = 9.302341e-02
----------------------------------------------------------------------------
Fitted parameters and 95%-confidence intervals
Vfit[0]:
  V0:  0.65  Signal scale (a.u.)
  bgparam[0]:   192.6661475  (40.2186352, 345.1136598)  Concentration of pumped spins (μM)
  exparam[0]:   0.9255844  (0.1946988, 1.0000000)  Amplitude of unmodulated components ()
  exparam[1]:   0.4641433  (0.0974610, 0.8308255)  Amplitude of 1st modulated pathway ()
  exparam[2]:   0.1558936  (0.0324554, 0.2793317)  Amplitude of 2nd modulated pathway ()
  exparam[3]:   3.2109122  (3.2035322, 3.2182923)  Refocusing time of 2nd modulated pathway (μs)
----------------------------------------------------------------------------

So this issue requires further inspection to assess the consequences of our recent changes.

@stestoll
Copy link
Collaborator

The absolute amplitudes fit.V0*fit.exparam[0:3] are close to their ground-truth values, so the fit itself appears to have been successful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fundamental An issue with fundamental concepts or problems
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants