Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] DeerLoad import time Broken #14

Closed
laenan8466 opened this issue Aug 10, 2020 · 7 comments
Closed

[BUG] DeerLoad import time Broken #14

laenan8466 opened this issue Aug 10, 2020 · 7 comments
Labels
bug Something isn't working

Comments

@laenan8466
Copy link
Contributor

Describe the bug

DeerLoad import imports time trace as single arrays per timestep. This cannot be used in all further functions.

Environment

DeerLab Version: 0.10.0
Python Version: 3.7
OS: Windows

How to reproduce it?

Steps to reproduce the behavior:

  1. Change experimental dataset to one you know in 'file'
  2. Run script.

Expected behavior

Import as single numpy array. Further improvement would be using pandas for all data.

If I can spare the time I will look into the source lateron to propose a fix, but expect nothing.

Code Snippet

import numpy as np
import matplotlib.pyplot as plt
import deerlab as dl

file = r'dataset_401528.DSC'

[t_raw,V_raw,pars] = dl.deerload(file)

[V_real, V_imag, phase, imoffset] = dl.correctphase(V_raw,full_output=True)

tcorr = dl.correctzerotime(V_real,t_raw)

Files

I used the ring test data form Gunnar, should be same error for other experimental data.

@laenan8466 laenan8466 added the bug Something isn't working label Aug 10, 2020
@laenan8466
Copy link
Contributor Author

laenan8466 commented Aug 10, 2020

Hey,
I looked into some solutions within deerload.py. As the function is quite extensive, I would like to ask for your opinions, before fixing something.

Origin of bug

The problem occurs from the possibilities of having further axes and their dimension is then set to 1 (Line 55-68).

    # XPTS, YPTS, ZPTS specify the number of data points along x, y and z.
    if 'XPTS' in parDESC:
        nx = int(parDESC['XPTS'])
    else:
        raise ValueError('No XPTS in DSC file.')
    
    if 'YPTS' in parDESC:
        ny = int(parDESC['YPTS'])
    else:
        ny = 1
    if 'ZPTS' in parDESC:
        nz = int(parDESC['ZPTS'])
    else:
        nz = 1

Then further down (Line 155) the values are added, but bc. in Line 114 the abscissa was fixed to a width of 3 it is added as as row vectors with NaN values.

abscissa = np.full((maxlen,3),np.nan)
abscissa[:Dimensions[index],index] = np.linspace(minimum,minimum+width,npts)

Then NaN is removed, leaving nx row vectors.

# In case of column filled with NaN, erase the column in the array
abscissa = abscissa[:,~np.isnan(abscissa).all(axis=0)]

Thoughts

I have the BES3T Manual from Spring 2000 (Version 2.0). Is there a newer version, because I refer to this manual.

  • According to my knowledge the X,Y and Z axis can have a maximum dimensionality of 1.
  • Abscissa information is only with min/width/number of points in the DSC-file and has to be reconstructed from there
  • if no data is in X,Y,Z nTYP will get value NODATA.
  • For the user it would be beneficial to have one axis (X,Y and Z) in one array each. Output would then be an (numpy) array of shape:
    [[XAXIS],[YAXIS],[ZAXIS]]

Possible fixes

There are multiple ways how to fix this. What is your opinon on every of this points?

  • What should the shape of the abscissa be?
    A. As it is now, for every index a row vector
    B. One output array of shape [[XAXIS],[YAXIS],[ZAXIS]] with nAXIS = NaN if not specified in DSC.
    C. One output array of shape [[XAXIS],[YAXIS],[ZAXIS]] with nAXIS just not present if not specified in DSC.
    D. Separate arrays. This could screw up user scripts, as number of output arrays could be variable in this case
    E. Pandas datastructure

As I would love to see Pandas used in this project, I guess easiest solution would be C.

If I find the time between experiments, I will construct a solution with my favourit options, but just give me your opinion that if you have other thoughts I don't have to waste my time. 😉

@laenan8466
Copy link
Contributor Author

laenan8466 commented Aug 10, 2020

Ok, that was quicker than I thought. Please see here a quick fix. I changed nothing with your code and just added the variable abscissas and changed the output to this.

DISCLAIMER: This wasn't tested with other data than simple DEER data. Also some tests would have to be written.

If helpful I can open a pull request.

Additional error: Whereas DeerLab takes all units in microseconds, deerload will return in ns. I vote for using SI units (seconds) in all cases, but microseconds are fine as well, if consistent in all functions. Will fix this for deerload before giving a PR by checking type of axis and conversion, if you specifiy the desired unit.

@stestoll
Copy link
Collaborator

  • Most of deerlab is a transcription of EasySpin's eprload - see here. When fixing this, try to stay as close to eprload as possible.
  • I would prefer not to introduce a Pandas dependency. Let's keep dependencies as minimal as possible.
  • Any change should keep the [t_raw,V_raw,pars] = dl.deerload(file) syntax if possible.
  • The latest BES3T spec I have is v2.0 from Spring 2000.
  • Please add tests.
  • I agree it makes sense to have deerload return µs instead of ns for time axes. µs is an SI unit, s is too unwieldy.

@mtessmer
Copy link
Collaborator

mtessmer commented Aug 10, 2020

This actually may be easier to fix by using np.squeeze(). If I'm not mistaken V is also returned this way and np.squeeze will maintain any multidimensionality of the data if it exists. i.e. just change the return statement to the following:

return np.squeeze(abscissa), np.squeeze(data), parameters

@laenan8466
Copy link
Contributor Author

@mtessmer:
oh, that's a nice and neaty function I didn't knew it yet, thanks for showing. This does work when having just one axis, but when having more than one (e.g. with 2D data) this will not work and yield row arrays never the less. :-( What do you think?

@stestoll:
Yes tests would be so important... Will work on some the next days. Biggest question for me is what test dataset to use? Maybe I will use the easyspin files for a start, if no one has an idea.

Units: Yeah, microseconds are fine (and consistent with easyspin etc.). I personally prefer to use the base unit (s, T, Hz, ...) and for plots helper functions will correct for useful units.

Question to you: As it is deerload should it return an error if data doesn't include a time axis? Do you plan to port easyspin to python?

@mtessmer
Copy link
Collaborator

@laenan8466
Do you have an example file with 2D data? It may just require an array transpose.

@stestoll
Copy link
Collaborator

Regarding tests, just use one or two 1D DEER datasets, and also a 2D DEER dataset (with separately saved scans). I don't think the EasySpin test have these though.

I wouldn't error if there is not time axis, just issue a warning or so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants