Event headers table should be a typed object #51

Open
Vindaar opened this issue Jun 28, 2022 · 1 comment
Comments

@Vindaar
Owner

Vindaar commented Jun 28, 2022

Currently the event header table (and, to a lesser extent, the run header table), which still originates from the Virtex TOS ASCII data days, is kept as a Table[string, string] after parsing and until it is written to the H5 file in raw_data_manipulation.

This is both error prone (in the sense of typos etc.) and very inefficient, as the majority of fields (but not all, e.g. "DateTime") are actually integers.

In addition, for other data sources such as Timepix3, the data doesn't even arrive in string format, so there we do an especially unnecessary back-and-forth conversion through string land.

Instead we should:

  • either define a typed object EventHeader
  • or have a table of something like Table[string, EventHeaderField]

with theoretical implementation ideas like:

type
  EventHeader = object
    dateTime: DateTime
    timestamp: int
    case timepix: TimepixVersion # or TOS version field?
    of Timepix1:
      useHvFadc: bool
    ... 
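
A minimal, compilable sketch of the first idea, to make the shape concrete. The TimepixVersion enum, the empty Timepix3 branch and the initTpx1Header helper are assumptions made for illustration; the real set of fields would come from the existing header table.

```nim
import std/times

type
  TimepixVersion = enum        # assumed enum, not defined in the issue
    Timepix1, Timepix3

  EventHeader = object
    dateTime: DateTime
    timestamp: int
    case timepix: TimepixVersion
    of Timepix1:
      useHvFadc: bool          # only exists for Timepix1 / TOS data
    of Timepix3:
      discard                  # no detector-specific fields in this sketch

proc initTpx1Header(ts: int, useHvFadc: bool): EventHeader =
  ## Hypothetical helper: builds a Timepix1 header from a Unix timestamp.
  EventHeader(timepix: Timepix1,
              dateTime: fromUnix(ts.int64).utc,
              timestamp: ts,
              useHvFadc: useHvFadc)

let h = initTpx1Header(1656374400, true)
echo h.timepix, " ", h.timestamp, " ", h.useHvFadc
```

A typo in a field name (say h.useHvFdac) becomes a compile-time error here, instead of a missing key that only shows up at run time.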

or alternatively:

type
  EventHeaderFieldKind = enum
    ekString, ekBool, ekFloat, ekDateTime
  EventHeaderField = object
    case kind: EventHeaderFieldKind
    of ekString: s: string
    of ekBool: b: bool
    ...

etc.
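
A rough sketch of how the second idea could look when the conversion from string happens exactly once, at parse time. The ekInt kind is an addition (most fields are integers), ekDateTime is left out for brevity, and the field names, the "1" encoding for booleans and the toField helper are assumptions for illustration only.

```nim
import std/[tables, strutils]

type
  EventHeaderFieldKind = enum
    ekString, ekBool, ekInt, ekFloat   # ekInt is assumed, see note above
  EventHeaderField = object
    case kind: EventHeaderFieldKind
    of ekString: s: string
    of ekBool: b: bool
    of ekInt: i: int
    of ekFloat: f: float

proc toField(raw: string, kind: EventHeaderFieldKind): EventHeaderField =
  ## Hypothetical helper: converts one raw header string into a typed field.
  case kind
  of ekString: EventHeaderField(kind: ekString, s: raw)
  of ekBool: EventHeaderField(kind: ekBool, b: raw.strip == "1")
  of ekInt: EventHeaderField(kind: ekInt, i: parseInt(raw.strip))
  of ekFloat: EventHeaderField(kind: ekFloat, f: parseFloat(raw.strip))

var header = initTable[string, EventHeaderField]()
header["eventNumber"] = toField("42", ekInt)      # illustrative field name
header["useHvFadc"] = toField("1", ekBool)
header["dateTime"] = toField("2022-06-28", ekString)
```

A data source that already delivers typed values (e.g. Timepix3) would construct EventHeaderField directly and never go through toField.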

I'd do it now, but it touches too many procedures etc. to quickly make the change.

This would also allow us to get rid of fully optional fields like useHvFadc for detectors / data sources where they make no sense.

The first solution has the advantage of being fully typed, while the second still allows for trivial extensions (i.e. adding a new scintillator field etc.).

The good thing about the second one is that at that point the code simplifies significantly as we have all type information ready in the table. So we don't have to hardcode the fields that are e.g. int or string anymore and can just check the kind of the field and generate a correct resulting H5Dataset based on that. Maybe that's the best of both worlds.
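
To illustrate the "just check the kind" part, here is a small sketch that derives the output type of each field from the table itself. dsetTypeName and datasetLayout are hypothetical stand-ins for the actual H5 dataset creation, which is not shown; ekInt is again an assumed kind.

```nim
import std/tables

type
  EventHeaderFieldKind = enum
    ekString, ekBool, ekInt, ekFloat
  EventHeaderField = object
    case kind: EventHeaderFieldKind
    of ekString: s: string
    of ekBool: b: bool
    of ekInt: i: int
    of ekFloat: f: float

proc dsetTypeName(field: EventHeaderField): string =
  ## Hypothetical: names the dataset type a field of this kind would get.
  case field.kind
  of ekString: "string"
  of ekBool: "bool"
  of ekInt: "int"
  of ekFloat: "float"

proc datasetLayout(header: Table[string, EventHeaderField]): Table[string, string] =
  ## Maps every header field to its dataset type, with no hardcoded name lists.
  result = initTable[string, string]()
  for name, field in header:
    result[name] = dsetTypeName(field)

var header = initTable[string, EventHeaderField]()
header["eventNumber"] = EventHeaderField(kind: ekInt, i: 42)
header["dateTime"] = EventHeaderField(kind: ekString, s: "2022-06-28")
let layout = datasetLayout(header)
```

Adding a new field (e.g. another scintillator trigger) then only means inserting it into the table; the writing code does not need to change.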

@Vindaar
Owner Author

Vindaar commented Jun 28, 2022

In addition, the timestamp information for Timepix3 data is of course much more precise than one-second resolution, so the current implementation even throws away that precision for Tpx3. With a variant object we could in theory even have different timestamp types for Timepix1 and Timepix3 (but personally I'd probably just switch over from int to float for all of them).
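
As a small illustration of the precision point, using only std/times: an integer Unix timestamp drops the sub-second part, while a float keeps it. The concrete timestamp value is made up for the example.

```nim
import std/times

# Made-up example time: whole seconds plus 123456789 nanoseconds.
let t = initTime(1656374400, 123_456_789)

let asInt = t.toUnix          # int64, whole seconds only
let asFloat = t.toUnixFloat   # float, keeps the fractional part

echo asInt
echo asFloat
```

One caveat if float is used for everything: a 64-bit float holding an absolute Unix time of this magnitude resolves steps of roughly a quarter of a microsecond, so it is finer than one second but not nanosecond-exact.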
