Add LGDO format conversion utilities #4
Comments
I propose then to deprecate store.read_object("obj", "file.lh5").convert(fmt="numpy.ndarray") Which would return the same. This new |
I am confused about how the return type annotation would work in this case. Can you have a single function with multiple types of output depending on the input parameters? |
Yes, it would look like this:
    def convert(...) -> pandas.DataFrame | numpy.ndarray | ...:
        pass
|
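If the return type should follow the value of `fmt` rather than be a plain union, one option is `typing.overload` with `Literal` strings. The following is only a sketch under assumptions: `TypedSketch` is a hypothetical stand-in class, not an actual LGDO type, and the `fmt` strings are taken from the proposal in this issue.

```python
from __future__ import annotations

from typing import Literal, overload

import numpy as np
import pandas as pd


class TypedSketch:
    """Hypothetical stand-in for an LGDO array type (not the real class)."""

    def __init__(self, nda: np.ndarray) -> None:
        self.nda = nda  # assumed to be one-dimensional in this sketch

    # The overloads only inform static type checkers; at runtime the
    # last definition of convert() is the one that gets called.
    @overload
    def convert(self, fmt: Literal["numpy.ndarray"]) -> np.ndarray: ...
    @overload
    def convert(self, fmt: Literal["pandas.DataFrame"]) -> pd.DataFrame: ...

    def convert(self, fmt: str) -> np.ndarray | pd.DataFrame:
        if fmt == "numpy.ndarray":
            return self.nda
        if fmt == "pandas.DataFrame":
            return pd.DataFrame({"values": self.nda})
        raise ValueError(f"unsupported format: {fmt}")
```

With this, a type checker can infer `np.ndarray` for `obj.convert("numpy.ndarray")` and `pd.DataFrame` for `obj.convert("pandas.DataFrame")`, while the implementation itself keeps the union annotation.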
Over the last few days, I have been playing around with implementing this feature. For the most part, it is straightforward, although a few questions arose:
VectorOfVectors:
Struct/Table:
WaveformTable/encoded data:
ToDos:
|
I think it would be easier to see in code. I will prepare a PR with the status as it is at the moment. |
…utilities

The idea is to add a `convert` function to each LGDO datatype that converts the underlying data to a third-party datatype. These are `pandas.DataFrame`, `numpy.ndarray` and `awkward.Array`. Additionally, you have the option to control whether `convert` copies data or not.

At the moment, these issues are still open:
- [ ] How to use `to_aoesa` to convert VectorOfVectors to `numpy.ndarray`?
- [ ] How to implement the conversion of structures/tables to `numpy.ndarray`?
- [ ] How to implement the `convert` function for WaveformTable and encoded data?
- [ ] Find out how to implement units with pint. Is it possible for awkward arrays?
- [ ] Write many, many tests.
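For the open VectorOfVectors question, one conceivable route to a rectangular `numpy.ndarray` is to pad every vector with NaN up to the longest length. The sketch below is not a description of what `to_aoesa` does. `pad_ragged` is a hypothetical helper, and it assumes the ragged data is available as a flat array plus an array of cumulative lengths.

```python
import numpy as np


def pad_ragged(flattened_data, cumulative_length):
    """Hypothetical helper: pad ragged vectors into a 2D float array.

    Assumes row i occupies flattened_data[cumulative_length[i-1]:cumulative_length[i]]
    (with an implicit 0 before the first row). Missing entries become NaN.
    """
    flat = np.asarray(flattened_data)
    ends = np.asarray(cumulative_length, dtype=np.int64)
    lengths = np.diff(ends, prepend=0)  # number of elements per row
    n_cols = int(lengths.max()) if lengths.size else 0

    out = np.full((lengths.size, n_cols), np.nan)
    # mask[i, j] is True where row i actually has a j-th element
    mask = np.arange(n_cols)[None, :] < lengths[:, None]
    # Boolean assignment fills row by row, matching the flat layout
    out[mask] = flat[: lengths.sum()]
    return out


padded = pad_ragged([1, 2, 3, 4, 5, 6], [2, 3, 6])
```

Padding always copies, and NaN padding forces a floating-point result, so this kind of conversion would not fit a zero-copy mode.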
We should implement a method for each LGDO to convert underlying data to third-party formats like NumPy, Pandas, AwkwardArray. I'm thinking about something like:
Where `fmt` could take `pandas.DataFrame`, `numpy.ndarray`, `awkward.Array`. This way, we would store the conversion code along with the LGDO implementation and make it easier to jump between data representations (like in `load_nda()`, `load_pd()`, `build_tcm()`, the `DataLoader`, etc).

We need of course to make a distinction between copy and zero-copy conversions.
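To illustrate the copy versus zero-copy distinction, here is a minimal sketch for the NumPy case only. `ArraySketch` and the `copy` keyword are assumptions made for illustration and are not the actual LGDO API.

```python
import numpy as np


class ArraySketch:
    """Hypothetical stand-in for an LGDO array (not the real class)."""

    def __init__(self, nda):
        self.nda = np.asarray(nda)

    def convert(self, fmt="numpy.ndarray", copy=False):
        if fmt == "numpy.ndarray":
            # Zero-copy hands out the underlying buffer itself;
            # copy=True returns an independent array.
            return self.nda.copy() if copy else self.nda
        raise NotImplementedError(f"conversion to {fmt!r} is not sketched here")


obj = ArraySketch(np.arange(5))
view = obj.convert(fmt="numpy.ndarray")            # shares memory with obj.nda
dup = obj.convert(fmt="numpy.ndarray", copy=True)  # independent copy
```

In the zero-copy case, writing to `view` also changes the data held by `obj`, which is why the two modes would need to be clearly distinguished in the interface.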