Write to parquet #110
1: "foo.nc",
},
"size": {0: 100, 1: 100},
"raw": {0: None, 1: None},
What does "raw" mean in this specification?
I think it means "inlined". In the kerchunk library tests, this ref: "a/6": b"data" results in a non-null "raw".
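To make the distinction concrete, here is a rough sketch of how references might map onto the columns shown in the diff above. `refs_to_columns` is a hypothetical helper, not part of kerchunk or this PR, and the values placed in "path", "offset" and "size" for an inlined row are assumptions for illustration only.

```python
def refs_to_columns(refs):
    """Hypothetical helper: turn kerchunk-style refs into column dicts.

    Each column maps a row index to a value, mirroring the layout of the
    dict in the test. Not part of kerchunk or of this PR.
    """
    columns = {"path": {}, "offset": {}, "size": {}, "raw": {}}
    for row, key in enumerate(sorted(refs)):
        ref = refs[key]
        if isinstance(ref, bytes):
            # Inlined reference: the chunk bytes themselves go in "raw".
            # What the other columns hold here is an assumption.
            columns["path"][row] = None
            columns["offset"][row] = 0
            columns["size"][row] = len(ref)
            columns["raw"][row] = ref
        else:
            # Regular reference: [path, offset, size] pointing into a file,
            # so "raw" stays null.
            path, offset, size = ref
            columns["path"][row] = path
            columns["offset"][row] = offset
            columns["size"][row] = size
            columns["raw"][row] = None
    return columns


example_refs = {
    "a/0": ["foo.nc", 0, 100],
    "a/1": ["foo.nc", 100, 100],
    "a/6": b"data",  # inlined, as in the kerchunk test mentioned above
}
example_columns = refs_to_columns(example_refs)
```

Only the "a/6" row ends up with a non-null "raw"; the two rows pointing into foo.nc keep `None` there, which matches the `"raw": {0: None, 1: None}` in the test.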
Thank you @jsignell ! That was quick 😁
I'm pretty sure this is not what you had in mind, but wanted to put this up as at least an option. Even if we scrap the implementation, the test should stay the same.
Agree that's useful already! I think this implementation is useful to start with, and tests ensure we can change it later without breaking behaviour.
I was wondering if there is a more performant way to go from ManifestArray to parquet, but that implementation will depend on how the references are stored in-memory, so should wait until #107 is merged.
I didn't include remote writing since it seemed like that would be something that we'd want to implement for all the file outputs not just for one.
👍
I just stole the easiest test off that PR for roundtripping. See what you think
This looks good to me! Let's get it in and iterate if necessary! Thanks @jsignell !
Closes #72