-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test fsspec roundtrip #42
Merged
Merged
Changes from all commits
Commits
Show all changes
22 commits
Select commit
Hold shift + click to select a range
c1c3919
move kerchunk backend imports to be specific to each backend filetype
TomNicholas a26902d
test roundtrip to json file then reading using fsspec
TomNicholas 305b850
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] db49d03
add test env dependencies
TomNicholas 97e7806
more test env deps
TomNicholas 0963fd3
more
TomNicholas 1be6e61
add pip install of xarray PR
TomNicholas 43f5870
correct pip url
TomNicholas d086127
roundtrip test involving concatenation
TomNicholas 91776a9
Merge branch 'main' into test_fsspec_roundtrip
TomNicholas 3f75850
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] f2aad9f
Merge branch 'main' into test_fsspec_roundtrip
TomNicholas ad94bf4
Merge branch 'test_fsspec_roundtrip' of https://github.com/TomNichola…
TomNicholas aaa371c
Merge branch 'main' into test_fsspec_roundtrip
TomNicholas 41f0a2b
Merge branch 'main' into test_fsspec_roundtrip
TomNicholas c4127c5
Merge branch 'main' into test_fsspec_roundtrip
TomNicholas c2a20ce
Merge branch 'test_fsspec_roundtrip' of https://github.com/TomNichola…
TomNicholas 0460888
Merge branch 'main' into test_fsspec_roundtrip
TomNicholas 6e8c372
remove duplication of pooch
TomNicholas 89a7018
correct formatting
TomNicholas e527f71
Merge branch 'main' into test_fsspec_roundtrip
TomNicholas 5dbdfcd
try removing netcdf4-python from the environment
TomNicholas File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the hard case, the two time variables will be encoded differently, and the concat makes no sense any more.
IIRC kerchunk supports this by decoding time and inlining bytes in the JSON. I think you'll want to at least raise an error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why will they be encoded differently?
In what case should I be raising an error? When the encoding attributes are different?
(Sorry I'm not very familiar with the encoding step in general)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The encoding chooses units separately for each dataset since they are written separately.
This happens in the wild, and you'll want to catch it.
I think so. How do you handle someone trying to concat different files with different encodings? Are the encodings simply discarded and you use those from the first file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay thanks.
At the moment I'm not doing anything with encodings explicitly, and my tests have been testing with
ManifestArrays
created in-memory. So I'm not actually deliberately using them anywhere...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR #69 added the ability to load selected variables in as normal lazy numpy arrays, so they will get decoded by xarray automatically.
EDIT: But we can't use it to make this roundtripping test work until I implement writing those numpy arrays out "inlined" into the kerchunk reference files, see #62.
I also opened issue #68 to discuss the encoding problem more generally.