-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expand icepyx to read s3 data #468
Conversation
Co-authored-by: Wei Ji <[email protected]>
Co-authored-by: Jessica Scheick <[email protected]>
Alright this PR is ready for re-review. Changes since the last review:
|
if self.is_s3 is True and len(self._filelist) > 2: | ||
warnings.warn( | ||
"Processing more than two s3 files can take a prohibitively long time. " | ||
"Approximate access time (using `.load()`) can exceed 6 minutes per data " | ||
"variable.", | ||
stacklevel=2, | ||
) | ||
_confirm_proceed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rwegener2 I wanted to trigger the _confirm_proceed()
so I passed in a list of s3 urls... I got the confirmation, but the warning never showed (and the confirmation is confusing without it). Wondering if it makes sense to kick the warning into the function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, interesting. The warning appeared when I was running it. That was a few commits back, though, so 🤷♀️
I think it's fine to add the warning to the _confirm_proceed()
function. My guess is that either way we'll need to increase the stacklevel
to 3 or 4 if the warning isn't showing up at level 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to recall I was messing with the warnings settings in my hub for a presentation recently... since this is an important one, I added a cell to turn warnings on (changing the stack level didn't help in my case). Let me know if you think we should take a different approach. (EDIT: I left the warning where it was since the message depends on what triggers the confirmation.) Otherwise, this should be good to go.
Co-authored-by: Jessica Scheick <[email protected]>
This reverts commit 690ef17.
Goal
NSIDC has put IS-2 data in the cloud, so we would like icepyx to be able to access that data for users. ☁️💙
How it was done
The
EarthdataAuth
class was added as a mixin to theRead
class. The product and version are read from the file the same way they are read in using a local file. The python standard glob module does not accept s3 paths, so s3 paths are not glob-able (i.e. the user must always give the full exact filepath). There is an s3 version of glob, so this could be implemented in the future. Mixed lists of s3 paths and local paths are not allowed. If a user tries to load more than two files or three data variables at the same time they are given a warning that discourages continuing and requires user confirmation to proceed.Very loose timing suggests that it takes about 5 minutes per data variable per file to load s3 data (testing on the
dem_h
variable of anATL09
file). Running two files took 11 minutes.How to test it