xarray backend not reading in group data #25

Closed
JessicaS11 opened this issue Jan 11, 2024 · 2 comments

@JessicaS11
Collaborator

@rwegener2 @jpswinski Given the xarray backend is hot off the press, I wondered if either of you had test driven it yet. I tried to take it for a spin (using the version currently on main):

import xarray as xr

path_to_hdf5_file = 'nsidc-cumulus-prod-protected/ATLAS/ATL03/006/2019/11/30/ATL03_20191130112041_09860505_006_01.h5'

# reg is an authenticated icepyx query object; its S3 login credentials were obtained via earthaccess
credentials = {
    "aws_access_key_id": reg.s3login_credentials["accessKeyId"],
    "aws_secret_access_key": reg.s3login_credentials["secretAccessKey"],
    "aws_session_token": reg.s3login_credentials["sessionToken"],
}

h5ds = xr.open_dataset(
    path_to_hdf5_file,
    group='/gt2l/heights/h_ph',
    engine="h5coro",
    backend_kwargs={"creds": credentials},
)

Unfortunately, I just got an empty dataset.

Running the same info directly through an h5coro object returns the expected dictionary ({group: values}), leading me to guess the issue is somewhere in the backend piece specifically.

@jpswinski
Member

@JessicaS11 There is definitely a lot of testing that is needed, but the code in main has been giving us pretty reliable reads using the xarray backend for the few test datasets we have been using.

Looking at your script, I think the issue might be that the group is set to a variable. We still have to add the code that allows a specific variable to be read (coming soon!). So if you change it to:

group='/gt2l/heights',

it should work for you. The problem will be that it will read all of the data in all of the variables in the group... which takes a little while. But hopefully as we add the ability to specify a list of variables to read, and also slicing, and also lazy loading - things should get better.
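
Until variable-level reads land, the workaround above can be sketched roughly as follows: split the variable path into its parent group and variable name, open the group, then select the variable from the resulting dataset. The helpers `split_variable_path` and `open_variable` are hypothetical names used only for illustration, and the sketch assumes the backend exposes each variable under its bare name (for example `h_ph`). It still reads every variable in the group, so it does not avoid the slow read.

```python
import posixpath


def split_variable_path(variable_path):
    """Split '/gt2l/heights/h_ph' into its parent group and variable name."""
    group, name = posixpath.split(variable_path.rstrip("/"))
    return group, name


def open_variable(path, variable_path, credentials):
    """Open the parent group with the h5coro engine, then pick one variable.

    Hypothetical helper: assumes the returned dataset keys each variable
    by its bare name (e.g. "h_ph"), which is not confirmed in this thread.
    """
    # Imported lazily so the path helper above has no third-party dependency.
    import xarray as xr

    group, name = split_variable_path(variable_path)
    ds = xr.open_dataset(
        path,
        group=group,  # a group, not a variable, per the advice above
        engine="h5coro",
        backend_kwargs={"creds": credentials},
    )
    return ds[name]
```

With the file and credentials from the original report, `open_variable(path_to_hdf5_file, '/gt2l/heights/h_ph', credentials)` would open `/gt2l/heights` and return the `h_ph` variable if it is present under that name.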

@JessicaS11
Collaborator Author

Thanks @jpswinski! That did the trick (specifying a group instead of a variable).
