Variables as an independent class #451

Merged
merged 65 commits into from
Nov 7, 2023

rwegener2 (Contributor) commented Oct 3, 2023

Goal

The ability to list the available variables and their h5 paths is a unique and powerful feature of icepyx. This PR aims to make that first-order functionality, instead of functionality that is only used internally. I think this will be a useful user feature, and I think it makes sense to make this change before adding the s3 url capability.

How it was done

We anticipate that the most common use case is reading variables from a file. Alternatively, all you need to produce an .avail() list of variables is a product and, optionally, a version. This information is used to query an API, which returns the variables.

When using Variables(), the user is required to supply EITHER:

  1. a local filepath (variables are read directly from the file), or
  2. the product

The version argument is optional. If it is not supplied, the latest version is assumed. If both path and product are given, an error is raised.
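
The argument rule above can be sketched as a small standalone function. This is an illustration only, not the icepyx implementation: the function name `resolve_source`, the `TypeError` choice, and the hard-coded `ASSUMED_LATEST_VERSION` are assumptions (per the description, the latest version is obtained from an API rather than a constant).

```python
from typing import Optional, Tuple

# Assumed placeholder: the real code is described as looking up the latest
# version remotely; a constant keeps this sketch self-contained.
ASSUMED_LATEST_VERSION = "006"


def resolve_source(
    path: Optional[str] = None,
    product: Optional[str] = None,
    version: Optional[str] = None,
) -> Tuple[str, str, Optional[str]]:
    """Return (mode, identifier, version) for the two supported entry points."""
    if path and product:
        # Mirrors the rule that supplying both is an error.
        raise TypeError("Supply either a path or a product, not both.")
    if not path and not product:
        raise TypeError("Supply either a path or a product.")
    if path:
        # Variables would be read directly from the file. How a version
        # argument interacts with a file is not specified in the text,
        # so this sketch simply passes it through.
        return ("file", path, version)
    # Product given: fall back to the latest version when none is supplied.
    return ("product", product, version or ASSUMED_LATEST_VERSION)
```

Keeping the check in one place means both failure cases (both arguments, or neither) surface before any file or network access is attempted.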

How it can be tested (updated)

from icepyx.core.variables import Variables

# initializing with a filepath
v = Variables('data/ATL06/processed_ATL06_20190226005526_09100205_006_02.h5')
v.avail()

# initializing with a product only
v = Variables(product='ATL03')
v.avail()
print(v.version)  # returns '006'

# initializing with a product and version
v = Variables(product='ATL03', version='004')
v.avail()
print(v.version)  # returns '004'

And testing that Read and Query still work:

from icepyx.core.read import Read
ds = Read('data/ATL06/processed_ATL06_20190226005526_09100205_006_02.h5')
ds.vars.append(beam_list=['gt1l'], var_list=['h_li', "latitude", "longitude"])
ds.load()

from icepyx.core.query import Query
from icepyx.core.read import Read

# Create Query object
short_name = 'ATL06'
spatial_extent = [-55, 68, -48, 71]
date_range = ['2019-02-20','2019-02-28']
region = Query(short_name, spatial_extent, date_range)

# order and download granules
region.order_vars.append(beam_list=['gt1l'], var_list=['h_li', "latitude", "longitude"])
region.order_granules(Coverage=region.order_vars.wanted)
path = './download'
region.download_granules(path)

# open the files to see they have the required variables included (ex. sc_orient, atlas_sdp_gps_epoch)
reader = Read(path + '/processed_ATL06_20190222010344_08490205_006_02.h5')
reader.vars.append(beam_list=['gt1l'], var_list=['h_li', "latitude", "longitude"])
reader.load()

Notes

  • The Variables class previously also enforced that the user requests certain required variables. This is done both to help the user know what to request and to help icepyx with future merges. This PR represents a design change: Variables no longer enforces that requirement; Read and Query do. We implemented this by:
  1. Read: The required var_list is appended to the wanted variables list (vars.wanted) during load. If vars.wanted does not exist, an error is raised instructing the user to add variables before loading the dataset.
  2. Query: We want the required variables to be appended only if the user has requested a subsetted list of variables. The user does this by passing the Coverage kwarg into order_granules (commonly: region_a.order_granules(Coverage=region_a.order_vars.wanted)). Required variables are then appended to Coverage in the subsetparams method of Query.
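
A minimal sketch of the two behaviours described above, written as plain functions rather than icepyx methods. The helper names, the `ValueError` choice, the dict-of-lists shape for the wanted variables, and the h5 paths in `REQUIRED` are all assumptions made for illustration; only the variable names sc_orient and atlas_sdp_gps_epoch come from the test comment earlier in this description.

```python
from typing import Dict, List, Optional

WantedDict = Dict[str, List[str]]

# Assumed required variables and paths, for illustration only.
REQUIRED: WantedDict = {
    "sc_orient": ["orbit_info/sc_orient"],
    "atlas_sdp_gps_epoch": ["ancillary_data/atlas_sdp_gps_epoch"],
}


def merge_required(wanted: WantedDict, required: WantedDict = REQUIRED) -> WantedDict:
    """Return a copy of `wanted` with the required variables added."""
    merged = {name: list(paths) for name, paths in wanted.items()}
    for name, paths in required.items():
        existing = merged.setdefault(name, [])
        for path in paths:
            if path not in existing:
                existing.append(path)
    return merged


def wanted_for_load(wanted: Optional[WantedDict]) -> WantedDict:
    """Read-style: fail early if nothing was requested, else add required."""
    if not wanted:
        raise ValueError("No variables requested; append variables before loading.")
    return merge_required(wanted)


def coverage_for_order(coverage: Optional[WantedDict]) -> Optional[WantedDict]:
    """Query-style: only add required variables when a subset was requested."""
    if coverage is None:
        # No Coverage kwarg: the full product is ordered, nothing to append.
        return None
    return merge_required(coverage)
```

Returning a copy instead of mutating the input keeps the user's own wanted list unchanged, which is a design choice of this sketch and not something the PR description states.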

Open questions (Now resolved)

  1. Do we think it is important to maintain backwards compatibility given that this is a class that has, up until now, been for internal use only? --> We will raise DeprecationErrors for using old arguments, but won't fully support old functionality
  2. What are the most user-friendly input arguments? (See final decision described above)
  3. One aspect of reading that has now changed is that a user is allowed to read a dataset without appending any variables. This previously caused an error (UnboundLocalError: local variable 'groups_list' referenced before assignment). Now the code loads a dataset object that is basically empty. Do we want this? --> We will raise a new descriptive error if the user has not added any variables before reading
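
The resolution of question 1 (raise an error for old arguments instead of supporting them) could look roughly like the following. This is a hedged sketch: the `DeprecationError` class here is a local stand-in, and the names in `DEPRECATED_KWARGS` are hypothetical placeholders, since the description does not list which arguments are deprecated.

```python
class DeprecationError(Exception):
    """Local stand-in exception used only by this sketch."""


# Hypothetical placeholder names; the actual deprecated arguments are not
# listed in this PR description.
DEPRECATED_KWARGS = {"old_arg_a", "old_arg_b"}


def reject_deprecated(**kwargs):
    """Raise if any deprecated keyword is used; otherwise return the kwargs."""
    used = sorted(DEPRECATED_KWARGS.intersection(kwargs))
    if used:
        raise DeprecationError(
            f"Argument(s) {', '.join(used)} are no longer supported; "
            "supply a path or a product instead."
        )
    return kwargs
```

Raising immediately, rather than silently ignoring the old argument, tells users of the previously internal class exactly what changed.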

Co-authored-by: Jessica Scheick <[email protected]>
JessicaS11 (Member) left a comment

Thanks for all the tremendous work on this, @rwegener2!

rwegener2 merged commit d5747fa into development Nov 7, 2023
3 checks passed
rwegener2 deleted the indep_vars branch November 7, 2023 12:17
JessicaS11 pushed a commit that referenced this pull request Nov 15, 2023
Refactor Variables class to be user facing functionality
JessicaS11 pushed a commit that referenced this pull request Jan 5, 2024
Refactor Variables class to be user facing functionality
JessicaS11 added a commit that referenced this pull request Jan 5, 2024
JessicaS11 linked an issue Jan 24, 2024 that may be closed by this pull request
2 tasks
Development

Successfully merging this pull request may close these issues.

Make Variables an independent class