add argo functionality to QUEST #427

Merged on Dec 7, 2023 (140 commits)
Changes from 114 commits
Commits
b907af5
Adding argo search and download script
kelseybisson Feb 23, 2022
fb5fc55
Create get_argo.py
kelseybisson Feb 23, 2022
715440b
begin implementing argo dataset
RomiP Feb 28, 2022
d2260d6
1st draft implementing argo dataset
RomiP Mar 8, 2022
71cfedc
implement search_data for physical argo
RomiP Apr 26, 2022
ec564fa
doctests and general cleanup for physical argo query
RomiP Jun 13, 2022
4fb5974
beginning of BGC Argo download
RomiP Jun 23, 2022
3bd2739
parse BGC profiles into DF
RomiP Jun 27, 2022
06835d5
plan to query BGC profiles
RomiP Aug 29, 2022
dac11de
validate BGC param input function
RomiP Sep 6, 2022
dd47dc5
order BGC params in order in which they should be queried
RomiP Sep 12, 2022
88722a1
fix bug in parse_into_df() - init blank df to take in union of params…
RomiP Sep 19, 2022
bf3cd70
identify profiles from initial API request containing all required pa…
RomiP Oct 3, 2022
fcb2422
creates df with only profiles that contain all user specified params
RomiP Oct 24, 2022
eb9c8ae
modified to populate prof df by querying individual profiles
RomiP Nov 21, 2022
1582f5b
finished up BGC argo download!
RomiP Nov 28, 2022
f55fd61
assert bounding box type in Argo init, begin framework for unit tests
RomiP Jan 17, 2023
d6d3872
Adding argo search and download script
kelseybisson Feb 23, 2022
7fc3b79
Create get_argo.py
kelseybisson Feb 23, 2022
195a4f1
begin implementing argo dataset
RomiP Feb 28, 2022
df34424
1st draft implementing argo dataset
RomiP Mar 8, 2022
390b7a9
implement search_data for physical argo
RomiP Apr 26, 2022
6824d27
doctests and general cleanup for physical argo query
RomiP Jun 13, 2022
58092f9
beginning of BGC Argo download
RomiP Jun 23, 2022
ae486f2
parse BGC profiles into DF
RomiP Jun 27, 2022
92f8a0d
plan to query BGC profiles
RomiP Aug 29, 2022
0285be1
validate BGC param input function
RomiP Sep 6, 2022
747af3a
order BGC params in order in which they should be queried
RomiP Sep 12, 2022
cf600c6
fix bug in parse_into_df() - init blank df to take in union of params…
RomiP Sep 19, 2022
29ee8c4
identify profiles from initial API request containing all required pa…
RomiP Oct 3, 2022
934e1a6
creates df with only profiles that contain all user specified params
RomiP Oct 24, 2022
eefcbf8
modified to populate prof df by querying individual profiles
RomiP Nov 21, 2022
55204d8
finished up BGC argo download!
RomiP Nov 28, 2022
0af53d6
assert bounding box type in Argo init, begin framework for unit tests
RomiP Jan 17, 2023
27ab9d7
need to confirm spatial extent is bbox
RomiP Feb 6, 2023
83e0e94
begin test case for available profiles
RomiP Feb 6, 2023
d23b09c
Merge remote-tracking branch 'origin/argo' into argo
RomiP Feb 6, 2023
d96c485
add tests for argo.py
JessicaS11 Feb 6, 2023
4ec53cd
add typing, add example json, and use it to test parsing
JessicaS11 Feb 13, 2023
2594153
Merge branch 'development' into argo
JessicaS11 May 22, 2023
8dfb33e
update argo to submit successful api request (update keys and values …
JessicaS11 May 22, 2023
d43da75
first pass at porting argo over to metadata+per profile download (WIP)
JessicaS11 May 30, 2023
f9c6a82
basic working argo script
JessicaS11 Jun 6, 2023
9c0de9b
simplify parameter validation (ordered list no longer needed)
JessicaS11 Jun 6, 2023
af4d8ce
add option to delete existing data before new download
JessicaS11 Jun 6, 2023
fd18b74
continue cleaning up argo.py
JessicaS11 Jun 6, 2023
df41a98
fix download_by_profile to properly store all downloaded data
JessicaS11 Jun 7, 2023
27b672b
remove old get_argo.py script
JessicaS11 Jun 7, 2023
04e392c
remove _filter_profiles function in favor of submitting data kwarg in…
JessicaS11 Jun 7, 2023
9cc0040
start filling in docstrings
JessicaS11 Jun 7, 2023
d15483b
clean up nearly duplicate functions
JessicaS11 Jun 8, 2023
d877e8b
add more docstrings
JessicaS11 Jun 8, 2023
f9b6d81
get a few minimal argo tests working
JessicaS11 Jun 12, 2023
8fcab13
add bgc argo params. begin adding merge for second download runs
JessicaS11 Jun 20, 2023
aad5053
some changes
RomiP Jun 26, 2023
8fd0083
Merge remote-tracking branch 'origin/argo' into argo
RomiP Jun 26, 2023
630415a
WIP test commit to see if can push to GH
JessicaS11 Jul 7, 2023
fe07540
WIP handling argo merge issue
JessicaS11 Jul 12, 2023
c246543
update profile to df to return df and move merging to get_dataframe
JessicaS11 Jul 20, 2023
ccb8ebf
Merge remote-tracking branch 'origin/argo' into argo
RomiP Jul 28, 2023
5851cb8
Merge remote-tracking branch 'origin/argo' into argo
RomiP Jul 28, 2023
1fd069c
merge profiles with existing df
JessicaS11 Jul 31, 2023
363dad2
clean up docstrings and code
JessicaS11 Jul 31, 2023
63d3b3b
add test_argo.py
RomiP Jul 31, 2023
a91c25f
Merge remote-tracking branch 'origin/argo' into argo
RomiP Jul 31, 2023
4602cdb
add prelim test case for adding to Argo df
RomiP Jul 31, 2023
2cdf07e
remove sandbox files
JessicaS11 Aug 14, 2023
a91c360
remove bgc argo test file
JessicaS11 Aug 14, 2023
cb367e1
update variables notebook from development
JessicaS11 Aug 14, 2023
b89840c
Merge remote-tracking branch 'origin/argo' into argo
RomiP Aug 16, 2023
381092f
simplify import statements
JessicaS11 Aug 16, 2023
283748e
quickfix for granules error
zachghiaccio Aug 18, 2023
7893307
draft subpage on available QUEST datasets
zachghiaccio Aug 18, 2023
949ffee
small reference fix in text
zachghiaccio Aug 18, 2023
7414c85
add reference to top of .rst file
zachghiaccio Aug 18, 2023
7655995
Merge remote-tracking branch 'origin/argo' into argo
RomiP Aug 19, 2023
63e1b57
test argo df merge
RomiP Aug 19, 2023
5b6c65b
Merge pull request #436 from zachghiaccio/argo
zachghiaccio Aug 25, 2023
37d19b6
Merge branch 'development' into argo
JessicaS11 Aug 28, 2023
b064224
update argo script from shared_search branch
JessicaS11 Aug 30, 2023
1d53341
update QUEST and GenQuery classes for argo integration (#441)
JessicaS11 Sep 25, 2023
51c2e83
Merge branch 'development' into argo
JessicaS11 Oct 6, 2023
9386707
fix incorrect merge conflict handling
JessicaS11 Oct 6, 2023
417929f
uncomment argo portions of Quest
JessicaS11 Oct 9, 2023
b8bf4ea
Drafted an example Jupyter notebook using both ICESat-2 and Argo thro…
zachghiaccio Oct 10, 2023
5ff8c83
combine multiple argo test files
JessicaS11 Oct 11, 2023
251bf0b
Removed redundant cells and code.
zachghiaccio Oct 18, 2023
d03f9fb
temporarily disable OpenAltimetry API tests (#459)
JessicaS11 Oct 18, 2023
ee8b79f
fix spot number calculation (#458)
JessicaS11 Oct 18, 2023
a1a723d
Fix a broken link in IS2_data_access.ipynb (#456)
whyjz Oct 18, 2023
d86cc9e
update Read input arguments (#444)
rwegener2 Oct 18, 2023
aedbcce
enable QUEST kwarg handling (#452)
JessicaS11 Oct 19, 2023
73f929e
docs: add rwegener2 as a contributor for bug, code, and 6 more (#460)
allcontributors[bot] Oct 26, 2023
a56a9c8
docs: add jpswinski as a contributor for review (#461)
allcontributors[bot] Oct 26, 2023
bdcc9bd
docs: add whyjz as a contributor for tutorial (#462)
allcontributors[bot] Oct 27, 2023
f514619
Link to QUEST dataset page went missing again. Fixed.
zachghiaccio Oct 31, 2023
53831aa
Fixed typo in reference.
zachghiaccio Oct 31, 2023
a0c5acd
Moved Argo workflow to Examples folder.
zachghiaccio Oct 31, 2023
a326949
Add link to Argo workbook.
zachghiaccio Oct 31, 2023
38e9b5d
test argo script via a quest instance
JessicaS11 Nov 1, 2023
992e7ae
clean up argo script
JessicaS11 Nov 1, 2023
ea0d37e
clean up quest tests
JessicaS11 Nov 1, 2023
cce8fa7
remove =None from required inputs
JessicaS11 Nov 1, 2023
efcb16d
remove note from argo
JessicaS11 Nov 1, 2023
fb90b0c
add newest icepyx citations (#455)
JessicaS11 Nov 2, 2023
af5d8b0
Merge branch 'development' into argo
JessicaS11 Nov 3, 2023
ce29a55
remove unused test data file
JessicaS11 Nov 3, 2023
9857978
Update doc/source/contributing/quest-available-datasets.rst
zachghiaccio Nov 3, 2023
8ed4c7b
fix formatting via linter
JessicaS11 Nov 3, 2023
b99855e
remove examples suggesting an Argo object can be created directly
JessicaS11 Nov 3, 2023
9aaca81
Clarified that the new dataset guidelines are a work in progress.
zachghiaccio Nov 4, 2023
f6eb1b2
Cleanup of text in QUEST notebook.
zachghiaccio Nov 4, 2023
7e004f0
GitHub action UML generation auto-update
zachghiaccio Nov 6, 2023
e8f8c38
fix failing argo tests
JessicaS11 Nov 6, 2023
d5747fa
Variables as an independent class (#451)
rwegener2 Nov 7, 2023
6dac12c
Merge branch 'development' into argo
JessicaS11 Nov 8, 2023
eea686c
Merge branch 'development' into argo
JessicaS11 Nov 17, 2023
d596c01
use factories as fixture test pattern
JessicaS11 Nov 6, 2023
d7b9424
add quest module init file
JessicaS11 Nov 8, 2023
3fb4c48
streamline params and presRange handling, including docs+tests+ex
JessicaS11 Nov 17, 2023
283fc04
fix failing test due to list order
JessicaS11 Nov 21, 2023
d888127
Addressed Jessica's suggestions for QUEST notebook.
zachghiaccio Nov 27, 2023
afc7edd
remove duplicate QUEST page in docs
JessicaS11 Nov 27, 2023
ab40328
add some text to QUEST example notebook
JessicaS11 Nov 27, 2023
2a9efe4
Merge branch 'development' into argo
JessicaS11 Nov 27, 2023
24de1bc
limit example notebook to one IS2 granule
JessicaS11 Nov 28, 2023
36cac57
fix indentation error and make params protected
RomiP Dec 4, 2023
74c3a60
skips when error downloading argo profile, save df to csv
RomiP Dec 4, 2023
79ecc27
use xarray to drop info before converting to df
JessicaS11 Dec 4, 2023
089f71e
implement save function in argo
RomiP Dec 6, 2023
377d3c0
Merge remote-tracking branch 'origin/argo' into argo
RomiP Dec 6, 2023
40a4ca3
run black formatter on all files
JessicaS11 Dec 6, 2023
2fdcdd2
Revert "run black formatter on all files"
JessicaS11 Dec 6, 2023
cc8b1d0
run black formatter on files in this PR
JessicaS11 Dec 6, 2023
d467a84
Update doc/source/contributing/quest-available-datasets.rst
JessicaS11 Dec 6, 2023
fc9e086
add docstring for new save_all fn
JessicaS11 Dec 6, 2023
54a14ea
remove savename kwarg
JessicaS11 Dec 6, 2023
ae1d484
standardize docstring spaces in argo.py
JessicaS11 Dec 6, 2023
468f3ba
last updates to notebook
JessicaS11 Dec 6, 2023
ff3b4b6
GitHub action UML generation auto-update
kelseybisson Dec 6, 2023
31 changes: 29 additions & 2 deletions .all-contributorsrc
@@ -382,7 +382,8 @@
"avatar_url": "https://avatars.githubusercontent.com/u/54070345?v=4",
"profile": "https://github.com/jpswinski",
"contributions": [
"code"
"code",
"review"
]
},
{
@@ -422,6 +423,31 @@
"contributions": [
"review"
]
},
{
"login": "rwegener2",
"name": "Rachel Wegener",
"avatar_url": "https://avatars.githubusercontent.com/u/35503632?v=4",
"profile": "https://rwegener2.github.io/",
"contributions": [
"bug",
"code",
"doc",
"ideas",
"maintenance",
"review",
"test",
"tutorial"
]
},
{
"login": "whyjz",
"name": "Whyjay Zheng",
"avatar_url": "https://avatars.githubusercontent.com/u/19339926?v=4",
"profile": "https://whyjz.github.io/",
"contributions": [
"tutorial"
]
}
],
"contributorsPerLine": 7,
@@ -430,5 +456,6 @@
"repoType": "github",
"repoHost": "https://github.com",
"skipCi": true,
"commitConvention": "angular"
"commitConvention": "angular",
"commitType": "docs"
}
18 changes: 10 additions & 8 deletions CONTRIBUTORS.rst

Large diffs are not rendered by default.

29 changes: 29 additions & 0 deletions doc/source/contributing/quest-available-datasets.rst
@@ -0,0 +1,29 @@
.. _quest_supported_label:

QUEST Supported Datasets
========================

On this page, we outline the datasets that are supported by the QUEST module. Click on the links for each dataset to view information about the API and sensor/data platform used.


List of Datasets
----------------

`Argo <https://argo.ucsd.edu/data/>`_
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The Argo mission involves a series of floats that are designed to capture vertical ocean profiles of temperature, salinity, and pressure down to ~2000 m. Some floats are in support of BGC-Argo, which also includes data relevant for biogeochemical applications: oxygen, nitrate, chlorophyll, backscatter, and solar irradiance.

A paper outlining the Argo extension to QUEST is currently in preparation; a citable preprint is expected to be available in the near future.

:ref:`Argo Workflow Example<quest_workbook_label>`


Adding a Dataset to QUEST
-------------------------

Want to add a new dataset to QUEST? No problem! QUEST includes a template script (``dataset.py``) that may be used to create your own querying module for a dataset of interest.

Once you have developed a script with the template, you may request that the module be added to QUEST via GitHub. Please see the How to Contribute page (:ref:`dev_guide_label`) for instructions on how to contribute to icepyx.

Detailed guidelines on how to construct your dataset module are currently a work in progress.

2 changes: 1 addition & 1 deletion doc/source/example_notebooks/IS2_data_access.ipynb
@@ -79,7 +79,7 @@
"\n",
"There are three required inputs, depending on how you want to search for data. Two are required in all cases:\n",
"- `short_name` = the data product of interest, known as its \"short name\".\n",
"See https://nsidc.org/data/icesat-2/data-sets for a list of the available data products.\n",
"See https://nsidc.org/data/icesat-2/products for a list of the available data products.\n",
"- `spatial extent` = a region of interest to search within. This can be entered as a bounding box, polygon vertex coordinate pairs, or a polygon geospatial file (currently shp, kml, and gpkg are supported).\n",
" - bounding box: Given in decimal degrees for the lower left longitude, lower left latitude, upper right longitude, and upper right latitude\n",
" - polygon vertices: Given as longitude, latitude coordinate pairs of decimal degrees with the last entry a repeat of the first.\n",
182 changes: 129 additions & 53 deletions doc/source/example_notebooks/IS2_data_read-in.ipynb
@@ -63,9 +63,8 @@
"metadata": {},
"outputs": [],
"source": [
"path_root = '/full/path/to/your/data/'\n",
"pattern = \"processed_ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5\"\n",
"reader = ipx.Read(path_root, \"ATL06\", pattern) # or ipx.Read(filepath, \"ATLXX\") if your filenames match the default pattern"
"path_root = '/full/path/to/your/ATL06_data/'\n",
"reader = ipx.Read(path_root)"
]
},
{
@@ -111,10 +110,9 @@
"\n",
"Reading in ICESat-2 data with icepyx happens in a few simple steps:\n",
"1. Let icepyx know where to find your data (this might be local files or urls to data in cloud storage)\n",
"2. Tell icepyx how to interpret the filename format\n",
"3. Create an icepyx `Read` object\n",
"4. Make a list of the variables you want to read in (does not apply for gridded products)\n",
"5. Load your data into memory (or read it in lazily, if you're using Dask)\n",
"2. Create an icepyx `Read` object\n",
"3. Make a list of the variables you want to read in (does not apply for gridded products)\n",
"4. Load your data into memory (or read it in lazily, if you're using Dask)\n",
"\n",
"We go through each of these steps in more detail in this notebook."
]
@@ -168,21 +166,18 @@
{
"cell_type": "markdown",
"id": "e8da42c1",
"metadata": {},
"metadata": {
"user_expressions": []
},
"source": [
"### Step 1: Set data source path\n",
"\n",
"Provide a full path to the data to be read in (i.e. opened).\n",
"Currently accepted inputs are:\n",
"* a directory\n",
"* a single file\n",
"\n",
"All files to be read in *must* have a consistent filename pattern.\n",
"If a directory is supplied as the data source, all files in any subdirectories that match the filename pattern will be included.\n",
"\n",
"S3 bucket data access is currently under development, and requires you are registered with NSIDC as a beta tester for cloud-based ICESat-2 data.\n",
"icepyx is working to ensure a smooth transition to working with remote files.\n",
"We'd love your help exploring and testing these features as they become available!"
"* a string path to directory - all files from the directory will be opened\n",
"* a string path to single file - one file will be opened\n",
"* a list of filepaths - all files in the list will be opened\n",
"* a glob string (see [glob](https://docs.python.org/3/library/glob.html)) - any files matching the glob pattern will be opened"
]
},
{
@@ -208,86 +203,147 @@
{
"cell_type": "code",
"execution_count": null,
"id": "e683ebf7",
"id": "fac636c2-e0eb-4e08-adaa-8f47623e46a1",
"metadata": {},
"outputs": [],
"source": [
"# urlpath = 's3://nsidc-cumulus-prod-protected/ATLAS/ATL03/004/2019/11/30/ATL03_20191130221008_09930503_004_01.h5'"
"# list_of_files = ['/my/data/ATL06/processed_ATL06_20190226005526_09100205_006_02.h5', \n",
"# '/my/other/data/ATL06/processed_ATL06_20191202102922_10160505_006_01.h5']"
]
},
{
"cell_type": "markdown",
"id": "92743496",
"id": "ba3ebeb0-3091-4712-b0f7-559ddb95ca5a",
"metadata": {
"user_expressions": []
},
"source": [
"### Step 2: Create a filename pattern for your data files\n",
"#### Glob Strings\n",
"\n",
"[glob](https://docs.python.org/3/library/glob.html) is a Python library which allows users to list files in their file systems whose paths match a given pattern. Icepyx uses the glob library to give users greater flexibility over their input file lists.\n",
"\n",
"glob works using `*` and `?` as wildcard characters, where `*` matches any number of characters and `?` matches a single character. For example:\n",
"\n",
"Files provided by NSIDC typically match the format `\"ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5\"` where the parameters in curly brackets indicate a parameter name (left of the colon) and character length or format (right of the colon).\n",
"Some of this information is used during data opening to help correctly read and label the data within the data structure, particularly when multiple files are opened simultaneously.\n",
"* `/this/path/*.h5`: refers to all `.h5` files in the `/this/path` folder (Example matches: \"/this/path/processed_ATL03_20191130221008_09930503_006_01.h5\" or \"/this/path/myfavoriteicesat-2file.h5\")\n",
"* `/this/path/*ATL07*.h5`: refers to all `.h5` files in the `/this/path` folder that have ATL07 in the filename. (Example matches: \"/this/path/ATL07-02_20221012220720_03391701_005_01.h5\" or \"/this/path/processed_ATL07.h5\")\n",
"* `/this/path/ATL??/*.h5`: refers to all `.h5` files in any subfolder of `/this/path` whose name is `ATL` followed by any 2 characters (Example matches: \"/this/path/ATL03/processed_ATL03_20191130221008_09930503_006_01.h5\", \"/this/path/ATL06/myfile.h5\")\n",
"\n",
"By default, icepyx will assume your filenames follow the default format.\n",
"However, you can easily read in other ICESat-2 data files by supplying your own filename pattern.\n",
"For instance, `pattern=\"ATL{product:2}-{datetime:%Y%m%d%H%M%S}-Sample.h5\"`. A few example patterns are provided below."
"See the glob documentation or other online tutorials for a more in-depth explanation, or for advanced glob patterns such as character classes and ranges."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7318abd0",
"metadata": {},
"outputs": [],
"cell_type": "markdown",
"id": "20286c76-5632-4420-b2c9-a5a6b1952672",
"metadata": {
"user_expressions": []
},
"source": [
"#### Recursive Directory Search"
]
},
{
"cell_type": "markdown",
"id": "632bd1ce-2397-4707-a63f-9d5d2fc02fbc",
"metadata": {
"user_expressions": []
},
"source": [
"glob will not by default search all of the subdirectories for matching filepaths, but it has the ability to do so.\n",
"\n",
"If you would like to search recursively, you can achieve this by either:\n",
"1. passing the `recursive` argument into `glob_kwargs` and including `/**/` in your filepath\n",
"2. using glob directly to create a list of filepaths\n",
"\n",
"Each of these two methods is shown below."
]
},
{
"cell_type": "markdown",
"id": "da0cacd8-9ddc-4c31-86b6-167d850b989e",
"metadata": {
"user_expressions": []
},
"source": [
"# pattern = 'ATL06-{datetime:%Y%m%d%H%M%S}-Sample.h5'\n",
"# pattern = 'ATL{product:2}-{datetime:%Y%m%d%H%M%S}-Sample.h5'"
"Method 1: passing the `recursive` argument into `glob_kwargs`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f43e8664",
"id": "e276b876-9ec7-4991-8520-05c97824b896",
"metadata": {},
"outputs": [],
"source": [
"# pattern = \"ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5\""
"ipx.Read('/path/to/**/folder', glob_kwargs={'recursive': True})"
]
},
{
"cell_type": "markdown",
"id": "f5a1e85e-fc4a-405f-9710-0cb61b827f2c",
"metadata": {
"user_expressions": []
},
"source": [
"You can use `glob_kwargs` to pass any additional argument accepted by Python's standard-library `glob.glob` through icepyx."
]
},
{
"cell_type": "markdown",
"id": "76de9539-710c-49f6-9e9e-238849382c33",
"metadata": {
"user_expressions": []
},
"source": [
"Method 2: using glob directly to create a list of filepaths"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "992a77fb",
"id": "be79b0dd-efcf-4d50-bdb0-8e3ae8e8e38c",
"metadata": {},
"outputs": [],
"source": [
"# grid_pattern = \"ATL{product:2}_GL_0311_{res:3}m_{version:3}_{revision:2}.nc\""
"import glob"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6aec1a70",
"metadata": {},
"id": "5d088571-496d-479a-9fb7-833ed7e98676",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"pattern = \"processed_ATL{product:2}_{datetime:%Y%m%d%H%M%S}_{rgt:4}{cycle:2}{orbitsegment:2}_{version:3}_{revision:2}.h5\""
"list_of_files = glob.glob('/path/to/**/folder', recursive=True)\n",
"ipx.Read(list_of_files)"
]
},
{
"cell_type": "markdown",
"id": "4275b04c",
"id": "08df2874-7c54-4670-8f37-9135ea296ff5",
"metadata": {
"user_expressions": []
},
"source": [
"### Step 3: Create an icepyx read object\n",
"```{admonition} Read Module Update\n",
"Previously, icepyx required two additional conditions: 1) a `product` argument and 2) that your files either matched the default `filename_pattern` or that the user provided their own `filename_pattern`. These two requirements have been removed. `product` is now read directly from the file metadata (the root group's `short_name` attribute). Flexibility to specify multiple files via the `filename_pattern` has been replaced with the [glob string](https://docs.python.org/3/library/glob.html) feature, and by allowing a list of filepaths as an argument.\n",
"\n",
"The `Read` object has two required inputs:\n",
"- `path` = a string with the full file path or full directory path to your hdf5 (.h5) format files.\n",
"- `product` = the data product you're working with, also known as the \"short name\".\n",
"The `product` and `filename_pattern` arguments have been maintained for backwards compatibility, but will be fully removed in icepyx version 1.0.0.\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "4275b04c",
"metadata": {
"user_expressions": []
},
"source": [
"### Step 2: Create an icepyx read object\n",
"\n",
"The `Read` object also accepts the optional keyword input:\n",
"- `pattern` = a formatted string indicating the filename pattern required for Intake's path_as_pattern argument."
"Using the `data_source` described in Step 1, we can create our Read object."
]
},
{
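The glob behaviour described in the notebook text above can be checked with a small standalone sketch. It uses only the Python standard library; the directory layout and file names are hypothetical and created in a temporary folder purely for illustration, and the helper names `build_demo_tree` and `match` are not part of icepyx.

```python
import glob
import os
import tempfile


def build_demo_tree(root):
    """Create a few empty files (hypothetical names) under root."""
    relative_paths = [
        "ATL06/processed_ATL06_a.h5",
        "ATL06/nested/processed_ATL06_b.h5",
        "ATL07/processed_ATL07_c.h5",
        "notes.txt",
    ]
    for rel in relative_paths:
        full = os.path.join(root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()


def match(root, pattern, **glob_kwargs):
    """Return sorted matches, relative to root, using '/' as the separator."""
    hits = glob.glob(os.path.join(root, pattern), **glob_kwargs)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)

    # `?` matches exactly one character, `*` any number of characters.
    one_level = match(root, "ATL??/*.h5")

    # Restrict matches to filenames containing "ATL07".
    only_atl07 = match(root, "*/*ATL07*.h5")

    # Without recursive=True, `**` behaves like a single `*`.
    non_recursive = match(root, "**/*.h5")

    # With recursive=True, `**` also descends into nested subdirectories.
    recursive = match(root, "**/*.h5", recursive=True)

print(one_level)
print(only_atl07)
print(non_recursive)
print(recursive)
```

Only the recursive call picks up the file inside `ATL06/nested/`. According to the notebook text in this diff, the same `recursive` keyword can be forwarded through `ipx.Read(..., glob_kwargs={'recursive': True})`.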
@@ -299,7 +355,17 @@
},
"outputs": [],
"source": [
"reader = ipx.Read(data_source=path_root, product=\"ATL06\", filename_pattern=pattern) # or ipx.Read(filepath, \"ATLXX\") if your filenames match the default pattern"
"reader = ipx.Read(data_source=path_root)"
]
},
{
"cell_type": "markdown",
"id": "7b2acfdb-75eb-4c64-b583-2ab19326aaee",
"metadata": {
"user_expressions": []
},
"source": [
"The Read object now contains the list of matching files that will eventually be loaded into Python. You can inspect its properties, such as the files that were located or the identified product, directly on the Read object."
]
},
{
@@ -309,7 +375,17 @@
"metadata": {},
"outputs": [],
"source": [
"reader._filelist"
"reader.filelist"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7455ee3f-f9ab-486e-b4c7-2fa2314d4084",
"metadata": {},
"outputs": [],
"source": [
"reader.product"
]
},
{
@@ -319,7 +395,7 @@
"user_expressions": []
},
"source": [
"### Step 4: Specify variables to be read in\n",
"### Step 3: Specify variables to be read in\n",
"\n",
"To load your data into memory or prepare it for analysis, icepyx needs to know which variables you'd like to read in.\n",
"If you've used icepyx to download data from NSIDC with variable subsetting (which is the default), then you may already be familiar with the icepyx `Variables` module and how to create and modify lists of variables.\n",
@@ -426,7 +502,7 @@
"user_expressions": []
},
"source": [
"### Step 5: Loading your data\n",
"### Step 4: Loading your data\n",
"\n",
"Now that you've set up all the options, you're ready to read your ICESat-2 data into memory!"
]
@@ -541,9 +617,9 @@
],
"metadata": {
"kernelspec": {
"display_name": "general",
"display_name": "icepyx-dev",
"language": "python",
"name": "general"
"name": "icepyx-dev"
},
"language_info": {
"codemirror_mode": {