Addressed Jessica's suggestions for QUEST notebook.
zachghiaccio committed Nov 27, 2023
1 parent 283fc04 commit d888127
Showing 1 changed file with 174 additions and 65 deletions.
239 changes: 174 additions & 65 deletions doc/source/example_notebooks/QUEST_argo_data_access.ipynb
@@ -36,18 +36,6 @@
"import icepyx as ipx"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "41bb9895",
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"import icepyx as ipx\n",
"%autoreload 2\n"
]
},
{
"cell_type": "markdown",
"id": "5c35f5df-b4fb-4a36-8d6f-d20f1552767a",
@@ -184,35 +172,35 @@
},
{
"cell_type": "markdown",
"id": "62afb9ad",
"metadata": {},
"id": "7bade19e-5939-410a-ad54-363636289082",
"metadata": {
"user_expressions": []
},
"source": [
"**ZACH**\n",
"\n",
"Could you add a little bit of text around argo parameters/presRange and the ability to search and download multiple times (outside the quest `search_all` and `download_all` options)? A few highlights that come to mind after recent updates:\n",
"- by default only temperature is gotten, but you can supply a list of the parameters you want to `reg_a.add_argo()`\n",
"- you can also directly, at any time, view or update the `reg_a.datasets['argo'].params` value, which will then be used in your next search or download\n",
"- alternatively, you can directly search/download via `reg_a.datasets['argo'].search_data()` and provide `params` or `presRange` keyword arguments that will replace the existing values of `reg_a.datasets['argo'].params`/`reg_a.datasets['argo'].presRange`\n",
"- when downloading, you can also provide the `keep_existing=True` kwarg to add more profiles, parameters, pressure ranges to your existing dataframe (and have them merged nicely for you)"
"When accessing Argo data, the variables of interest are organized as vertical profiles as a function of pressure. By default, only temperature is queried, but the user can supply a list of desired parameters using the code below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc921ca5",
"metadata": {},
"id": "6739c3aa-1a88-4d8e-9fd8-479528c20e97",
"metadata": {
"tags": []
},
"outputs": [],
"source": []
"source": [
"# Customized variable query\n",
"reg_a.add_argo(params=['temperature'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "435a1243",
"metadata": {},
"outputs": [],
"cell_type": "markdown",
"id": "2d06436c-2271-4229-8196-9f5180975ab1",
"metadata": {
"user_expressions": []
},
"source": [
"# see what argo parameters will be searched for or downloaded\n",
"reg_a.datasets['argo'].params"
"Additionally, a user may view or update the list of Argo parameters at any time through `reg_a.datasets['argo'].params`. If a user submits an invalid parameter (\"temp\" instead of \"temperature\", for example), an `AssertionError` will be raised with a message noting that the parameter is invalid."
]
},
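The validation behavior described above can be pictured with a small stand-alone sketch (a hypothetical illustration, not the actual icepyx implementation; the function name and the contents of the valid set are made up):

```python
# Hypothetical sketch of Argo parameter validation -- the valid-parameter set
# and function name are illustrative, not taken from icepyx.
VALID_PARAMS = {"temperature", "salinity", "pressure", "oxygen"}

def set_params(params):
    """Validate requested Argo parameters, raising AssertionError on bad input."""
    for p in params:
        assert p in VALID_PARAMS, f"{p} is not a valid parameter"
    return list(params)

print(set_params(["temperature", "salinity"]))  # accepted as-is

try:
    set_params(["temp", "salinity"])  # invalid shorthand for "temperature"
except AssertionError as err:
    print(err)
```

Anything outside the known-valid set fails fast with a readable message, which is the same user experience the notebook describes.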
{
@@ -223,29 +211,49 @@
"outputs": [],
"source": [
"# update the list of argo parameters\n",
"reg_a.datasets['argo'].params = ['temperature','salinity']\n",
"\n",
"# if you submit an invalid parameter (such as 'temp' instead of 'temperature') you'll get an \n",
"# AssertionError and message saying the parameter is invalid (example: reg_a.datasets['argo'].params = ['temp','salinity'])"
"reg_a.datasets['argo'].params = ['temperature','salinity']"
]
},
{
"cell_type": "markdown",
"id": "453900c1-cd62-40c9-820c-0615f63f17f5",
"metadata": {
"user_expressions": []
},
"source": [
"Alternatively, the user can search or download Argo data directly with `reg_a.datasets['argo'].search_data()` and `reg_a.datasets['argo'].download()`, passing the desired parameters and pressure ranges through the `params` and `presRange` keyword arguments, respectively. Values supplied this way replace the existing `params` and `presRange` settings."
]
},
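The override pattern can be sketched with a self-contained stand-in (an assumption about the call pattern, not icepyx internals; `stored` and `search_data` here are illustrative names): keyword arguments supplied to the call take precedence over the stored attribute values.

```python
# Sketch of keyword-argument override -- the dict and function are stand-ins
# for reg_a.datasets['argo'] attributes, not real icepyx code.
stored = {"params": ["temperature"], "presRange": None}

def search_data(params=None, presRange=None):
    """Use explicit kwargs when given; otherwise fall back to stored values."""
    effective_params = params if params is not None else stored["params"]
    effective_range = presRange if presRange is not None else stored["presRange"]
    # A real search would query the Argo API here; we just echo the settings.
    return {"params": effective_params, "presRange": effective_range}

print(search_data(params=["temperature", "salinity"], presRange="0,500"))
print(search_data())  # falls back to the stored defaults
```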
{
"cell_type": "markdown",
"id": "3f55be4e-d261-49c1-ac14-e19d8e0ff828",
"metadata": {
"user_expressions": []
},
"source": [
"With our current setup, let's see what Argo parameters we will get."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c15675df",
"id": "435a1243",
"metadata": {},
"outputs": [],
"source": [
"reg_a.datasets['argo'].search_data()"
"# see what argo parameters will be searched for or downloaded\n",
"reg_a.datasets['argo'].params"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "db56cc33",
"id": "c15675df",
"metadata": {},
"outputs": [],
"source": []
"source": [
"reg_a.datasets['argo'].search_data()"
]
},
{
"cell_type": "markdown",
@@ -271,23 +279,109 @@
"path = '/icepyx/quest/downloaded-data/'\n",
"\n",
"# Access Argo and ICESat-2 data simultaneously\n",
"reg_a.download_all(path)"
"reg_a.download_all()"
]
},
{
"cell_type": "markdown",
"id": "6970f0ad-9364-4732-a5e6-f93cf3fc31a3",
"id": "ad29285e-d161-46ea-8a57-95891fa2b237",
"metadata": {
"tags": [],
"user_expressions": []
},
"source": [
"We now have 19 available Argo profiles, each containing `temperature` and `pressure`, compiled into a Pandas DataFrame. **NOTE: BGC Argo is currently fully implemented** When BGC Argo is fully implemented to QUEST, we could add more variables to this list.\n",
"\n",
"We also have a series of files containing ICESat-2 ATL03 data. Because these data files are very large, we are only going to focus on one of these files for this example.\n",
"We now have 19 available Argo profiles, each containing `temperature` and `pressure`, compiled into a Pandas DataFrame. BGC Argo is also available through QUEST, so we could add more variables to this list.\n",
"\n",
"Let's now load one of the ICESat-2 files and see where it passes relative to the Argo float data.\n",
"If the user wishes to add more profiles, parameters, and/or pressure ranges to a pre-existing DataFrame, they should use `reg_a.download_all(path, keep_existing=True)` to retain previously queried data."
]
},
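The merging behavior of `keep_existing=True` can be sketched as follows (an illustrative stand-in keyed by a made-up profile ID field, not the actual icepyx merge logic):

```python
# Illustrative merge of newly downloaded profiles into an existing collection;
# the "id" key and record layout are hypothetical.
def merge_profiles(existing, new):
    """Keep every existing profile and append only genuinely new ones."""
    merged = {p["id"]: p for p in existing}
    for profile in new:
        merged.setdefault(profile["id"], profile)  # never overwrite existing data
    return list(merged.values())

have = [{"id": "A1", "temperature": 12.3}]
fresh = [{"id": "A1", "temperature": 12.3}, {"id": "B2", "temperature": 9.8}]
print(merge_profiles(have, fresh))  # A1 kept once, B2 added
```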
{
"cell_type": "markdown",
"id": "6970f0ad-9364-4732-a5e6-f93cf3fc31a3",
"metadata": {
"user_expressions": []
},
"source": [
"The download function also retrieved a series of files containing ICESat-2 ATL03 data. Because these files are very large, we will focus on only one of them for this example.\n",
"\n",
"**Zach** would you be open to switching this to use icepyx's read module? We could easily use the `xarray.to_dataframe` to then work with the rest of this notebook!"
"The below workflow uses the icepyx Read module to quickly load ICESat-2 data into an Xarray dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "88f4b1b0-8c58-414c-b6a8-ce1662979943",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#path_root = '/icepyx/quest-test-data/'\n",
"path_root = '/icepyx/quest-test-data/processed_ATL03_20220419002753_04111506_006_02.h5'\n",
"reader = ipx.Read(path_root)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "665d79a7-7360-4846-99c2-222b34df2a92",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"reader.vars.append(beam_list=['gt2l'], \n",
" var_list=['h_ph', \"lat_ph\", \"lon_ph\", 'signal_conf_ph'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7158814-50f0-4940-980c-9bb800360982",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"ds = reader.load()\n",
"ds"
]
},
{
"cell_type": "markdown",
"id": "1040438c-d806-4964-b4f0-1247da9f3f1f",
"metadata": {
"user_expressions": []
},
"source": [
"To make the data more easily plottable, let's convert the data into a Pandas DataFrame. Note that this method is memory-intensive for ATL03 data, so users are advised to subset to small spatial domains to prevent the notebook from crashing."
]
},
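One way to keep the domain small is to pre-filter records to a bounding box before building a DataFrame. The sketch below is generic (the coordinates and the `in_bbox` helper are made up for illustration and are not part of the notebook's workflow):

```python
# Generic bounding-box pre-filter -- all values below are illustrative.
def in_bbox(lat, lon, bbox):
    """Return True if (lat, lon) lies inside (lat_min, lat_max, lon_min, lon_max)."""
    lat_min, lat_max, lon_min, lon_max = bbox
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max

photons = [(34.40, -152.01), (34.44, -151.99), (40.00, -140.00)]
bbox = (34.3, 34.5, -152.2, -151.8)

# Keep only photons inside the small study box before any DataFrame is built.
subset = [p for p in photons if in_bbox(*p, bbox)]
print(subset)
```

Filtering before conversion keeps memory proportional to the study region rather than the full granule.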
{
"cell_type": "code",
"execution_count": null,
"id": "bc086db7-f5a1-4ba7-ba90-5b19afaf6808",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"is2_pd = ds.to_dataframe()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc67e039-338c-4348-acaf-96f605cf0030",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Rearrange the data to only include \"ocean\" photons\n",
"is2_pd = is2_pd.reset_index(level=[0,1,2])\n",
"is2_pd_ocean = is2_pd[is2_pd.index==1]\n",
"is2_pd_ocean"
]
},
{
@@ -299,14 +393,6 @@
},
"outputs": [],
"source": [
"# Load ICESat-2 latitudes, longitudes, heights, and photon confidence (optional)\n",
"is2_pd = pd.DataFrame()\n",
"with h5py.File(f'{path_root}processed_ATL03_20220419002753_04111506_006_02.h5', 'r') as f:\n",
" is2_pd['lat'] = f['gt2l/heights/lat_ph'][:]\n",
" is2_pd['lon'] = f['gt2l/heights/lon_ph'][:]\n",
" is2_pd['height'] = f['gt2l/heights/h_ph'][:]\n",
" is2_pd['signal_conf'] = f['gt2l/heights/signal_conf_ph'][:,1]\n",
" \n",
"# Set Argo data as its own DataFrame\n",
"argo_df = reg_a.datasets['argo'].argodata"
]
@@ -321,8 +407,8 @@
"outputs": [],
"source": [
"# Convert both DataFrames into GeoDataFrames\n",
"is2_gdf = gpd.GeoDataFrame(is2_pd, \n",
" geometry=gpd.points_from_xy(is2_pd.lon, is2_pd.lat),\n",
"is2_gdf = gpd.GeoDataFrame(is2_pd_ocean, \n",
" geometry=gpd.points_from_xy(is2_pd_ocean['lon_ph'], is2_pd_ocean['lat_ph']),\n",
" crs='EPSG:4326'\n",
")\n",
"argo_gdf = gpd.GeoDataFrame(argo_df, \n",
@@ -338,7 +424,22 @@
"user_expressions": []
},
"source": [
"To view the relative locations of ICESat-2 and Argo, the below cell uses the `explore()` function from GeoPandas. For large datasets like ICESat-2, loading the map might take a while."
"To view the relative locations of ICESat-2 and Argo, the below cell uses the `explore()` function from GeoPandas. The time variables cause errors in the function, so we will drop those variables first. \n",
"\n",
"Note that for large datasets like ICESat-2, loading the map might take a while."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7178fecc-6ca1-42a1-98d4-08f57c050daa",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Drop time variables that would cause errors in explore() function\n",
"is2_gdf = is2_gdf.drop(['data_start_utc','data_end_utc','delta_time','atlas_sdp_gps_epoch'], axis=1)"
]
},
{
@@ -351,8 +452,8 @@
"outputs": [],
"source": [
"# Plot ICESat-2 track (medium/high confidence photons only) on a map\n",
"m = is2_gdf[is2_gdf['signal_conf']>=3].explore(tiles='Esri.WorldImagery',\n",
" name='ICESat-2')\n",
"m = is2_gdf[is2_gdf['signal_conf_ph']>=3].explore(column='rgt', tiles='Esri.WorldImagery',\n",
" name='ICESat-2')\n",
"\n",
"# Add Argo float locations to map\n",
"argo_gdf.explore(m=m, name='Argo', marker_kwds={\"radius\": 6}, color='red')"
@@ -408,7 +509,7 @@
"outputs": [],
"source": [
"# Only consider ICESat-2 signal photons\n",
"is2_pd_signal = is2_pd[is2_pd['signal_conf']>0]\n",
"is2_pd_signal = is2_pd_ocean[is2_pd_ocean['signal_conf_ph']>=0]\n",
"\n",
"## Multi-panel plot showing ICESat-2 and Argo data\n",
"\n",
@@ -425,7 +526,7 @@
"world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))\n",
"world.plot(ax=ax1, color='0.8', edgecolor='black')\n",
"argo_df.plot.scatter(ax=ax1, x='lon', y='lat', s=25.0, c='green', zorder=3, alpha=0.3)\n",
"is2_pd.plot.scatter(ax=ax1, x='lon', y='lat', s=10.0, zorder=2, alpha=0.3)\n",
"is2_pd_signal.plot.scatter(ax=ax1, x='lon_ph', y='lat_ph', s=10.0, zorder=2, alpha=0.3)\n",
"ax1.plot(lons, lats, linewidth=1.5, color='orange', zorder=2)\n",
"#df.plot(ax=ax2, x='lon', y='lat', marker='o', color='red', markersize=2.5, zorder=3)\n",
"ax1.set_xlim(-160,-100)\n",
@@ -436,7 +537,7 @@
"\n",
"# Plot Zoomed View of Ground Tracks\n",
"argo_df.plot.scatter(ax=ax2, x='lon', y='lat', s=50.0, c='green', zorder=3, alpha=0.3)\n",
"is2_pd.plot.scatter(ax=ax2, x='lon', y='lat', s=10.0, zorder=2, alpha=0.3)\n",
"is2_pd_signal.plot.scatter(ax=ax2, x='lon_ph', y='lat_ph', s=10.0, zorder=2, alpha=0.3)\n",
"ax2.plot(lons, lats, linewidth=1.5, color='orange', zorder=1)\n",
"ax2.scatter(-151.98956, 34.43885, color='orange', marker='^', s=80, zorder=4)\n",
"ax2.set_xlim(min(lons) - lon_margin, max(lons) + lon_margin)\n",
@@ -446,10 +547,10 @@
"ax2.set_ylabel('Latitude', fontsize=18)\n",
"\n",
"# Plot ICESat-2 along-track vertical profile. A dotted line notes the location of a nearby Argo float\n",
"is2 = ax3.scatter(is2_pd_signal['lat'], is2_pd_signal['height'], s=0.1)\n",
"is2 = ax3.scatter(is2_pd_signal['lat_ph'], is2_pd_signal['h_ph']+13.1, s=0.1)\n",
"ax3.axvline(34.43885, linestyle='--', linewidth=3, color='black')\n",
"ax3.set_xlim([34.3, 34.5])\n",
"ax3.set_ylim([-15, 5])\n",
"ax3.set_ylim([-20, 5])\n",
"ax3.set_xlabel('Latitude', fontsize=18)\n",
"ax3.set_ylabel('Approx. IS-2 Depth [m]', fontsize=16)\n",
"ax3.set_yticklabels(['15', '10', '5', '0', '-5'])\n",
@@ -467,6 +568,14 @@
"# Save figure\n",
"#plt.savefig('/icepyx/quest/figures/is2_argo_figure.png', dpi=500)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b6548e2-0662-4c8b-a251-55ca63aff99b",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -485,7 +594,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
"version": "3.10.12"
}
},
"nbformat": 4,
