From cf47162e89db759dc0030e5df9d5f02f3e143413 Mon Sep 17 00:00:00 2001 From: Tasha Snow Date: Fri, 5 Jul 2024 20:34:33 -0600 Subject: [PATCH] Delete book/external/Earthdata_Cloud__Single_File__Direct_S3_Access_NetCDF4_Example.ipynb --- ...le__Direct_S3_Access_NetCDF4_Example.ipynb | 306 ------------------ 1 file changed, 306 deletions(-) delete mode 100644 book/external/Earthdata_Cloud__Single_File__Direct_S3_Access_NetCDF4_Example.ipynb diff --git a/book/external/Earthdata_Cloud__Single_File__Direct_S3_Access_NetCDF4_Example.ipynb b/book/external/Earthdata_Cloud__Single_File__Direct_S3_Access_NetCDF4_Example.ipynb deleted file mode 100644 index c4197fb..0000000 --- a/book/external/Earthdata_Cloud__Single_File__Direct_S3_Access_NetCDF4_Example.ipynb +++ /dev/null @@ -1,306 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "0b2c1bfc", - "metadata": {}, - "source": [ - "# NetCDF/HDF5 Direct S3\n", - "\n", - "imported on: **2022-11-28**\n", - "\n", - "

This notebook is from NASA's OpenScapes NetCDF/HDF5 Direct S3 access notebook

\n", - "\n", - "> The original source for this document is [https://github.com/NASA-Openscapes/earthdata-cloud-cookbook](https://github.com/NASA-Openscapes/earthdata-cloud-cookbook)" - ] - }, - { - "cell_type": "markdown", - "id": "blocked-calibration", - "metadata": {}, - "source": [ - "# Accessing a NetCDF4/HDF5 File - S3 Direct Access" - ] - }, - { - "cell_type": "markdown", - "id": "positive-copper", - "metadata": {}, - "source": [ - "## Summary\n", - "\n", - "In this notebook, we will access monthly sea surface height from ECCO V4r4 (10.5067/ECG5D-SSH44). The data are provided as a time series of monthly netCDFs on a 0.5-degree latitude/longitude grid.\n", - "\n", - "We will access a single netCDF file from inside the AWS cloud (us-west-2 region, specifically) and load it into Python as an `xarray` `dataset`. This approach leverages S3 native protocols for efficient access to the data.\n", - "\n", - "## Requirements\n", - "\n", - "### 1. AWS instance running in us-west-2\n", - "\n", - "NASA Earthdata Cloud data in S3 can be directly accessed via temporary credentials; this access is limited to requests made within the US West (Oregon) (code: us-west-2) AWS region.\n", - "\n", - "### 2. Earthdata Login\n", - "\n", - "An Earthdata Login account is required to access data, as well as discover restricted data, from the NASA Earthdata system. Thus, to access NASA data, you need Earthdata Login. Please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. This account is free to create and only takes a moment to set up.\n", - "\n", - "### 3. netrc File\n", - "\n", - "You will need a netrc file containing your NASA Earthdata Login credentials in order to execute the notebooks. A netrc file can be created manually within text editor and saved to your home directory. For additional information see: [Authentication for NASA Earthdata](https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/04_NASA_Earthdata_Authentication.html#authentication-via-netrc-file).\n", - "\n", - "## Learning Objectives\n", - "\n", - "- how to retrieve temporary S3 credentials for in-region direct S3 bucket access\n", - "- how to perform in-region direct access of ECCO_L4_SSH_05DEG_MONTHLY_V4R4 data in S3\n", - "- how to plot the data" - ] - }, - { - "cell_type": "markdown", - "id": "found-priority", - "metadata": {}, - "source": [ - "---" - ] - }, - { - "cell_type": "markdown", - "id": "exclusive-domestic", - "metadata": {}, - "source": [ - "## Import Packages" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "indian-difference", - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "import matplotlib.pyplot as plt\n", - "import os\n", - "import requests\n", - "import s3fs\n", - "from osgeo import gdal\n", - "import xarray as xr\n", - "import hvplot.xarray\n", - "import holoviews as hv" - ] - }, - { - "cell_type": "markdown", - "id": "phantom-nurse", - "metadata": {}, - "source": [ - "## Get Temporary AWS Credentials\n", - "\n", - "Direct S3 access is achieved by passing NASA supplied temporary credentials to AWS so we can interact with S3 objects from applicable Earthdata Cloud buckets. For now, each NASA DAAC has different AWS credentials endpoints. Below are some of the credential endpoints to various DAACs:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "copyrighted-python", - "metadata": {}, - "outputs": [], - "source": [ - "s3_cred_endpoint = {\n", - " 'podaac':'https://archive.podaac.earthdata.nasa.gov/s3credentials',\n", - " 'gesdisc': 'https://data.gesdisc.earthdata.nasa.gov/s3credentials',\n", - " 'lpdaac':'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials',\n", - " 'ornldaac': 'https://data.ornldaac.earthdata.nasa.gov/s3credentials',\n", - " 'ghrcdaac': 'https://data.ghrc.earthdata.nasa.gov/s3credentials'\n", - "}" - ] - }, - { - "cell_type": "markdown", - "id": "ee898746-9791-45ab-bce2-d63f72e1c523", - "metadata": {}, - "source": [ - "Create a function to make a request to an endpoint for temporary credentials. Remember, each DAAC has their own endpoint and credentials are not usable for cloud data from other DAACs." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "vocational-biotechnology", - "metadata": {}, - "outputs": [], - "source": [ - "def get_temp_creds(provider):\n", - " return requests.get(s3_cred_endpoint[provider]).json()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "sapphire-therapy", - "metadata": {}, - "outputs": [], - "source": [ - "temp_creds_req = get_temp_creds('podaac')\n", - "#temp_creds_req" - ] - }, - { - "cell_type": "markdown", - "id": "86dd2671-206a-4c28-aaf4-7491adcbead3", - "metadata": {}, - "source": [ - "## Set up an `s3fs` session for Direct Access\n", - "\n", - "`s3fs` sessions are used for authenticated access to s3 bucket and allows for typical file-system style operations. Below we create session by passing in the temporary credentials we recieved from our temporary credentials endpoint." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "cross-occasions", - "metadata": {}, - "outputs": [], - "source": [ - "fs_s3 = s3fs.S3FileSystem(anon=False, \n", - " key=temp_creds_req['accessKeyId'], \n", - " secret=temp_creds_req['secretAccessKey'], \n", - " token=temp_creds_req['sessionToken'])" - ] - }, - { - "cell_type": "markdown", - "id": "55f5ca85-be5f-4f3e-ac0d-3b8c18164680", - "metadata": {}, - "source": [ - "In this example we're interested in the ECCO data collection from NASA's PO.DAAC in Earthdata Cloud. Below we specify the s3 URL to the data asset in Earthdata Cloud. This URL can be found via [Earthdata Search](../tutorials/01_Earthdata_Search.md) or programmatically through the [CMR](https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/01_Data_Discovery_CMR.html) and [CMR-STAC](https://nasa-openscapes.github.io/2021-Cloud-Hackathon/tutorials/02_Data_Discovery_CMR-STAC_API.html) APIs." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "postal-renewal", - "metadata": {}, - "outputs": [], - "source": [ - "s3_url = 's3://podaac-ops-cumulus-protected/ECCO_L4_SSH_05DEG_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-01_ECCO_V4r4_latlon_0p50deg.nc'" - ] - }, - { - "cell_type": "markdown", - "id": "1f89d5e3-5f02-4f3e-9f4f-fe13f9642c7e", - "metadata": {}, - "source": [ - "## Direct In-region Access\n", - "\n", - "Open with the netCDF file using the s3fs package, then load the cloud asset into an `xarray` `dataset`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "loving-boost", - "metadata": {}, - "outputs": [], - "source": [ - "s3_file_obj = fs_s3.open(s3_url, mode='rb')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "wired-sequence", - "metadata": {}, - "outputs": [], - "source": [ - "ssh_ds = xr.open_dataset(s3_file_obj, engine='h5netcdf')\n", - "ssh_ds" - ] - }, - { - "cell_type": "markdown", - "id": "f771a000-7558-43d8-b519-edb954ca484f", - "metadata": {}, - "source": [ - "Get the `SSH` variable as an `xarray` `dataarray`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6d44bf2c-717e-44bd-bc7c-c58085a50c07", - "metadata": {}, - "outputs": [], - "source": [ - "ssh_da = ssh_ds.SSH\n", - "ssh_da" - ] - }, - { - "cell_type": "markdown", - "id": "b2ae7e5f-23c8-4f92-8418-e374a570e063", - "metadata": {}, - "source": [ - "Plot the `SSH` `dataarray` for time **2015-01-16T12:00:00** using `hvplot`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "compound-uganda", - "metadata": {}, - "outputs": [], - "source": [ - "ssh_da.hvplot.image(x='longitude', y='latitude', cmap='Spectral_r', geo=True, tiles='ESRI', global_extent=True)" - ] - }, - { - "cell_type": "markdown", - "id": "52cbc9cb-49fa-4b32-985c-46ce85768a9d", - "metadata": {}, - "source": [ - "---" - ] - }, - { - "cell_type": "markdown", - "id": "f0eb5ba5-4804-4bc6-b9e4-d38e404a4ec6", - "metadata": {}, - "source": [ - "## Resources\n", - "\n", - "[Direct access to ECCO data in S3 (from us-west-2)](https://github.com/podaac/ECCO/blob/main/Data_Access/cloud_direct_access_s3.ipynb)\n", - "\n", - "[Data_Access__Direct_S3_Access__PODAAC_ECCO_SSH](https://github.com/NASA-Openscapes/2021-Cloud-Hackathon/blob/main/tutorials/Additional_Resources__Data_Access__Direct_S3_Access__PODAAC_ECCO_SSH.ipynb) using CMR-STAC API to retrieve S3 links" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3f582cd7-daac-4223-8358-61bd04380d62", - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "py39", - "language": "python", - "name": "py39" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.7" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}