Collection of notebooks showing how to download and process World Ocean Atlas (WOA) data with R and Python. This workflow includes scripts showing how to transform netCDF files into cloud-optimised formats such as zarr for gridded data and parquet for tabular data.
All files listed below can be found under the scripts folder. The first two digits in the script name indicate the order in which the scripts should be run. The letter following the two digits indicates whether the script was developed in Python (P) or R (R):
- Access and clip WOA data using Python: In this Jupyter notebook, we use Python to access WOA18 data from NOAA's THREDDS server, clip it using a shapefile, and save the clipped data for our area of interest.
- Download WOA data using R: In this R script, we download WOA23 data from NOAA's THREDDS server.
- Transform netCDF files into zarr: In this Jupyter notebook, we use Python to transform the WOA23 netCDF files into zarr, a file format for gridded data that is optimised for cloud computing.
- Extract WOA23 data for FishMIP regions: In this Jupyter notebook, we use Python to extract WOA23 data from the cloud-optimised files produced by the 02P_WOA_netcdf_to_zarr script.
- Getting metadata from WOA23 files: In this Jupyter notebook, we use Python to extract metadata from the original WOA23 netCDF files, which can be used at a later date to interpret data.
- Calculating monthly timeseries using R: In this R script, we calculate the monthly timeseries from WOA data.
- Regridding WOA data: In this Jupyter notebook, we use Python to show how to calculate a climatology, interpolate depth bins, and regrid data.
- Useful functions: This Python script contains custom-made functions that are used in the Jupyter notebooks.
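
The naming convention described above (two digits for run order, then P or R for the language) can be read programmatically. The snippet below is a minimal sketch, not part of the repository: the function `parse_script_name`, the `ScriptInfo` type, and the assumption that an underscore follows the language letter (as in 02P_WOA_netcdf_to_zarr) are illustrative only.

```python
import re
from pathlib import Path
from typing import NamedTuple


class ScriptInfo(NamedTuple):
    """Hypothetical container for the parts of a script name."""
    order: int      # run order taken from the first two digits
    language: str   # "Python" or "R", taken from the letter after the digits
    label: str      # remainder of the name after the underscore


# Assumed pattern: two digits, one letter (P or R), an underscore, then a label.
_PATTERN = re.compile(r"^(\d{2})([PR])_(.+)$")
_LANGUAGES = {"P": "Python", "R": "R"}


def parse_script_name(name: str) -> ScriptInfo:
    """Split a script name such as '02P_WOA_netcdf_to_zarr' into its parts."""
    stem = Path(name).stem  # drops a file extension if one is present
    match = _PATTERN.match(stem)
    if match is None:
        raise ValueError(f"{name!r} does not follow the assumed naming pattern")
    order, letter, label = match.groups()
    return ScriptInfo(int(order), _LANGUAGES[letter], label)


def sort_by_run_order(names: list[str]) -> list[str]:
    """Return script names sorted by their two-digit run order."""
    return sorted(names, key=lambda n: parse_script_name(n).order)
```

For example, `parse_script_name("02P_WOA_netcdf_to_zarr")` yields order 2 and language "Python". Names that do not follow the pattern raise a `ValueError`, so they need to be filtered out before sorting.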
You can contribute a Python or R script to this repository by submitting a Pull Request.
If you have an idea for a new script, or if you spotted an error, please create an Issue or email us.