Commit

Reorganize the documentation (#521)
andersy005 authored Sep 16, 2022
1 parent 291bbb5 commit 09ad431
Showing 12 changed files with 152 additions and 92 deletions.
16 changes: 9 additions & 7 deletions docs/environment.yml → ci/environment-docs.yml
@@ -4,27 +4,29 @@ channels:
- nodefaults
dependencies:
- cftime
- furo
- distributed
- ecgtools
- fsspec>=2022.7.0
- gcsfs
- intake>=0.6.6
- jupyterlab
- matplotlib
- myst-nb
- pip
- pydantic>=1.9
- python-graphviz
- python=3.9
- python=3.10
- s3fs
- fsspec>=2022.7.0
- intake>=0.6.6
- pydantic>=1.9
- sphinx
- sphinx-copybutton
- sphinx-design
- watermark
- xarray-datatree
- xarray-datatree>=0.0.9
- xarray>=2022.06
- zarr>=2.12
- pip:
- furo>=2022.09.15
- tornado>=6.2
- sphinxext-opengraph
- autodoc_pydantic
- -r ../requirements.txt
- -e ..
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -34,6 +34,7 @@

nb_execution_mode = 'cache'
nb_execution_timeout = 600
nb_execution_raise_on_error = True

extlinks = {
'issue': ('https://github.com/intake/intake-esm/issues/%s', 'GH#'),
8 changes: 0 additions & 8 deletions docs/source/explanation/index.md

This file was deleted.

@@ -12,7 +12,7 @@ kernelspec:
```{code-cell} ipython3
import intake
url = "https://gist.githubusercontent.com/andersy005/7f416e57acd8319b20fc2b88d129d2b8/raw/987b4b336d1a8a4f9abec95c23eed3bd7c63c80e/pangeo-gcp-subset.json"
url = "https://raw.githubusercontent.com/intake/intake-esm/main/tutorial-catalogs/GOOGLE-CMIP6.json"
cat = intake.open_esm_datastore(url)
cat
```
@@ -16,6 +16,10 @@ import intake
url = "https://ncar-cesm-lens.s3-us-west-2.amazonaws.com/catalogs/aws-cesm1-le.json"
cat = intake.open_esm_datastore(url)
cat
```

```{code-cell} ipython3
cat.df.head()
```

@@ -24,7 +28,7 @@ By default, the
and is case sensitive:

```{code-cell} ipython3
cat.search(experiment="20C", long_name="wind").df
cat.search(experiment="20C", long_name="wind")
```

As you can see, the example above returns an empty catalog.
@@ -40,7 +44,7 @@ a given column. Let's search for:
- all entries whose variable long name **contains** `wind`

```{code-cell} ipython3
cat.search(experiment="20C", long_name="wind*").df
cat.search(experiment="20C", long_name="wind*")
```

Now, let's search for:
@@ -49,7 +53,12 @@ Now, let's search for:
- all entries whose variable long name **starts** with `wind`

```{code-cell} ipython3
cat.search(experiment="20C", long_name="^wind").df
cat_subset = cat.search(experiment="20C", long_name="^wind")
cat_subset
```

```{code-cell} ipython3
cat_subset.df
```
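
The three matching modes used above (exact match, substring, and "starts with") can be sketched with Python's standard `re` module alone. This is a minimal illustration, assuming that pattern-based searches behave like a regular-expression search applied to each value; the sample long names and the helper `search_long_names` are hypothetical and are not part of intake-esm.

```python
import re

# Hypothetical values standing in for a catalog's `long_name` column.
long_names = [
    "Zonal wind",
    "Meridional wind",
    "wind stress",
    "Wind speed",
    "Surface temperature",
]


def search_long_names(values, pattern, regex=True):
    """Return the values that match `pattern` (illustrative helper).

    With regex=False the comparison is an exact, case-sensitive match.
    With regex=True the pattern is applied with re.search, so it may match
    anywhere in the value unless it is anchored with `^` or `$`.
    """
    if not regex:
        return [value for value in values if value == pattern]
    compiled = re.compile(pattern)
    return [value for value in values if compiled.search(value)]


# Exact match: no value is literally equal to "wind".
exact = search_long_names(long_names, "wind", regex=False)

# "wind*" as a regex means "win" followed by zero or more "d" characters.
contains = search_long_names(long_names, "wind*")

# "^wind" only matches values that start with lowercase "wind".
starts = search_long_names(long_names, "^wind")

print(exact)
print(contains)
print(starts)
```

Because the matching is case sensitive, `"Wind speed"` is not returned by any of the three queries in this sketch.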

```{code-cell} ipython3
17 changes: 0 additions & 17 deletions docs/source/how-to/index.md

This file was deleted.

27 changes: 10 additions & 17 deletions docs/source/how-to/manipulate-catalog.md
@@ -17,7 +17,7 @@ The in-memory representation of an Earth System Model (ESM) catalog is a pandas
dataframe, and is accessible via the `.df` property:

```{code-cell} ipython3
url = "https://gist.githubusercontent.com/andersy005/7f416e57acd8319b20fc2b88d129d2b8/raw/987b4b336d1a8a4f9abec95c23eed3bd7c63c80e/pangeo-gcp-subset.json"
url = "https://raw.githubusercontent.com/intake/intake-esm/main/tutorial-catalogs/GOOGLE-CMIP6.json"
cat = intake.open_esm_datastore(url)
cat.df.head()
```
@@ -31,8 +31,7 @@ Let's say we are interested in datasets with the following attributes:

- `experiment_id=["historical"]`
- `table_id="Amon"`
- `variable_id="tas"`
- `source_id=['TaiESM1', 'AWI-CM-1-1-MR', 'AWI-ESM-1-1-LR', 'BCC-CSM2-MR', 'BCC-ESM1', 'CAMS-CSM1-0', 'CAS-ESM2-0', 'UKESM1-0-LL']`
- `variable_id="ua"`

In addition to these attributes, **we are interested in the first ensemble
member (member_id) of each model (source_id) only**.
@@ -47,17 +46,7 @@ We can run a query against the catalog:
cat_subset = cat.search(
experiment_id=["historical"],
table_id="Amon",
variable_id="tas",
source_id=[
"TaiESM1",
"AWI-CM-1-1-MR",
"AWI-ESM-1-1-LR",
"BCC-CSM2-MR",
"BCC-ESM1",
"CAMS-CSM1-0",
"CAS-ESM2-0",
"UKESM1-0-LL",
],
variable_id="ua",
)
cat_subset
```
Expand All @@ -83,6 +72,10 @@ df = grouped.first().reset_index()
df.groupby("source_id")["member_id"].nunique()
```

```{code-cell} ipython3
df
```
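
The "first ensemble member per model" selection can also be sketched without pandas, which may help when reasoning about what the grouping step keeps. The rows and the helper `first_row_per_model` below are hypothetical. The sketch assumes there are no missing values, in which case it roughly mirrors `grouped.first()`: "first" means first in row order, not the lowest realization number.

```python
# Hypothetical catalog rows; a real catalog has many more columns and rows.
rows = [
    {"source_id": "MODEL-A", "member_id": "r2i1p1f1"},
    {"source_id": "MODEL-A", "member_id": "r1i1p1f1"},
    {"source_id": "MODEL-B", "member_id": "r1i1p1f1"},
    {"source_id": "MODEL-B", "member_id": "r3i1p1f1"},
]


def first_row_per_model(rows):
    """Keep the first row encountered for each `source_id` (illustrative)."""
    selected = {}
    for row in rows:
        # setdefault only stores the row if this source_id was not seen yet.
        selected.setdefault(row["source_id"], row)
    return list(selected.values())


subset = first_row_per_model(rows)

# Count distinct member_id values per model, analogous to the nunique() check.
members = {}
for row in subset:
    members.setdefault(row["source_id"], set()).add(row["member_id"])
members_per_model = {model: len(ids) for model, ids in members.items()}

print(subset)
print(members_per_model)
```

If the lowest-numbered member is wanted rather than the first row, the rows would need to be sorted before this selection.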

### Step 3: Attach the new dataframe to our catalog object

```{code-cell} ipython3
@@ -93,18 +86,18 @@ cat_subset
Let's load the subsetted catalog into a dictionary of datasets:

```{code-cell} ipython3
dsets = cat_subset.to_dataset_dict(xarray_open_kwargs={"consolidated": True})
dsets = cat_subset.to_dataset_dict()
[key for key in dsets]
```

```{code-cell} ipython3
dsets["CMIP.CAS.CAS-ESM2-0.historical.Amon.gn"]
dsets["CMIP.IPSL.IPSL-CM6A-LR.historical.Amon.gr"]
```
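
The dictionary keys look like several catalog attributes joined with `.`. The sketch below splits one key back into named parts. It assumes that the key is built from the attribute order listed in `groupby_attrs`, which is a guess based on the key shown above and not something this page confirms.

```python
# Assumed attribute order for this CMIP6 catalog (not confirmed by the source).
groupby_attrs = [
    "activity_id",
    "institution_id",
    "source_id",
    "experiment_id",
    "table_id",
    "grid_label",
]

key = "CMIP.IPSL.IPSL-CM6A-LR.historical.Amon.gr"

# Split on "." and pair each piece with its assumed attribute name.
parts = dict(zip(groupby_attrs, key.split(".")))

for name, value in parts.items():
    print(f"{name}: {value}")
```

A split like this only works when none of the attribute values contains a `.` itself.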

```{code-cell} ipython3
---
tags: [hide-input, hide-output]
---
import intake_esm # just to display version information
import intake_esm
intake_esm.show_versions()
```
114 changes: 107 additions & 7 deletions docs/source/index.md
@@ -1,33 +1,133 @@
# Welcome to Intake-esm's documentation!
---
sd_hide_title: true
---

# Overview

::::{grid}
:reverse:
:gutter: 3 4 4 4
:margin: 1 2 1 2

:::{grid-item}
:columns: 12 4 4 4

```{image} ../_static/images/NSF_4-Color_bitmap_Logo.png
:width: 200px
:class: sd-m-auto
```

`intake-esm` is a data cataloging utility built on top of [intake](https://github.com/intake/intake), [pandas](https://pandas.pydata.org/), and [xarray](https://xarray.pydata.org/en/stable/), and it's pretty awesome!
:::

:::{grid-item}
:columns: 12 8 8 8
:child-align: justify
:class: sd-fs-5

```{rubric} Intake-ESM
```

A data cataloging utility built on top of [intake](https://github.com/intake/intake), [pandas](https://pandas.pydata.org/), and [xarray](https://xarray.pydata.org/en/stable/), and it's pretty awesome!

```{button-ref} how-to/install-intake-esm
:ref-type: doc
:color: primary
:class: sd-rounded-pill
Get Started
```

:::

::::

---

## Motivation

Computer simulations of the Earth’s climate and weather generate huge amounts of data.
These data are often persisted on HPC systems or in the cloud across multiple data
assets in a variety of formats ([netCDF](https://www.unidata.ucar.edu/software/netcdf/), [zarr](https://zarr.readthedocs.io/en/stable/), etc.). Finding, investigating,
and loading these data assets into compute-ready data containers costs time and effort.
The data user needs to know which data sets are available, and the attributes describing
each data set, before loading a specific data set and analyzing it.

Finding, investigating, and loading these assets into data array containers
such as xarray can be a daunting task due to the large number of files
a user may be interested in. Intake-esm aims to address these issues by
providing the necessary functionality for searching, discovering, and accessing or loading data.

---

## Get in touch

- If you encounter any errors or problems with **intake-esm**, please open an issue at the GitHub [main repository](http://github.com/intake/intake-esm/issues).
- If you have a question like “How do I find x?”, ask on [GitHub discussions](https://github.com/intake/intake-esm/discussions). Please include a self-contained reproducible example if possible.

---

```{toctree}
---
maxdepth: 1
caption: Tutorials
hidden:
---
tutorials/loading-cmip6-data.md
```

```{toctree}
---
maxdepth: 2
caption: How to Guides and Examples
hidden:
---
how-to/install-intake-esm.md
how-to/build-a-catalog-from-timeseries-files.md
how-to/define-and-use-derived-variable-registry.md
how-to/use-catalogs-with-assets-containing-multiple-variables.md
how-to/filter-catalog-by-substring-and-regex-criteria.md
how-to/enforce-search-query-criteria-via-require-all-on.md
how-to/manipulate-catalog.md
```

```{toctree}
---
maxdepth: 2
caption: Reference
hidden:
---
reference/esm-catalog-spec.md
reference/api.md
reference/faq.md
reference/cmip_ap.md
tutorials/index.md
how-to/index.md
explanation/index.md
reference/index.md
```

```{toctree}
---
maxdepth: 2
caption: Contribute to intake-esm
caption: Development
hidden:
---
contributing.md
reference/changelog.md
```

```{toctree}
---
maxdepth: 2
caption: Project Links
hidden:
---
GitHub Repo <https://github.com/intake/intake-esm>
GitHub discussions <https://github.com/intake/intake-esm/discussions>
File renamed without changes.
11 changes: 0 additions & 11 deletions docs/source/reference/index.md

This file was deleted.
