Reorganize the documentation #521

Merged
merged 3 commits into from
Sep 16, 2022
16 changes: 9 additions & 7 deletions docs/environment.yml → ci/environment-docs.yml
@@ -4,27 +4,29 @@ channels:
- nodefaults
dependencies:
- cftime
- furo
- distributed
- ecgtools
- fsspec>=2022.7.0
- gcsfs
- intake>=0.6.6
- jupyterlab
- matplotlib
- myst-nb
- pip
- pydantic>=1.9
- python-graphviz
- python=3.9
- python=3.10
- s3fs
- fsspec>=2022.7.0
- intake>=0.6.6
- pydantic>=1.9
- sphinx
- sphinx-copybutton
- sphinx-design
- watermark
- xarray-datatree
- xarray-datatree>=0.0.9
- xarray>=2022.06
- zarr>=2.12
- pip:
- furo>=2022.09.15
- tornado>=6.2
- sphinxext-opengraph
- autodoc_pydantic
- -r ../requirements.txt
- -e ..
1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -34,6 +34,7 @@

nb_execution_mode = 'cache'
nb_execution_timeout = 600
nb_execution_raise_on_error = True

extlinks = {
'issue': ('https://github.com/intake/intake-esm/issues/%s', 'GH#'),
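
For readers unfamiliar with Sphinx `extlinks`, the `'issue'` entry above pairs a URL template with a caption prefix. The sketch below imitates that substitution in plain Python. It is only an illustration, not Sphinx's implementation: the helper name `expand_extlink` is hypothetical, and it assumes the older prefix-style caption, where `'GH#'` is prepended to the role target.

```python
from __future__ import annotations

# Same entry as in conf.py; everything else here is illustrative.
EXTLINKS = {
    "issue": ("https://github.com/intake/intake-esm/issues/%s", "GH#"),
}


def expand_extlink(role: str, target: str) -> tuple[str, str]:
    """Hypothetical helper: return (url, caption) for a role such as :issue:`521`."""
    url_template, prefix = EXTLINKS[role]
    # The target replaces %s in the URL and is appended to the caption prefix.
    return url_template % target, prefix + target


url, caption = expand_extlink("issue", "521")
print(url)
print(caption)
```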
8 changes: 0 additions & 8 deletions docs/source/explanation/index.md

This file was deleted.

@@ -12,7 +12,7 @@ kernelspec:
```{code-cell} ipython3
import intake

url = "https://gist.githubusercontent.com/andersy005/7f416e57acd8319b20fc2b88d129d2b8/raw/987b4b336d1a8a4f9abec95c23eed3bd7c63c80e/pangeo-gcp-subset.json"
url = "https://raw.githubusercontent.com/intake/intake-esm/main/tutorial-catalogs/GOOGLE-CMIP6.json"
cat = intake.open_esm_datastore(url)
cat
```
@@ -16,6 +16,10 @@ import intake

url = "https://ncar-cesm-lens.s3-us-west-2.amazonaws.com/catalogs/aws-cesm1-le.json"
cat = intake.open_esm_datastore(url)
cat
```

```{code-cell} ipython3
cat.df.head()
```

@@ -24,7 +28,7 @@ By default, the
and is case sensitive:

```{code-cell} ipython3
cat.search(experiment="20C", long_name="wind").df
cat.search(experiment="20C", long_name="wind")
```

As you can see, the example above returns an empty catalog.
@@ -40,7 +44,7 @@ a given column. Let's search for:
- all entries whose variable long name **contains** `wind`

```{code-cell} ipython3
cat.search(experiment="20C", long_name="wind*").df
cat.search(experiment="20C", long_name="wind*")
```

Now, let's search for:
@@ -49,7 +53,12 @@ Now, let's search for:
- all entries whose variable long name **starts** with `wind`

```{code-cell} ipython3
cat.search(experiment="20C", long_name="^wind").df
cat_subset = cat.search(experiment="20C", long_name="^wind")
cat_subset
```

```{code-cell} ipython3
cat_subset.df
```
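
The three queries above differ only in how the `long_name` value is matched. As a rough, self-contained sketch on toy data (not the real catalog, and not intake-esm's actual implementation), the snippet below assumes that a plain value is compared for exact, case-sensitive equality, while a value containing regex characters is tested with a contains-style regular-expression search. The helper `matches` and the sample long names are hypothetical.

```python
import re

# Made-up long names standing in for a catalog column.
long_names = ["Zonal wind", "wind speed at 10m", "Surface temperature"]

# Characters that make a query look like a regular expression (assumption).
REGEX_CHARS = set("^$*+?.()[]{}|\\")


def matches(value: str, query: str) -> bool:
    """Exact, case-sensitive match unless the query contains regex characters."""
    if REGEX_CHARS & set(query):
        return re.search(query, value) is not None
    return value == query


exact = [name for name in long_names if matches(name, "wind")]
contains = [name for name in long_names if matches(name, "wind*")]
starts = [name for name in long_names if matches(name, "^wind")]

print(exact)     # entries that are exactly "wind"
print(contains)  # entries containing "win" followed by zero or more "d"
print(starts)    # entries that begin with "wind"
```

Under these assumptions the exact query finds nothing, which mirrors the empty catalog returned by the first search, while the anchored pattern `^wind` keeps only names that start with the word.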

```{code-cell} ipython3
17 changes: 0 additions & 17 deletions docs/source/how-to/index.md

This file was deleted.

27 changes: 10 additions & 17 deletions docs/source/how-to/manipulate-catalog.md
@@ -17,7 +17,7 @@ The in-memory representation of an Earth System Model (ESM) catalog is a pandas
dataframe, and is accessible via the `.df` property:

```{code-cell} ipython3
url = "https://gist.githubusercontent.com/andersy005/7f416e57acd8319b20fc2b88d129d2b8/raw/987b4b336d1a8a4f9abec95c23eed3bd7c63c80e/pangeo-gcp-subset.json"
url = "https://raw.githubusercontent.com/intake/intake-esm/main/tutorial-catalogs/GOOGLE-CMIP6.json"
cat = intake.open_esm_datastore(url)
cat.df.head()
```
@@ -31,8 +31,7 @@ Let's say we are interested in datasets with the following attributes:

- `experiment_id=["historical"]`
- `table_id="Amon"`
- `variable_id="tas"`
- `source_id=['TaiESM1', 'AWI-CM-1-1-MR', 'AWI-ESM-1-1-LR', 'BCC-CSM2-MR', 'BCC-ESM1', 'CAMS-CSM1-0', 'CAS-ESM2-0', 'UKESM1-0-LL']`
- `variable_id="ua"`

In addition to these attributes, **we are interested in the first ensemble
member (member_id) of each model (source_id) only**.
@@ -47,17 +46,7 @@ We can run a query against the catalog:
cat_subset = cat.search(
experiment_id=["historical"],
table_id="Amon",
variable_id="tas",
source_id=[
"TaiESM1",
"AWI-CM-1-1-MR",
"AWI-ESM-1-1-LR",
"BCC-CSM2-MR",
"BCC-ESM1",
"CAMS-CSM1-0",
"CAS-ESM2-0",
"UKESM1-0-LL",
],
variable_id="ua",
)
cat_subset
```
@@ -83,6 +72,10 @@ df = grouped.first().reset_index()
df.groupby("source_id")["member_id"].nunique()
```

```{code-cell} ipython3
df
```
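
To make the first-member selection concrete, here is a small sketch on a toy dataframe. The rows and model names are made up for illustration. Note that `first()` takes whichever member appears first within each `source_id` group, so the result depends on row order.

```python
import pandas as pd

# Made-up rows that mimic a few catalog columns.
toy = pd.DataFrame(
    {
        "source_id": ["MODEL-A", "MODEL-A", "MODEL-B", "MODEL-B", "MODEL-B"],
        "member_id": ["r1i1p1f1", "r2i1p1f1", "r1i1p1f1", "r2i1p1f1", "r3i1p1f1"],
        "variable_id": ["ua"] * 5,
    }
)

# Keep the first row of each model, then restore source_id as a regular column.
first_members = toy.groupby("source_id").first().reset_index()

# Sanity check: every model should now have exactly one ensemble member.
members_per_model = first_members.groupby("source_id")["member_id"].nunique()

print(first_members)
print(members_per_model.to_dict())
```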

### Step 3: Attach the new dataframe to our catalog object

```{code-cell} ipython3
@@ -93,18 +86,18 @@ cat_subset
Let's load the subsetted catalog into a dictionary of datasets:

```{code-cell} ipython3
dsets = cat_subset.to_dataset_dict(xarray_open_kwargs={"consolidated": True})
dsets = cat_subset.to_dataset_dict()
[key for key in dsets]
```

```{code-cell} ipython3
dsets["CMIP.CAS.CAS-ESM2-0.historical.Amon.gn"]
dsets["CMIP.IPSL.IPSL-CM6A-LR.historical.Amon.gr"]
```
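
The dictionary keys are dot-separated strings. Assuming, as the key `CMIP.IPSL.IPSL-CM6A-LR.historical.Amon.gr` suggests, that the parts correspond to `activity_id`, `institution_id`, `source_id`, `experiment_id`, `table_id`, and `grid_label` (this ordering is an assumption based on CMIP6 naming, not something the diff states), a key can be unpacked as sketched below. The helper `parse_key` is hypothetical and is not part of intake-esm.

```python
# Assumed ordering of the key components (not confirmed by the source).
KEY_FIELDS = (
    "activity_id",
    "institution_id",
    "source_id",
    "experiment_id",
    "table_id",
    "grid_label",
)


def parse_key(key: str) -> dict:
    """Hypothetical helper: split a dataset key into named components."""
    parts = key.split(".")
    if len(parts) != len(KEY_FIELDS):
        # A component containing "." would break this naive split.
        raise ValueError(f"expected {len(KEY_FIELDS)} parts, got {len(parts)}: {key!r}")
    return dict(zip(KEY_FIELDS, parts))


info = parse_key("CMIP.IPSL.IPSL-CM6A-LR.historical.Amon.gr")
print(info["source_id"], info["grid_label"])
```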

```{code-cell} ipython3
---
tags: [hide-input, hide-output]
---
import intake_esm # just to display version information
import intake_esm
intake_esm.show_versions()
```
114 changes: 107 additions & 7 deletions docs/source/index.md
@@ -1,33 +1,133 @@
# Welcome to Intake-esm's documentation!
---
sd_hide_title: true
---

# Overview

::::{grid}
:reverse:
:gutter: 3 4 4 4
:margin: 1 2 1 2

:::{grid-item}
:columns: 12 4 4 4

```{image} ../_static/images/NSF_4-Color_bitmap_Logo.png
:width: 200px
:class: sd-m-auto
```

`intake-esm` is a data cataloging utility built on top of [intake](https://github.com/intake/intake), [pandas](https://pandas.pydata.org/), and [xarray](https://xarray.pydata.org/en/stable/), and it's pretty awesome!
:::

:::{grid-item}
:columns: 12 8 8 8
:child-align: justify
:class: sd-fs-5

```{rubric} Intake-ESM

```

A data cataloging utility built on top of [intake](https://github.com/intake/intake), [pandas](https://pandas.pydata.org/), and [xarray](https://xarray.pydata.org/en/stable/), and it's pretty awesome!

```{button-ref} how-to/install-intake-esm
:ref-type: doc
:color: primary
:class: sd-rounded-pill

Get Started
```

:::

::::

---

## Motivation

Computer simulations of the Earth’s climate and weather generate huge amounts of data.
These data are often persisted on HPC systems or in the cloud across multiple data
assets in a variety of formats ([netCDF](https://www.unidata.ucar.edu/software/netcdf/), [zarr](https://zarr.readthedocs.io/en/stable/), etc.). Finding, investigating,
and loading these data assets into compute-ready data containers costs time and effort.
The data user needs to know which data sets are available, and the attributes describing
each data set, before loading a specific data set and analyzing it.

Finding, investigating, and loading these assets into data array containers
such as xarray can be a daunting task due to the large number of files
a user may be interested in. Intake-esm aims to address these issues by
providing the necessary functionality for searching, discovering, and accessing or loading data.

---

## Get in touch

- If you encounter any errors or problems with **intake-esm**, please open an issue at the GitHub [main repository](http://github.com/intake/intake-esm/issues).
- If you have a question like “How do I find x?”, ask on [GitHub discussions](https://github.com/intake/intake-esm/discussions). Please include a self-contained reproducible example if possible.

---

```{toctree}
---
maxdepth: 1
caption: Tutorials
hidden:
---
tutorials/loading-cmip6-data.md
```

```{toctree}
---
maxdepth: 2
caption: How to Guides and Examples
hidden:
---

how-to/install-intake-esm.md
how-to/build-a-catalog-from-timeseries-files.md
how-to/define-and-use-derived-variable-registry.md
how-to/use-catalogs-with-assets-containing-multiple-variables.md
how-to/filter-catalog-by-substring-and-regex-criteria.md
how-to/enforce-search-query-criteria-via-require-all-on.md
how-to/manipulate-catalog.md
```

```{toctree}
---
maxdepth: 2
caption: Reference
hidden:
---

reference/esm-catalog-spec.md
reference/api.md
reference/faq.md
reference/cmip_ap.md


tutorials/index.md
how-to/index.md
explanation/index.md
reference/index.md

```

```{toctree}
---
maxdepth: 2
caption: Contribute to intake-esm
caption: Development
hidden:
---

contributing.md
reference/changelog.md

```

```{toctree}
---
maxdepth: 2
caption: Project Links
hidden:
---


GitHub Repo <https://github.com/intake/intake-esm>
GitHub discussions <https://github.com/intake/intake-esm/discussions>

11 changes: 0 additions & 11 deletions docs/source/reference/index.md

This file was deleted.
