Fail docs build when there are broken links #535

Closed
wants to merge 27 commits into from
27 commits
b8d0612
start to reorganize Overview page
Feb 20, 2024
02cf2ec
reorganized docs
Mar 19, 2024
cd67868
Merge branch 'main' into update-documentation
Mar 19, 2024
998ea02
Make "Welcome" page single level
Apr 2, 2024
bd0bf69
add user_guide section
Apr 2, 2024
b5648d7
add user guide page stubs
Apr 2, 2024
34f6b1e
remove glossary link from welcome
Apr 16, 2024
60ab0ac
Move license to welcome and remove level of support sections.
Apr 16, 2024
bf82ad7
move getting started to quick_start.md
Apr 16, 2024
5d5c9e6
update headings
Apr 16, 2024
774afce
Update headings and fix text
Apr 16, 2024
0431c6c
Fix block mapping
Apr 16, 2024
b17dd83
Update contributing doc from issues surfaced in user testing
mfisher87 Apr 19, 2024
6fcaacd
Clarify minimally reproducible example
mfisher87 Apr 19, 2024
8de415d
Move contributing details into documentation
mfisher87 Apr 19, 2024
f550c35
Add nav link to contributing doc
mfisher87 Apr 19, 2024
e330ee1
Merge branch 'main' into update-documentation
mfisher87 Apr 19, 2024
50bc434
Add content about hack days to our docs
mfisher87 Apr 19, 2024
88951d1
Fix up broken callout and add title
mfisher87 Apr 10, 2024
07c6c83
Remove accidentally-re-added config
mfisher87 Apr 19, 2024
d593b81
Merge remote-tracking branch 'upstream/contributing-updates-from-user…
mfisher87 Apr 19, 2024
b1417ca
Move meet-up page under contributing
mfisher87 Apr 19, 2024
e75f9c8
Split out development environment and releasing docs
mfisher87 Apr 19, 2024
436fced
Enable navigation index pages
mfisher87 Apr 19, 2024
9a9d962
Un-nest pages from "Welcome" section
mfisher87 Apr 19, 2024
9976a1a
Split the README into a GitHub README and doc page
mfisher87 Apr 19, 2024
2a66a15
Enable mkdocs strict mode to catch broken links
mfisher87 Apr 20, 2024
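
The last commit above turns on MkDocs strict mode so that warnings such as broken links fail the docs build. As a rough illustration of the kind of problem this catches, here is a minimal sketch of a relative-link check over a folder of Markdown pages. It is not MkDocs' own implementation; the function name `broken_relative_links` and the simplified link pattern are assumptions made for this example.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Simplified pattern for inline Markdown links: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Anything starting with a URL scheme (https:, mailto:, ...) is treated as external.
SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(docs_dir: Path) -> List[Tuple[str, str]]:
    """Return (page, target) pairs for relative links whose target file is missing."""
    broken = []
    for page in sorted(docs_dir.rglob("*.md")):
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            if SCHEME.match(target) or target.startswith("#"):
                continue  # skip external URLs and same-page anchors
            path = target.split("#", 1)[0]  # drop any anchor suffix
            if not (page.parent / path).exists():
                broken.append((page.relative_to(docs_dir).as_posix(), target))
    return broken
```

A CI step could call this on the `docs` folder and exit with a non-zero status when the returned list is not empty, which mirrors the "fail the build" behavior this pull request is after.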
166 changes: 4 additions & 162 deletions CONTRIBUTING.md
@@ -1,165 +1,7 @@
# Contributing

When contributing to this repository, please first discuss the change you wish to make via issue,
email, or any other method with the owners of this repository before making a change.
_earthaccess_ is community-owned and welcomes all forms of contributions from the public.

Please note that we have a [code of conduct](./CODE_OF_CONDUCT.md). Please follow it in all of your interactions with the project.

## Development environment


`earthaccess` is a Python library that uses Poetry to build and publish the package to PyPI, the de facto Python package repository. To develop new features or patch bugs, we need to set up a virtual environment and install the library locally. We can accomplish this with either Poetry or Conda.

### Using Conda

If we have `mamba` (or `conda`) installed, we can use the environment file included in the `ci` folder. This will install all the libraries we need (including Poetry) to start developing `earthaccess`:

```bash
mamba env update -f ci/environment-dev.yml
mamba activate earthaccess-dev
poetry install
```

After activating our environment and installing the library with Poetry, we can run JupyterLab and start testing the local distribution, or we can use `make` to run the tests and lint the code.
Now we can create a feature branch and push those changes to our fork!

### Using Poetry

If we want to use Poetry, first we need to [install it](https://python-poetry.org/docs/#installation). After installing Poetry, we can use the same workflow we used for Conda. First, we install the library locally:

```bash
poetry install
```

Now we can run JupyterLab and our scripts using Poetry:

```bash
poetry run jupyter lab
```

### Managing Dependencies

If you need to add a dependency, you should do the following:

- Run `poetry add <package>` for a required (non-development) dependency
- Run `poetry add --group=dev <package>` for a development dependency, such
as a testing or code analysis dependency

Both commands will add an entry to `pyproject.toml` with a version that is
compatible with the rest of the dependencies. However, `poetry` pins versions
with a caret (`^`), which is not what we want. Therefore, you must locate the
new entry in `pyproject.toml` and change the `^` to `>=`. (See
[poetry-relax](https://github.com/zanieb/poetry-relax) for the reasoning behind
this.)

In addition, you must also add a corresponding entry to
`ci/environment-mindeps.yaml`. You'll notice in that file that required
dependencies should be pinned exactly to the versions specified in
`pyproject.toml` (after changing `^` to `>=` there), and that development
dependencies should be left unpinned.

Finally, for _development dependencies only_, you must add an entry to
`ci/environment-dev.yaml` with the same version constraint as in
`pyproject.toml`.
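
The caret-to-`>=` edit described above can be sketched in a few lines of Python. This is only an illustration of the text substitution, not a tool the project ships; the helper name `relax_caret` and the `requests` entry are hypothetical.

```python
import re

# Matches a caret at the start of a quoted version constraint, e.g. "^2.26".
CARET = re.compile(r"([\"'])\^")


def relax_caret(line: str) -> str:
    """Replace a leading caret in a quoted version constraint with '>='."""
    return CARET.sub(r"\1>=", line)


# Hypothetical pyproject.toml entry as written by `poetry add`:
print(relax_caret('requests = "^2.26"'))  # prints: requests = ">=2.26"
```

Lines that already use `>=` pass through unchanged, so the function is safe to apply to every dependency line.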

## First Steps to fix an issue or bug

- Read the documentation (we are working on adding more)
- Create a minimal reproducible example of the issue
- Try to edit the relevant code and see if it fixes the issue
- Submit the fix as a pull request
- Include an explanation of what you did and why

## First steps to contribute new features

- Create an issue to discuss the feature's scope and its fit for this package
- Run pytest to ensure your local version of the code passes all unit tests
- Try to edit the relevant code and implement your new feature in a backwards-compatible manner
- Create new tests as you go, and run the test suite as you go
- Update the documentation as you go

### Please format and lint as you go

```bash
make format lint
```

We attempt to provide comprehensive type annotations within this repository. If
you do not provide fully annotated functions or methods, the `lint` command will
fail. Over time, we plan to increase type-checking strictness in order to
ensure more precise, beneficial type annotations.

We have included type stubs for the untyped `python-cmr` library, which we
intend to eventually upstream. Since `python-cmr` exposes the `cmr` package,
the stubs appear under `stubs/cmr`.

### Requirements to merge code (Pull Request Process)

- you must include test coverage
- you must update the documentation
- you must run the command above to format and lint

## Pull Request process

1. Ensure you include test coverage for all changes
1. Ensure your code is formatted properly following this document
1. Update the documentation and the `README.md` with details of changes to the
interface, this includes new environment variables, function names,
decorators, etc.
1. Update `CHANGELOG.md` with details about your change in a section titled
`Unreleased`. If one does not exist, please create one.
1. You may merge the Pull Request once you have the sign-off of another
developer, or if you do not have permission to do that, you may request the
reviewer to merge it for you.

## Release process

> :memo: The versioning scheme we use is [SemVer](http://semver.org/). Note that until
> we agree we're ready for v1.0.0, we will not increment the major version.

1. Ensure all desired features are merged to `main` branch and `CHANGELOG.md` is updated.
1. Use `bump-my-version` to increase the version number in all needed places, e.g. to
increase the minor version (`1.2.3` to `1.3.0`):

```plain
bump-my-version bump minor
```

1. Push a tag on the new commit containing the version number, prefixed with `v`, e.g.
`v1.3.0`.
1. [Create a new GitHub Release](https://github.com/nsidc/earthaccess/releases/new). We
hand-curate our release notes to be valuable to humans. Please do not auto-generate
release notes and aim for consistency with the GitHub Release descriptions from other
releases.
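
To make the version arithmetic in the steps above concrete, here is a minimal sketch of a SemVer minor bump and the matching tag name. It does not reflect how `bump-my-version` is implemented; `bump_minor` and `tag_for` are hypothetical helpers, and the sketch assumes a plain `MAJOR.MINOR.PATCH` string.

```python
def bump_minor(version: str) -> str:
    """Increase the minor component and reset the patch component to zero."""
    major, minor, _patch = (int(part) for part in version.split("."))
    return f"{major}.{minor + 1}.0"


def tag_for(version: str) -> str:
    """Build the git tag name: the version number prefixed with 'v'."""
    return f"v{version}"


new_version = bump_minor("1.2.3")
print(new_version)           # prints: 1.3.0
print(tag_for(new_version))  # prints: v1.3.0
```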

> :gear: After the GitHub release is published, multiple automations will trigger:
>
> - Zenodo will create a new DOI.
> - GitHub Actions will publish a PyPI release.

> :memo: `earthaccess` is published to conda-forge through the
> [earthdata-feedstock](https://github.com/conda-forge/earthdata-feedstock), as this
> project was renamed early in its life. The conda package is named `earthaccess`.

## Steps to make changes to documentation

1. Fork [earthaccess](https://github.com/nsidc/earthaccess) in the GitHub user interface to create your own copy. Later on, you may need to sync your fork with the upstream original repository. This can also be done in the GitHub UI or command line. If you get stuck, the emergency escape hatch is to take a fresh fork again! :)
2. Clone the repo: `git clone git@github.com:<yourusername>/earthaccess.git`
3. Change the directory: `cd earthaccess/binder`
4. Create conda environment: `conda env create -f environment-dev.yml`. If you see a warning that the environment already exists, do `conda env remove -n earthaccess-dev`
5. Activate conda: `conda activate earthaccess-dev`
6. Change to the base project directory. `cd ..`
7. Install packages: `pip install --editable .`
8. Run mkdocs script: `./scripts/docs-live.sh`
10. In your browser, go to: `http://0.0.0.0:8008`
11. You can now change any pages in the `docs` folder in your text editor, which will instantly reflect in the browser.
12. Commit the changes and push them to the forked repository:
```bash
git status # check git status to see what changed
git switch -c "test" # create a new branch
git add . # add changes
git commit -m "add commit messages" # commit changes
git push -u origin test # push changes
```
13. Open a pull request (PR) in the GitHub user interface from your fork to the original `nsidc/earthaccess` repo. When you ran `git push` in a previous step, it provided a convenient link to open that PR directly.
14. In the PR interface, you can view the progress of the GitHub Actions workflows specific to the PR at the bottom of the page.
Please view
[our documentation's "contributing" page](https://earthaccess.readthedocs.io/en/latest/contributing)
to learn more!
154 changes: 11 additions & 143 deletions README.md
@@ -1,3 +1,5 @@
# _earthaccess_

<p align="center">
<img alt="earthaccess, a python library to search, download or stream NASA Earth science data with just a few lines of code" src="https://user-images.githubusercontent.com/717735/205517116-7a5d0f41-7acc-441e-94ba-2e541bfb7fc8.png" width="70%" align="center" />
</p>
@@ -30,166 +32,32 @@

</p>

## **Overview**

*earthaccess* is a **python library to search, download or stream NASA Earth science data** with just a few lines of code.


In the age of cloud computing, the power of open science only reaches its full potential if we have easy-to-use workflows that facilitate research in an inclusive, efficient and reproducible way. Unfortunately —as it stands today— scientists and students alike face a steep learning curve adapting to systems that have grown too complex and end up spending more time on the technicalities of the tools, cloud and NASA APIs than focusing on their important science.

During several workshops organized by [NASA Openscapes](https://nasa-openscapes.github.io/events.html), the need to provide easy-to-use tools to our users became evident. Open science is a collaborative effort; it involves people from different technical backgrounds, and the data analysis to solve the pressing problems we face cannot be limited by the complexity of the underlying systems. Therefore, providing easy access to NASA Earthdata regardless of the data storage location (hosted within or outside of the cloud) is the main motivation behind this Python library.

## **Installing earthaccess**

You will need Python 3.8 or higher installed.

Install the latest release using conda

```bash
conda install -c conda-forge earthaccess
```
`earthaccess` is a python library to **search for**, and **download** or **stream** NASA Earth science data with just a few lines of code.

Using Pip

```bash
pip install earthaccess
```
Visit [our documentation](https://earthaccess.readthedocs.io/en/latest) to learn more!

Try it in your browser without installing anything! [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/nsidc/earthaccess/main)


## **Usage**


With *earthaccess* we can log in, search, and download data with a few lines of code and, more importantly, our code will work the same way whether we run it in the cloud or on our laptop. ***earthaccess*** handles authentication with [NASA's Earthdata Login (EDL)](https://urs.earthdata.nasa.gov), search using NASA's [CMR](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html) and access through [`fsspec`](https://github.com/fsspec/filesystem_spec).

The only requirement to use this library is to open a free account with NASA [EDL](https://urs.earthdata.nasa.gov).


### **Authentication**

By default, `earthaccess` will automatically look for your EDL account credentials in two locations:

1. A `~/.netrc` file
2. `EARTHDATA_USERNAME` and `EARTHDATA_PASSWORD` environment variables

If neither of these options is configured, you can authenticate by calling the `earthaccess.login()` function
and manually entering your EDL account credentials.

```python
import earthaccess

earthaccess.login()
```

Note you can pass `persist=True` to `earthaccess.login()` to have the EDL account credentials you enter
automatically saved to a `~/.netrc` file for future use.
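
The credential lookup described above can be pictured with the standard library alone. The following is a minimal sketch, not the actual `earthaccess` implementation: `find_credentials` is a hypothetical helper, checking the two locations in the order listed is an assumption, and so is using `urs.earthdata.nasa.gov` as the `.netrc` machine name.

```python
import netrc
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple

EDL_HOST = "urs.earthdata.nasa.gov"  # assumed .netrc machine name for EDL


def find_credentials(
    netrc_path: Path, env: Mapping[str, str] = os.environ
) -> Optional[Tuple[str, str]]:
    """Return (username, password) from a netrc file or the environment, else None."""
    try:
        auth = netrc.netrc(str(netrc_path)).authenticators(EDL_HOST)
    except (FileNotFoundError, netrc.NetrcParseError):
        auth = None  # no usable netrc file; fall through to the environment
    if auth is not None:
        login, _account, password = auth
        return login, password
    username = env.get("EARTHDATA_USERNAME")
    password = env.get("EARTHDATA_PASSWORD")
    if username and password:
        return username, password
    return None  # caller would need to prompt for credentials interactively
```

Calling `find_credentials(Path.home() / ".netrc")` returns `None` when neither location is configured, which corresponds to the case where `earthaccess.login()` asks for credentials manually.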


Once you are authenticated with NASA EDL you can:

* Get a file from a DAAC using a `fsspec` session.
* Request temporary S3 credentials from a particular DAAC (needed to download or stream data from an S3 bucket in the cloud).
* Use the library to download or stream data directly from S3.
* Regenerate CMR tokens (used for restricted datasets)


### **Searching for data**

Once we have selected our dataset we can search for the data granules using *doi*, *short_name* or *concept_id*.
If we are not sure how to search for a particular dataset, we can start with the ["Introducing NASA earthaccess"](https://nsidc.github.io/earthaccess/tutorials/demo/#querying-for-datasets) tutorial or the [NASA Earthdata Search portal](https://search.earthdata.nasa.gov/). For a complete list of the search parameters we can use, visit the extended [API documentation](https://earthaccess.readthedocs.io/en/latest/user-reference/api/api/).

```python
results = earthaccess.search_data(
    short_name='SEA_SURFACE_HEIGHT_ALT_GRIDS_L4_2SATS_5DAY_6THDEG_V_JPL2205',
    cloud_hosted=True,
    bounding_box=(-10, 20, 10, 50),
    temporal=("1999-02", "2019-03"),
    count=10
)
```
## How to Get Started with `earthaccess`

Now that we have our results we can do multiple things: We can iterate over them to get HTTP (or S3) links, we can download the files to a local folder, or we can open these files and stream their content directly to other libraries e.g. xarray.
Visit [our quick start guide](https://earthaccess.readthedocs.io/en/latest/quick-start.html) to learn how to install and see a simple example of using `earthaccess`.

### **Accessing the data**

**Option 1: Using the data links**

If we already have a workflow in place for downloading our data, we can use *earthaccess* as a search-only library and get HTTP links from our query results. This could be the case if our current workflow uses a different language and we only need the links as input.

```python

# if the data set is cloud hosted there will be S3 links available. The access parameter accepts "direct" or "external", direct access is only possible if you are in the us-west-2 region in the cloud.
data_links = [granule.data_links(access="direct") for granule in results]

# or if the data is an on-prem dataset
data_links = [granule.data_links(access="external") for granule in results]

```

> Note: *earthaccess* can get S3 credentials for us, or authenticated HTTP sessions in case we want to use them with a different library.

**Option 2: Download data to a local folder**

This option is practical if you have the necessary space available on disk. The *earthaccess* library will print out the approximate size of the download and its progress.
```python
files = earthaccess.download(results, "./local_folder")

```

**Option 3: Direct S3 Access - Stream data directly to xarray**

This method works best if you are in the same Amazon Web Services (AWS) region as the data (us-west-2) and you are working with gridded datasets (processing level 3 and above).

```python
import xarray as xr

files = earthaccess.open(results)

ds = xr.open_mfdataset(files)

```

And that's it! Just one line of code, and this same piece of code will also work for data that are not hosted in the cloud, i.e. located at NASA storage centers.


> More examples coming soon!


### Compatibility
## Compatibility

Only **Python 3.8+** is supported.


## How to Contribute to `earthaccess`

If you want to find out how to contribute to `earthaccess`, check out the [Contributing Guide](https://earthaccess.readthedocs.io/en/latest/contributing/).

## Contributors

[![Contributors](https://contrib.rocks/image?repo=nsidc/earthaccess)](https://github.com/nsidc/earthaccess/graphs/contributors)

## Contributing Guide
### Contributors

Welcome! 😊👋
[![Contributors](https://contrib.rocks/image?repo=nsidc/earthaccess)](https://github.com/nsidc/earthaccess/graphs/contributors)

> Please see the [Contributing Guide](CONTRIBUTING.md).

### [Project Board](https://github.com/nsidc/earthdata/discussions).

### Glossary

<a href="https://www.earthdata.nasa.gov/learn/glossary">NASA Earth Science Glossary</a>

## License

earthaccess is licensed under the MIT license. See [LICENSE](LICENSE.txt).

## Level of Support

<div><img src="https://raw.githubusercontent.com/nsidc/earthdata/main/docs/nsidc-logo.png" width="84px" align="left" text-align="middle"/>
<br>
This repository is supported by a joint effort of NSIDC, NASA DAACs, and the Earth science community, and we welcome any contribution in the form of issue submissions, pull requests, or discussions. Issues labeled as https://github.com/nsidc/earthaccess/labels/good%20first%20issue are a great place to get started.
</div>

1 change: 0 additions & 1 deletion docs/CONTRIBUTING.md

This file was deleted.
