Merge pull request #339 from ICESat2-SlideRule/precommit
Precommit spell check
jpswinski authored Oct 3, 2023
2 parents eea1977 + 101f684 commit c8b39b8
Showing 44 changed files with 97 additions and 724 deletions.
15 changes: 15 additions & 0 deletions .github/workflows/pre-commit.yml
@@ -0,0 +1,15 @@
name: Linting and formatting (pre-commit)

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
      - uses: pre-commit/[email protected]
13 changes: 13 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,13 @@
repos:
  # Fix common spelling mistakes
  - repo: https://github.com/codespell-project/codespell
    rev: v2.2.1
    hooks:
      - id: codespell
        args: [
          '--ignore-words-list', 'parm,parms',
          # '--ignore-regex', '\bhist\b',
          '--'
        ]
        types_or: [python, rst, markdown]
        files: ^(clients|docs|packages|platforms|plugins|scripts|targets)/
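Note on the `files` filter in this config: pre-commit treats `files` as a Python regular expression searched against each repository-relative path. A minimal sketch of which paths the pattern would select follows. The helper name `is_spell_checked` and the sample path `README.md` are illustrative assumptions, and the separate `types_or` filter is not modelled here.

```python
import re

# Same pattern as the `files` key in .pre-commit-config.yaml.
FILES_PATTERN = re.compile(
    r"^(clients|docs|packages|platforms|plugins|scripts|targets)/"
)


def is_spell_checked(path: str) -> bool:
    """Return True if the repo-relative path passes the `files` filter.

    Hypothetical helper for illustration; pre-commit applies the
    pattern internally and also applies the `types_or` filter.
    """
    return FILES_PATTERN.search(path) is not None


sample_paths = [
    "clients/python/sliderule/earthdata.py",
    "docs/README.md",
    ".github/workflows/pre-commit.yml",
    "README.md",  # assumed top-level file, outside the listed directories
]

for path in sample_paths:
    print(path, is_spell_checked(path))
```

Because the pattern is anchored with `^` and ends in `/`, only files under the listed top-level directories are candidates for the spell check.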
2 changes: 1 addition & 1 deletion clients/python/sliderule/earthdata.py
@@ -701,7 +701,7 @@ def tnm(short_name, polygon=None, time_start=None, time_end=datetime.utcnow().st
>>> geojson
{'type': 'FeatureCollection', 'features': [{'type': 'Feature', 'id': '5eaa4a0582cefae35a21ee8c', 'geometry': {'type': 'Polygon'...
'''
- # Flatten polygon (the list must be formated as 'x y, x y, x y, x y, x y', the documentation is incorrect)
+ # Flatten polygon (the list must be formatted as 'x y, x y, x y, x y, x y', the documentation is incorrect)
coord_list = []
for coord in polygon:
coord_list.append('{} {}'.format(coord["lon"], coord["lat"]))
18 changes: 0 additions & 18 deletions clients/python/sliderule/icesat2.py
@@ -306,8 +306,6 @@ def atl06 (parm, resource):
parameters used to configure ATL06-SR algorithm processing (see `Parameters </web/rtds/user_guide/ICESat-2.html#parameters>`_)
resource: str
ATL03 HDF5 filename
- asset: str
- data source asset (see `Assets </web/rtd/user_guide/ICESat-2.html#assets>`_)
Returns
-------
@@ -336,10 +334,6 @@ def atl06p(parm, callbacks={}, resources=None, keep_id=False, as_numpy_array=Fal
----------
parms: dict
parameters used to configure ATL06-SR algorithm processing (see `Parameters </web/rtd/user_guide/ICESat-2.html#parameters>`_)
- asset: str
- data source asset (see `Assets </web/rtd/user_guide/ICESat-2.html#assets>`_)
- version: str
- the version of the ATL03 data to use for processing
callbacks: dictionary
a callback function that is called for each result record
resources: list
@@ -414,8 +408,6 @@ def atl03s (parm, resource):
parameters used to configure ATL03 subsetting (see `Parameters </web/rtd/user_guide/ICESat-2.html#parameters>`_)
resource: str
ATL03 HDF5 filename
- asset: str
- data source asset (see `Assets </web/rtd/user_guide/ICESat-2.html#assets>`_)
Returns
-------
@@ -443,10 +435,6 @@ def atl03sp(parm, callbacks={}, resources=None, keep_id=False, height_key=None):
----------
parms: dict
parameters used to configure ATL03 subsetting (see `Parameters </web/rtd/user_guide/ICESat-2.html#parameters>`_)
- asset: str
- data source asset (see `Assets </web/rtd/user_guide/ICESat-2.html#assets>`_)
- version: str
- the version of the ATL03 data to return
callbacks: dictionary
a callback function that is called for each result record
resources: list
@@ -584,8 +572,6 @@ def atl08 (parm, resource):
parameters used to configure ATL06-SR algorithm processing (see `Parameters </web/rtds/user_guide/ICESat-2.html#parameters>`_)
resource: str
ATL03 HDF5 filename
- asset: str
- data source asset (see `Assets </web/rtd/user_guide/ICESat-2.html#assets>`_)
Returns
-------
@@ -614,10 +600,6 @@ def atl08p(parm, callbacks={}, resources=None, keep_id=False, as_numpy_array=Fal
----------
parms: dict
parameters used to configure ATL06-SR algorithm processing (see `Parameters </web/rtd/user_guide/ICESat-2.html#parameters>`_)
- asset: str
- data source asset (see `Assets </web/rtd/user_guide/ICESat-2.html#assets>`_)
- version: str
- the version of the ATL03 data to use for processing
callbacks: dictionary
a callback function that is called for each result record
resources: list
18 changes: 6 additions & 12 deletions clients/python/sliderule/ipxapi.py
@@ -44,7 +44,7 @@
#
# ICEPYX ATL06
#
- def atl06p(ipx_region, parm, asset=icesat2.DEFAULT_ASSET):
+ def atl06p(ipx_region, parm):
"""
Performs ATL06-SR processing in parallel on ATL03 data and returns geolocated elevations. The list of granules to be processed is identified by the ipx_region object.
@@ -56,16 +56,13 @@ def atl06p(ipx_region, parm, asset=icesat2.DEFAULT_ASSET):
icepyx region object defining the query of granules to be processed
parm: dict
parameters used to configure ATL06-SR algorithm processing (see `Parameters <../user_guide/ICESat-2.html#parameters>`_)
- asset: str
- data source asset (see `Assets <../user_guide/ICESat-2.html#assets>`_)
Returns
-------
GeoDataFrame
geolocated elevations (see `Elevations <../user_guide/ICESat-2.html#elevations>`_)
"""
try:
- version = ipx_region.product_version
resources = ipx_region.avail_granules(ids=True)[0]
except:
logger.critical("must supply an icepyx query as region")
@@ -74,12 +71,12 @@ def atl06p(ipx_region, parm, asset=icesat2.DEFAULT_ASSET):
if ipx_region.extent_type in ('bbox','polygon'):
parm.update({'poly': to_region(ipx_region)})

- return icesat2.atl06p(parm, asset, version=version, resources=resources)
+ return icesat2.atl06p(parm, resources=resources)

#
# ICEPYX ATL03
#
- def atl03sp(ipx_region, parm, asset=icesat2.DEFAULT_ASSET):
+ def atl03sp(ipx_region, parm):
"""
Performs ATL03 subsetting in parallel on ATL03 data and returns photon segment data.
@@ -89,18 +86,15 @@ def atl03sp(ipx_region, parm, asset=icesat2.DEFAULT_ASSET):
----------
ipx_region: Query
icepyx region object defining the query of granules to be processed
- parms: dict
+ parm: dict
parameters used to configure ATL03 subsetting (see `Parameters <../user_guide/ICESat-2.html#parameters>`_)
- asset: str
- data source asset (see `Assets <../user_guide/ICESat-2.html#assets>`_)
Returns
-------
list
ATL03 segments (see `Photon Segments <../user_guide/ICESat-2.html#segmented-photon-data>`_)
"""
try:
- version = ipx_region.product_version
resources = ipx_region.avail_granules(ids=True)[0]
except:
logger.critical("must supply an icepyx query as region")
@@ -109,7 +103,7 @@ def atl03sp(ipx_region, parm, asset=icesat2.DEFAULT_ASSET):
if ipx_region.extent_type in ('bbox','polygon'):
parm.update({'poly': to_region(ipx_region)})

- return icesat2.atl03sp(parm, asset, version=version, resources=resources)
+ return icesat2.atl03sp(parm, resources=resources)

def to_region(ipx_region):
"""
@@ -123,7 +117,7 @@ def to_region(ipx_region):
Returns
-------
list
- polygon definining region of interest (can be passed into `icesat2` api functions)
+ polygon defining region of interest (can be passed into `icesat2` api functions)
"""
if (ipx_region.extent_type == 'bbox'):
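Note on the `ipxapi.py` call changes: the wrappers no longer pass a positional `asset` or a `version` keyword into `icesat2`. A rough sketch of why the old call shape cannot simply be kept is shown below. It uses a hypothetical stand-in, `atl06p_stub`, that mirrors only the leading parameters visible in the hunk header; it is not the real `icesat2.atl06p`, and the parameter values are made up for illustration.

```python
def atl06p_stub(parm, callbacks=None, resources=None, keep_id=False):
    """Hypothetical stand-in with the leading parameters of the new signature."""
    return {"parm": parm, "callbacks": callbacks, "resources": resources}


parm = {"cnf": 2}              # illustrative request parameters
resources = ["granule_a.h5"]   # illustrative granule list

# New call shape, as used by the updated ipxapi wrappers.
new_result = atl06p_stub(parm, resources=resources)

# Old call shape: `asset` was positional and `version` was a keyword.
# With the parameters removed, the positional value would land in
# `callbacks` and the unknown `version` keyword raises TypeError.
try:
    atl06p_stub(parm, "some-asset", version="005", resources=resources)
    old_call_error = None
except TypeError as exc:
    old_call_error = str(exc)

print(new_result["resources"])
print(old_call_error)
```

Under these assumptions, callers that still pass `asset` or `version` would need to drop those arguments when moving to this commit.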
5 changes: 3 additions & 2 deletions clients/python/sliderule/sliderule.py
@@ -233,7 +233,7 @@ def __decode_native(rectype, rawdata):
if "PTR" in flags:
continue

- # get endianess
+ # get endianness
if "LE" in flags:
endian = '<'
else:
@@ -931,7 +931,8 @@ def update_available_servers (desired_nodes=None, time_to_live=None):
try:
rsps = source("status", parm={"service":"sliderule"}, path="/discovery", silence=True)
available_servers = rsps["nodes"]
- except FatalError:
+ except FatalError as e:
+ logger.debug("Failed to retrieve number of nodes registered: {}".format(e))
available_servers = 0

return available_servers, requested_nodes
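Note on the `except FatalError as e` change: binding the exception lets the failure reason be logged at debug level while the function still falls back to zero servers. A self-contained sketch of that pattern follows. The `FatalError` class, the `count_available_servers` helper, and the two query functions are hypothetical stand-ins, not the client's actual code.

```python
import logging

logger = logging.getLogger("sliderule_sketch")


class FatalError(RuntimeError):
    """Hypothetical stand-in for the client's FatalError exception."""


def count_available_servers(query_status):
    """Return the node count, or 0 if the status query fails.

    `query_status` is an assumed callable that returns a dict with a
    "nodes" entry or raises FatalError.
    """
    try:
        rsps = query_status()
        return rsps["nodes"]
    except FatalError as e:
        # Record why the lookup failed instead of silently discarding it.
        logger.debug("Failed to retrieve number of nodes registered: {}".format(e))
        return 0


def failing_query():
    raise FatalError("discovery service unreachable")


def working_query():
    return {"nodes": 7}


print(count_available_servers(failing_query))
print(count_available_servers(working_query))
```

The return value is the same as before the change; only the diagnostic trail differs, and it appears only when debug logging is enabled.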
2 changes: 1 addition & 1 deletion clients/python/utils/benchmark.py
@@ -251,7 +251,7 @@ def atl03_rasterized_subset():
return icesat2.atl03sp(parms, resources=[args.granule03])

# ------------------------------------
- # Benchmark ATL03 Polgon Subset
+ # Benchmark ATL03 Polygon Subset
# ------------------------------------
def atl03_polygon_subset():
parms = {
6 changes: 3 additions & 3 deletions docs/README.md
@@ -32,7 +32,7 @@ The SlideRule **website** can be built and hosted locally for development purpos
$ gem install jekyll bundler
```

- Then go into the `jekyll` directory and install all depedencies.
+ Then go into the `jekyll` directory and install all dependencies.

```bash
$ bundle install
@@ -58,7 +58,7 @@ The SlideRule **website** can be built and hosted locally for development purpos
```

Note: docutils version 0.17.x breaks certain formatting in Sphinx (e.g. lists). Therefore it is recommended that docutils version 0.16 be installed.
- 2. chruby using [Homebrew](https://brew.sh/) (install if neccessary)
+ 2. chruby using [Homebrew](https://brew.sh/) (install if necessary)

```bash
$ brew install chruby ruby-install
@@ -83,7 +83,7 @@ The SlideRule **website** can be built and hosted locally for development purpos
$ gem install jekyll
```

- Then go into the `jekyll` directory and install all depedencies.
+ Then go into the `jekyll` directory and install all dependencies.

```bash
$ bundle install
10 changes: 5 additions & 5 deletions docs/rtd/source/archive/gdal_vrt_benchmark.md
@@ -8,7 +8,7 @@ Test reads elevation value from ArcticDem. POI is lon: -74.60 lat: 82.86
Method used vrt file created from mosaic rasters, version 3.0, 2m, hosted on AWS.
The mosaic.vrt file is stored locally on aws dev server at /data/ArcticDem/mosaic.vrt

- The actuall raster containing the elevation for POI is:
+ The actual raster containing the elevation for POI is:
/vsis3/pgc-opendata-dems/arcticdem/mosaics/v3.0/2m/34_37/34_37_1_1_2m_v3.0_reg_dem.tif

The elevation can be read from the aws file directly:
@@ -19,13 +19,13 @@ gdallocationinfo -wgs84 /data/ArcticDem/mosaic.vrt -valonly -74.6 82.86


The test reads the same POI one million times. Gdal library should access the proper raster tile from S3 bucket and use it locally
- for all remining elevation reads.
+ for all remaining elevation reads.

In the first implementation the mosaic.vrt is opened, the vrtdataset and vrtband are used to do a direct read via:
vrtband->RasterIO(GF_Read, col, row, 1, 1, &elevation, 1, 1, GDT_Float32, 0, 0, 0);
col and row are calculated for the mosaic.vrt

- This aproach works correctly but performance is very poor. It took almost 170 seconds to do one million reads. Each 100k reads takes almost 17 seconds.
+ This approach works correctly but performance is very poor. It took almost 170 seconds to do one million reads. Each 100k reads takes almost 17 seconds.

Points read: 100000 16988.167999778
Points read: 200000 16991.582000162
@@ -41,7 +41,7 @@ Points read: 1000000 16998.240999877
1000000 points read time: 169940.77899982 msecs


- In the second aproach the vrtdataset and vrtband are used ONLY to find the name of the raster containing POI.
+ In the second approach the vrtdataset and vrtband are used ONLY to find the name of the raster containing POI.
The code is very similar to how gdallocationinfo tool is implemented using xml parsing in GDAL.
The raster tif file is than opened and this raster's dataset and band are used to do a read using col, row calculated for that
raster. This method is much more efficient. It only took 665 msecs to do one million reads.
@@ -60,7 +60,7 @@ Points read: 1000000 66.580000100657
1000000 points read time: 665.83800013177 msecs


- The test was also reapeated with POI beigng different each time read is executed but always within the same raster file. This caused
+ The test was also repeated with POI beigng different each time read is executed but always within the same raster file. This caused
more S3 transactions for GDAL to read many tiles. The performance degraded where the second 'fast' method took almost 2 seconds to read a million POIs which the first method took almost the same amount of time.


2 changes: 1 addition & 1 deletion docs/rtd/source/getting_started/Install.rst
@@ -25,7 +25,7 @@ To install and setup JupyterLab to run the provided example notebooks, you must
conda install -c conda-forge jupyterlab
Then make sure the conda environment with the `sliderule python` client installed in it is available to use as one of the Python kernels.
- To gaurantee that JuypterLab is using the correct Python kernel, you can start JupyterLab from the conda environment with `sliderule python` installed.
+ To guarantee that JuypterLab is using the correct Python kernel, you can start JupyterLab from the conda environment with `sliderule python` installed.

.. code-block:: bash
2 changes: 1 addition & 1 deletion docs/rtd/source/getting_started/SlideRule.rst
@@ -72,4 +72,4 @@ SlideRule is openly developed on GitHub at https://github.com/ICESat2-SlideRule.
Project Information
-------------------

- The SlideRule project is funded by NASA's ICESat-2 program and is led by the University of Washington in collaboration with NASA Goddard Space Flight Center. The first public release of SlideRule ocurred in April 2021. Since then we've continued to add new services, new algorithms, and new datasets, while also making improvements to our processing architecture. Looking to the future, we hope to make SlideRule an indespensible component in the analysis of a broad array of Earth Science datasets that help us better understand the planet we call home.
+ The SlideRule project is funded by NASA's ICESat-2 program and is led by the University of Washington in collaboration with NASA Goddard Space Flight Center. The first public release of SlideRule occurred in April 2021. Since then we've continued to add new services, new algorithms, and new datasets, while also making improvements to our processing architecture. Looking to the future, we hope to make SlideRule an indispensable component in the analysis of a broad array of Earth Science datasets that help us better understand the planet we call home.
3 changes: 1 addition & 2 deletions docs/rtd/source/index.rst
@@ -72,7 +72,7 @@ SlideRule is openly developed on GitHub at https://github.com/ICESat2-SlideRule.
Project Information
-------------------

- The SlideRule project is funded by NASA's ICESat-2 program and is led by the University of Washington in collaboration with NASA Goddard Space Flight Center. The first public release of SlideRule ocurred in April 2021. Since then we've continued to add new services, new algorithms, and new datasets, while also making improvements to our processing architecture. Looking to the future, we hope to make SlideRule an indespensible component in the analysis of a broad array of Earth Science datasets that help us better understand the planet we call home.
+ The SlideRule project is funded by NASA's ICESat-2 program and is led by the University of Washington in collaboration with NASA Goddard Space Flight Center. The first public release of SlideRule occurred in April 2021. Since then we've continued to add new services, new algorithms, and new datasets, while also making improvements to our processing architecture. Looking to the future, we hope to make SlideRule an indispensable component in the analysis of a broad array of Earth Science datasets that help us better understand the planet we call home.



@@ -96,7 +96,6 @@ The SlideRule project is funded by NASA's ICESat-2 program and is led by the Uni
user_guide/SlideRule.rst
user_guide/ICESat-2.rst
user_guide/GEDI.rst
- user_guide/prov-sys.rst
user_guide/H5Coro.md
user_guide/GeoParquet.md
user_guide/GeoRaster.md
2 changes: 1 addition & 1 deletion docs/rtd/source/release_notes/release-v1-0-6.md
@@ -23,7 +23,7 @@ Version description of the v1.0.6 release of ICESat-2 SlideRule

## Known Issues

- 1. Consul-Exit is not working and therefore a node does not dissappear from the service group when it goes down.
+ 1. Consul-Exit is not working and therefore a node does not disappear from the service group when it goes down.

## Getting this release

4 changes: 2 additions & 2 deletions docs/rtd/source/release_notes/release-v1-4-0.md
@@ -25,7 +25,7 @@ $ python3 setup.py install

* User scripts that use the Python client need to make the following updates:
- The `track` keyword argument of **atl03sp**, **atl03s**, **atl06p**, and **atl06** has moved to the `parm` dictionary
- - The `block` keywork argument of **atl06p** and **atl03sp** has been removed
+ - The `block` keyword argument of **atl06p** and **atl03sp** has been removed

* User scripts that use the Python client should make the following updates due to deprecated functionality:
- The object returned from the **icesat2.toregion** function is now a dictionary instead of a list; the polygon should be accessed via `["poly"]` instead of with a numerical index. If the region of interest contains multiple polygons, the convex hull of those polygons is returned. For compatibility, for this version only, the returned polygon is also accessible at the `[0]` index.
@@ -59,7 +59,7 @@ $ python3 setup.py install

- Added `rgt`, `cycle`, and `region` request parameters [sliderule-python#27](https://github.com/ICESat2-SlideRule/sliderule-python/issues/27)

- - Moved `track` request parameter from being a keyward argument to a member of the `parm` dictionary
+ - Moved `track` request parameter from being a keyword argument to a member of the `parm` dictionary

- The `cnf` parameter default changed to 2; the `maxi` parameter default changed to 5

2 changes: 1 addition & 1 deletion docs/rtd/source/release_notes/release-v1-4-2.md
@@ -10,7 +10,7 @@ Version description of the v1.4.2 release of ICESat-2 SlideRule.

- Updated YAPC algorithm to Jeff's latest specification (05-04-22). The new algorithm is the default version that runs. If the previous algorithm is desired, there is a `version` parameter which is a part of the `yapc` parameter block that can be set to "1", and the original algorithm will execute. Note: the new algorithm runs about three times faster than the original one.

- - Updated internal threading hueristic in the Python client:
+ - Updated internal threading heuristic in the Python client:
* client will attempt to throttle the number of concurrent requests to any given processing node
* the ***max_workers*** parameter in the `atl06p` and `atl03sp` APIs has been removed; if the calling application must change the number pending requests per node, then there is a new API `sliderule.set_max_pending` that can be called.

2 changes: 1 addition & 1 deletion docs/rtd/source/release_notes/release-v1-5-x.md
@@ -26,7 +26,7 @@ The SlideRule system underwent a gradual architectural shift from a single publi

- Networking issues on one of the deployments [#6](https://github.com/ICESat2-SlideRule/sliderule-build-and-deploy/issues/6)

- - Node restarted for no discernable reason [#8](https://github.com/ICESat2-SlideRule/sliderule-build-and-deploy/issues/8)
+ - Node restarted for no discernible reason [#8](https://github.com/ICESat2-SlideRule/sliderule-build-and-deploy/issues/8)

- Application Load Balancer adds two seconds of latency [#2](https://github.com/ICESat2-SlideRule/sliderule-build-and-deploy/issues/2)
