Commit

Merge pull request #78 from USEPA/62-review-edits-harmonize_pensacolarmd
62 review edits harmonize pensacolarmd
jbousquin authored Jul 5, 2024
2 parents b125f65 + 0adbba1 commit 054c2a5
Showing 2 changed files with 100 additions and 75 deletions.
73 changes: 42 additions & 31 deletions .github/workflows/test_r.yaml
@@ -21,48 +21,59 @@ jobs:

strategy:
matrix:
os: [ubuntu-latest, macos-latest]
os: [windows-latest, macos-latest]
python-version: ['3.8', '3.9', '3.10', '3.11']
include:
- os: windows-latest
python-version: "3.9"
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
- uses: actions/checkout@v4

- name: setup python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}

- name: Set up R
uses: r-lib/actions/setup-r@v2
with:
r-version: 'release'

- uses: r-lib/actions/setup-pandoc@v2

- name: Update pip and install testing pkgs
run: |
python -VV
python -m pip install --upgrade pip
pip install pytest
#- uses: r-lib/actions/setup-renv@v2

# fiona doesn't have wheels for windows
- if: matrix.os == 'windows-latest'
run: |
pip install https://github.com/cgohlke/geospatial-wheels/releases/download/v2023.7.16/GDAL-3.7.1-cp39-cp39-win_amd64.whl
pip install https://github.com/cgohlke/geospatial-wheels/releases/download/v2023.7.16/Fiona-1.9.4.post1-cp39-cp39-win_amd64.whl
- name: Install package and dependencies
run: |
python -m pip install --no-deps .
pip install -r requirements.txt
- uses: r-lib/actions/setup-pandoc@v2

- name: Run pip env using R reticulate
run: |
install.packages("reticulate")
reticulate::import("harmonize_wq")
- name: R depends
shell: Rscript {0}
run: |
install.packages(c("knitr", "rmarkdown", "reticulate"))
- name: Run pytest
run: pytest -v harmonize_wq
- name: setup r-reticulate venv & render rmd
shell: Rscript {0}
run: |
library(reticulate)
packages = c(
"pytest", "numpy<2.0", "pandas<2.0", "geopandas>=0.10.2, <0.13", "pint>=0.18",
"dataretrieval>=1.0, <1.0.5", "requests"
)
reticulate::install_miniconda()
reticulate::conda_create("wq_harmonize", python_version = "${{ matrix.python-version }}")
reticulate::conda_install("wq_harmonize", packages)
#path_to_venv <- virtualenv_create(
# envname = "wq_harmonize",
# python = Sys.which("python"), # placed on PATH by the setup-python action
# packages
#)
#use_virtualenv("wq_harmonize")
reticulate::py_install("git+https://github.com/USEPA/harmonize-wq.git", pip = TRUE, envname = "wq_harmonize")
rmarkdown::render(input = "demos/Harmonize_Pensacola.Rmd")
- name: Upload artifact
if: ${{ (matrix.os == 'windows-latest') && (matrix.python-version == 3.11) }}
uses: actions/upload-artifact@v4
with:
name: demos-artifact
# Upload entire demos folder
path: './demos'
102 changes: 58 additions & 44 deletions demos/Harmonize_Pensacola.Rmd
@@ -55,6 +55,13 @@ data comparability and any thresholds for acceptance or rejection.

## Installation & Setup

Using R/reticulate requires an installation of Python to bind to, and EPA's harmonize-wq package must be run from a python environment with the packages it depends on.

For environment management, reticulate requires either conda (recommended) or virtualenv.

There are multiple installers available for [Conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html).
The examples use miniforge3, one of several versions of [miniforge](https://github.com/conda-forge/miniforge). In R, [miniconda](https://docs.anaconda.com/miniconda/) can be installed using reticulate::install_miniconda().

#### Option 1: Install the harmonize-wq Package Using the Command Line

To install and set up the harmonize-wq package using the command line:
@@ -64,82 +71,86 @@ To install and set up the harmonize-wq package using the command line:
miniforge is installed. Go to your start menu and open the Miniforge
Prompt.
2. At the Miniforge Prompt, run:
- to update conda:

> conda update -n base -c conda-forge conda
- conda create --name wq_harmonize
- activate wq_harmonize
- conda install geopandas pip dataretrieval pint
- may need to update conda
- conda update -n base -c conda-forge conda
- pip install harmonize-wq
- pip install git+<https://github.com/USEPA/harmonize-wq.git> (dev
version)

- conda activate wq_harmonize

- conda install dependencies (from requirements.txt):

> conda install "numpy<2.0" "pandas<2.0" "geopandas>=0.10.2, <0.13" "pint>=0.18" "dataretrieval>=1.0, <1.0.5" "pip"
- pip install harmonize-wq (dev-version shown):

> pip install git+<https://github.com/USEPA/harmonize-wq.git>
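
After the install steps above, it can help to confirm that the active environment actually honours the version pins. The following is a minimal sketch, not part of harmonize-wq: the helper names (`as_tuple`, `satisfies`, `check_environment`) are hypothetical, the pins are copied from the `conda install` line, and the version comparison is deliberately simplified (it only handles `>=` and `<`, and treats suffixes such as `rc1` or `post1` as the plain release).

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the conda install line; adjust if requirements.txt changes.
PINS = {
    "numpy": ["<2.0"],
    "pandas": ["<2.0"],
    "geopandas": [">=0.10.2", "<0.13"],
    "pint": [">=0.18"],
    "dataretrieval": [">=1.0", "<1.0.5"],
}


def as_tuple(ver):
    """Turn '1.0.5' into (1, 0, 5), stopping at the first non-numeric part."""
    parts = []
    for piece in ver.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed, spec):
    """Check one simplified specifier ('>=X' or '<X') against a version string."""
    op, wanted = re.match(r"(>=|<)\s*(.+)", spec).groups()
    a, b = as_tuple(installed), as_tuple(wanted)
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b if op == ">=" else a < b


def check_environment(pins=PINS):
    """Return a list of human-readable problems; an empty list means all pins hold."""
    problems = []
    for name, specs in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        for spec in specs:
            if not satisfies(installed, spec):
                problems.append(f"{name} {installed} violates {spec}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["all pins satisfied"]:
        print(line)
```

Run it with the `wq_harmonize` environment active; any line other than "all pins satisfied" points at a package to reinstall.
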
#### Option 2: Install the harmonize-wq Package Using R

**Alternatively**, you may be able to set up your environment and import
the required Python packages using R.

First, run the chunk below to install the reticulate package to use Python in R.
First, run the chunk below to install the reticulate package to use
Python in R.

```{r, results = 'hide'}
install.packages("reticulate")
install.packages("reticulate", repos = "http://cran.us.r-project.org")
library(reticulate)
```

Conda is required to use EPA's harmonize-wq package.

There are multiple installers available for Conda
(see: <https://conda.io/projects/conda/en/latest/user-guide/install/index.html>).

One example installer is
[miniforge](https://github.com/conda-forge/miniforge). We use miniforge3 in this
example.

Once miniforge3 (or another installer of your choice) is installed, the
reticulate package will automatically look for the installation of Conda (conda.exe)
on your computer.
Once miniforge3 (or another Conda installer of your choice) is
installed, the reticulate package will automatically look for the
installation of Conda (conda.exe) on your computer.

```{r, results = 'hide'}
# options(reticulate.conda_binary = 'dir')
```

However, you may still need to specify the location. If needed, update the code chuck below to specify the location of conda.exe on your computer.
However, you may still need to specify the conda.exe location. To do so,
update the last line of code below to specify your conda.exe location

```{r, results = 'hide'}
# update the 'dir' in this chuck to specify the location of conda.exe on your computer
# Note that the environment name may need to include the full path (e.g. "C:/Users/USERNAME/AppData/Local/miniforge3/Scripts/conda.exe")
options(reticulate.conda_binary = "~/AppData/Local/miniforge3/Scripts/conda.exe")
# Note: that the environment name may need to include the full path (e.g. "C:/Users/<USERNAME>/AppData/Local/miniforge3/Scripts/conda.exe")
# options(reticulate.conda_binary = "~/AppData/Local/miniforge3/Scripts/conda.exe")
```
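
If it is unclear which path to pass to `options(reticulate.conda_binary = ...)`, one way to look for a candidate is to ask Python from a terminal where conda is already usable. This is only a sketch under assumptions: `find_conda` is a hypothetical helper, it relies on the `CONDA_EXE` variable that conda's activation scripts set and on the `PATH` lookup, and it does not reproduce reticulate's own search logic.

```python
import os
import shutil
from typing import Optional


def find_conda() -> Optional[str]:
    """Return a path to the conda executable, or None if none is found."""
    # CONDA_EXE is set by conda's activation scripts in an activated shell.
    candidate = os.environ.get("CONDA_EXE")
    if candidate and os.path.isfile(candidate):
        return candidate
    # Otherwise fall back to whatever "conda" resolves to on PATH.
    return shutil.which("conda")


if __name__ == "__main__":
    print(find_conda() or "conda not found; set the path manually")
```
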

Next, update the code chunk below to create a new Python environment in the envs
folder on your computer called "wq_harmonize".
Next, update the code chunk below to create a new Python environment in
the envs folder called "wq_harmonize". Note that the environment name
may need to include the full path (e.g.
"C:/Users/<USERNAME>/AppData/Local/miniforge3/envs/wq_harmonize")

```{r, results = 'hide'}
# Note that the environment name may need to include the full path (e.g. "C:/Users/USERNAME/AppData/Local/miniforge3/envs/wq_harmonize")
reticulate::conda_create("~/AppData/Local/miniforge3/envs/wq_harmonize")
# reticulate::conda_create("~/AppData/Local/miniforge3/envs/wq_harmonize")
```

Install the following python and R packages to the newly created
Python environment called "wq_harmonize".
Install the following python packages to the newly created Python
environment called "wq_harmonize".

```{r, results = 'hide'}
reticulate::conda_install("wq_harmonize", "geopandas") # Python package
reticulate::conda_install("wq_harmonize", "pint") # Python package
reticulate::conda_install("wq_harmonize", "dataretrieval") # R package
# packages = c(
# "numpy<2.0", "pandas<2.0", "geopandas>=0.10.2, <0.13", "pint>=0.18",
# "dataretrieval>=1.0, <1.0.5", "pip"
# )
# reticulate::conda_install("wq_harmonize", packages)
```

Install EPA's harmonize-wq package.
Uncomment to install EPA's harmonize-wq package most recent release or
development version.

```{r, results = 'hide'}
# Install the most recent release of the harmonize-wq package
# This only works with py_install() (pip = TRUE), which defaults to use virtualenvs
reticulate::py_install("harmonize-wq", pip = TRUE, envname = "wq_harmonize")
# reticulate::py_install("harmonize-wq", pip = TRUE, envname = "wq_harmonize")
# Uncomment below to install the development version of harmonize-wq from GitHub instead (optional)
# py_install("git+https://github.com/USEPA/harmonize-wq.git@new_release_0-3-8", pip = TRUE, envname = "wq_harmonize")
# Install the development version of harmonize-wq from GitHub (optional)
#py_install("git+https://github.com/USEPA/harmonize-wq.git", pip = TRUE, envname = "wq_harmonize")
```

Specify the Python environment to be used, "wq_harmonize", and test that your Python
environment is set up correctly.
Specify the Python environment to be used, "wq_harmonize", and test that
your Python environment is set up correctly.

```{r}
# Specify environment to be used
@@ -167,6 +178,7 @@ import geopandas
import dataretrieval.wqp as wqp
import pint
import mapclassify
import matplotlib.pyplot as plt
from harmonize_wq import harmonize
from harmonize_wq import convert
from harmonize_wq import wrangle
@@ -190,8 +202,8 @@ reticulate::repl_python()
```

First, determine an area of interest (AOI), build a query, and retrieve
water temperature and Secchi disk depth data from the Water Quality Portal (WQP)
for the AOI using the dataretrieval package:
water temperature and Secchi disk depth data from the Water Quality
Portal (WQP) for the AOI using the dataretrieval package:

```{python, error = F}
# File for area of interest (Pensacola and Perdido Bays, FL)
@@ -211,11 +223,11 @@ res_narrow, md_narrow = wqp.get_results(**query)
res_narrow
```

Next, harmonize and clean all results using the harmonize.harmonize_all,
Next, harmonize and clean all results using the harmonize.harmonize_all,
clean.datetime, and clean.harmonize_depth functions.

Enter a ? followed by the function name, for example ?harmonize.harmonize_all,
into the console for more details.
Enter a ? followed by the function name, for example
?harmonize.harmonize_all, into the console for more details.

```{python, error = F}
df_harmonized = harmonize.harmonize_all(res_narrow, errors = 'raise')
@@ -257,5 +269,7 @@ stations_gdf, stations, site_md = location.get_harmonized_stations(query, aoi=ao
# Map average temperature results at each station
gdf_temperature = visualize.map_measure(df_wide, stations_gdf, 'Temperature')
plt.figure()
gdf_temperature.plot(column = 'mean', cmap = 'OrRd', legend = True)
plt.show()
```
