Commit f9a3650

Merge branch 'main' of https://github.com/wri/cities-cif into add_economicopportunity_to_osm

weiqi-tori committed Aug 28, 2024
2 parents 7fb392c + 5b28121
Showing 39 changed files with 812 additions and 224 deletions.
2 changes: 0 additions & 2 deletions .devcontainer/Dockerfile

This file was deleted.

20 changes: 20 additions & 0 deletions .github/requirements.txt
@@ -0,0 +1,20 @@
earthengine-api==0.1.408
geocube==0.4.2
geopandas==0.14.1
rioxarray==0.15.0
odc-stac==0.3.8
pystac-client==0.7.5
pytest==7.4.3
xarray-spatial==0.3.7
xee==0.0.3
utm==0.7.0
osmnx==1.9.3
dask[complete]==2023.11.0
matplotlib==3.8.2
s3fs==2024.5.0
geemap==0.32.0
pip==23.3.1
boto3==1.34.124
scikit-learn==1.5.0
overturemaps==0.6.0
git+https://github.com/isciences/exactextract
39 changes: 39 additions & 0 deletions .github/workflows/build-image.yml
@@ -0,0 +1,39 @@
name: build-image
on:
  workflow_dispatch:
jobs:
  build-image:
    name: build-image
    runs-on: ubuntu-22.04
    steps:
      - name: Clean up Ubuntu
        run: |
          sudo rm -rf /usr/share/dotnet
          sudo rm -rf /opt/ghc
          sudo rm -rf "/usr/local/share/boost"
          sudo rm -rf "$AGENT_TOOLSDIRECTORY"
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Get VERSION
        run: echo "VERSION=$(cat VERSION)" >> $GITHUB_ENV
      - name: Build Image
        id: build-image
        uses: redhat-actions/buildah-build@v2
        with:
          image: wri-cities-cif-environment
          tags: latest ${{ env.VERSION }} ${{ github.sha }}
          containerfiles: |
            ./container/Containerfile
      - name: Push image to container registry
        id: push-image-to-registry
        uses: redhat-actions/push-to-registry@v2
        with:
          image: ${{ steps.build-image.outputs.image }}
          tags: ${{ steps.build-image.outputs.tags }}
          registry: ghcr.io/wri
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_PASSWORD }}
      - name: Print image url
        run: echo "Image pushed to ${{ steps.push-image-to-registry.outputs.registry-paths }}"
36 changes: 36 additions & 0 deletions .github/workflows/dev_ci_cd.yml
@@ -0,0 +1,36 @@
name: Dev CIF API CI/CD

on:
  pull_request:

permissions:
  contents: read
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 4
      matrix:
        python-version: ["3.10"]

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install Linux dependencies
        run: |
          sudo apt update
          sudo apt install -y gdal-bin libgdal-dev
      - name: Install Packages
        run: |
          python -m pip install --upgrade pip
          pip install -r .github/requirements.txt
          pip install GDAL==`gdal-config --version`
      - name: Run Tests
        env:
          GOOGLE_APPLICATION_USER: ${{ secrets.GOOGLE_APPLICATION_USER }}
          GOOGLE_APPLICATION_CREDENTIALS: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS }}
        run: |
          pytest tests
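The `Install Packages` step pins the GDAL Python binding to whatever libgdal version apt installed by reading `gdal-config --version`, since a version mismatch between the binding and the shared library fails at build time. A sketch of that trick in isolation (the fallback version string is hypothetical, for machines without `gdal-config` on the PATH):

```shell
# Match the GDAL Python binding to the system libgdal version,
# as the workflow's "Install Packages" step does.
gdal_version=$(gdal-config --version 2>/dev/null || echo "3.4.1")
echo "pip install GDAL==${gdal_version}"
```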
34 changes: 0 additions & 34 deletions .github/workflows/tests.yml

This file was deleted.

45 changes: 31 additions & 14 deletions README.md
@@ -3,45 +3,61 @@
The Cities Indicator Framework (CIF) is a set of Python tools to make it easier to calculate zonal statistics for cities by providing a standardized set of data layers for inputs and a common framework for using those layers to calculate indicators.

## Quick start

* If all you want to do is use the CIF, the quickest way to get started is to use our [WRI Cities Indicator Framework Colab Notebook](https://colab.research.google.com/drive/1PV1H-godxJ6h42p74Ij9sdFh3T0RN-7j#scrollTo=eM14UgpmpZL-)

## Installation
* `pip install git+https://github.com/wri/[email protected]` to install a specific version.
* `pip install git+https://github.com/wri/cities-cif/releases/latest` gives you the latest stable release.
* `pip install git+https://github.com/wri/cities-cif` gives you the main branch, which is not stable.

## PR Review

0. Prerequisites
   1. Git
      * On Windows I recommend WSL: [https://learn.microsoft.com/en-us/windows/wsl/tutorials/wsl-git](https://learn.microsoft.com/en-us/windows/wsl/tutorials/wsl-git)
   2. The GitHub CLI: [https://cli.github.com/](https://cli.github.com/)
      * On MacOS I recommend the Homebrew option
      * If you don't have an ssh key, it will install one for you
   3. Conda (or Mamba) to install dependencies
      * If you have Homebrew: `brew install --cask miniconda`

## Dependencies

There are 2 ways to install dependencies. Choose one...

### Conda

`conda env create -f environment.yml`

### Setuptools

`python setup.py`

NOTE: If you are using this method, you may want to use something like pyenv to manage Python environments.

## Credentials

To run the module:

1. You need access to Google Earth Engine
2. Install the Google Cloud CLI: <https://cloud.google.com/sdk/docs/install>

### Interactive development

For most people working in a notebook or IDE, the script should walk you through an interactive authentication process. You will just need to be logged in to your Google account that has access to GEE in your browser.

### Programmatic access

If you have issues with this or need to run the script as part of an automated workflow, we have a GEE-enabled GCP service account that can be used. Get in touch with Saif or Chris to ask about getting the credentials.

Set the following environment variables:

* GOOGLE_APPLICATION_CREDENTIALS: The path of the GCP credentials JSON file containing your private key.
* GOOGLE_APPLICATION_USER: The email for your GCP user.

For example, you could set the following in your `~/.zshrc` file:

```shell
export GCS_BUCKET=gee-exports
export GOOGLE_APPLICATION_USER=developers@citiesindicators.iam.gserviceaccount.com
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials/file
```
@@ -50,5 +66,6 @@ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials/file

All are welcome to contribute by creating a [Pull Request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests). We try to follow the [Github Flow](https://docs.github.com/en/get-started/quickstart/github-flow) workflow.

See [PR Review](docs/pr_review.md) for more details and options on how to review a PR.

See the [developer docs](docs/developer.md) to learn more about how to add data layers and indicators.
31 changes: 21 additions & 10 deletions city_metrix/__init__.py
@@ -1,26 +1,37 @@
import os
import warnings

import ee

from .metrics import *

# initialize ee
if (
    "GOOGLE_APPLICATION_CREDENTIALS" in os.environ
    and "GOOGLE_APPLICATION_USER" in os.environ
):
    print("Authenticating to GEE with configured credentials file.")
    CREDENTIAL_FILE = os.environ["GOOGLE_APPLICATION_CREDENTIALS"]
    GEE_SERVICE_ACCOUNT = os.environ["GOOGLE_APPLICATION_USER"]
    if CREDENTIAL_FILE.endswith(".json"):
        auth = ee.ServiceAccountCredentials(
            GEE_SERVICE_ACCOUNT, key_file=CREDENTIAL_FILE
        )
    else:
        auth = ee.ServiceAccountCredentials(
            GEE_SERVICE_ACCOUNT, key_data=CREDENTIAL_FILE
        )
    ee.Initialize(auth, opt_url="https://earthengine-highvolume.googleapis.com")
else:
    print("Could not find GEE credentials file, so prompting authentication.")
    ee.Authenticate()
    ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com")


# set for AWS requests
os.environ["AWS_REQUEST_PAYER"] = "requester"

# disable warning messages
warnings.filterwarnings("ignore", module="xee")
warnings.filterwarnings("ignore", module="dask")
warnings.filterwarnings("ignore", module="xarray")
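The branch above chooses between a key file on disk and inline key data based on whether `GOOGLE_APPLICATION_CREDENTIALS` ends in `.json`. A minimal sketch of that selection logic; `resolve_gee_key_kwargs` is a hypothetical helper, not part of the module (the real code passes the chosen keyword straight to `ee.ServiceAccountCredentials`):

```python
def resolve_gee_key_kwargs(credential_value):
    """Mirror the .json check above: a path -> key_file, raw JSON -> key_data."""
    if credential_value.endswith(".json"):
        return {"key_file": credential_value}
    return {"key_data": credential_value}

print(resolve_gee_key_kwargs("/secrets/gee.json"))
print(resolve_gee_key_kwargs('{"type": "service_account"}'))
```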
1 change: 1 addition & 0 deletions city_metrix/layers/__init__.py
@@ -1,4 +1,5 @@
from .albedo import Albedo
from .ndvi_sentinel2_gee import NdviSentinel2
from .esa_world_cover import EsaWorldCover, EsaWorldCoverClass
from .land_surface_temperature import LandSurfaceTemperature
from .tree_cover import TreeCover
2 changes: 1 addition & 1 deletion city_metrix/layers/albedo.py
@@ -13,7 +13,7 @@ def __init__(self, start_date="2021-01-01", end_date="2022-01-01", threshold=Non
        self.threshold = threshold

    def get_data(self, bbox):
        S2 = ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
        S2C = ee.ImageCollection("COPERNICUS/S2_CLOUD_PROBABILITY")

        MAX_CLOUD_PROB = 30
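The layer still masks pixels against `COPERNICUS/S2_CLOUD_PROBABILITY` using the `MAX_CLOUD_PROB = 30` threshold visible above. A minimal NumPy sketch of that thresholding outside Earth Engine (the helper name and sample values are hypothetical):

```python
import numpy as np

MAX_CLOUD_PROB = 30  # same threshold as the albedo layer

def mask_cloudy_pixels(reflectance, cloud_prob, max_prob=MAX_CLOUD_PROB):
    # hypothetical helper: set pixels with cloud probability above the
    # threshold to NaN, keeping everything else unchanged
    out = reflectance.astype(float).copy()
    out[cloud_prob > max_prob] = np.nan
    return out

refl = np.array([0.2, 0.3, 0.25])
prob = np.array([10, 45, 30])
masked = mask_cloudy_pixels(refl, prob)
```

Note that a probability exactly at the threshold is kept, matching a strict `>` comparison.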
5 changes: 0 additions & 5 deletions city_metrix/layers/high_land_surface_temperature.py
@@ -54,8 +54,3 @@ def addDate(image):

# convert to date object
return datetime.datetime.strptime(hottest_date, "%Y%m%d").date()

def write(self, output_path):
self.data.rio.to_raster(output_path)


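The `strptime` call above converts the eight-character `YYYYMMDD` string that GEE returns into a `datetime.date`. A self-contained sketch of that conversion (the date value is illustrative, not taken from real data):

```python
import datetime

# hottest_date is illustrative; the layer derives it from the GEE image collection
hottest_date = "20230715"
parsed = datetime.datetime.strptime(hottest_date, "%Y%m%d").date()
print(parsed)  # 2023-07-15
```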
5 changes: 0 additions & 5 deletions city_metrix/layers/land_surface_temperature.py
@@ -33,8 +33,3 @@ def apply_scale_factors(image):

data = get_image_collection(ee.ImageCollection(l8_st), bbox, 30, "LST").ST_B10_mean
return data

def write(self, output_path):
self.data.rio.to_raster(output_path)


5 changes: 2 additions & 3 deletions city_metrix/layers/landsat_collection_2.py
@@ -29,8 +29,7 @@ def get_data(self, bbox):
fail_on_error=False,
)

# TODO: Determine how to output xarray

        qa_lst = lc2.where((lc2.qa_pixel & 24) == 0)
        return qa_lst.drop_vars("qa_pixel")



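The new mask keeps pixels where `(qa_pixel & 24) == 0`; 24 is binary `11000`, i.e. bits 3 and 4 of the Landsat Collection 2 `QA_PIXEL` band (cloud and cloud shadow, per the Collection 2 bit layout). A NumPy sketch of that bit test with hypothetical QA values:

```python
import numpy as np

# hypothetical QA_PIXEL samples: clear, bit 3 set, bit 4 set, both set, unrelated bit
qa_pixel = np.array([0b00000, 0b01000, 0b10000, 0b11000, 0b00001])

# True where neither bit 3 (cloud) nor bit 4 (cloud shadow) is set
clear = (qa_pixel & 24) == 0
print(clear)
```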
32 changes: 16 additions & 16 deletions city_metrix/layers/layer.py
@@ -18,10 +18,8 @@
import shapely.geometry as geometry
import pandas as pd


MAX_TILE_SIZE = 0.5


class Layer:
    def __init__(self, aggregate=None, masks=[]):
        self.aggregate = aggregate
@@ -56,7 +54,7 @@ def groupby(self, zones, layer=None):
        """
        return LayerGroupBy(self.aggregate, zones, layer, self.masks)

    def write(self, bbox, output_path, tile_degrees=None, **kwargs):
        """
        Write the layer to a path. Does not apply masks.
@@ -301,21 +299,23 @@ def get_image_collection(

return data


def write_layer(path, data):
    if isinstance(data, xr.DataArray):
        write_dataarray(path, data)
    elif isinstance(data, gpd.GeoDataFrame):
        data.to_file(path, driver="GeoJSON")
    else:
        raise NotImplementedError("Can only write DataArray, Dataset, or GeoDataFrame")


def write_dataarray(path, data):
    # for rasters, need to write locally first then copy to cloud storage
    if path.startswith("s3://"):
        tmp_path = f"{uuid4()}.tif"
        data.rio.to_raster(raster_path=tmp_path, driver="COG")

        s3 = boto3.client('s3')
        s3.upload_file(tmp_path, path.split('/')[2], '/'.join(path.split('/')[3:]))

        os.remove(tmp_path)
    else:
        data.rio.to_raster(raster_path=path, driver="COG")
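`write_dataarray` splits an `s3://bucket/key...` path into the bucket (`path.split('/')[2]`) and the remaining key for `boto3`'s `upload_file`. A sketch isolating just that parsing; `split_s3_path` is a hypothetical helper, and the path is an example:

```python
def split_s3_path(path):
    """Replicate the bucket/key split used in write_dataarray above."""
    parts = path.split('/')          # ['s3:', '', bucket, key parts...]
    bucket = parts[2]
    key = '/'.join(parts[3:])
    return bucket, key

print(split_s3_path("s3://my-bucket/layers/albedo/tile.tif"))
```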