Merge pull request #1 from Vizzuality/add-science-project
Add data pipeline with kedro
Showing 40 changed files with 268,970 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,151 @@ | ||
########################## | ||
# KEDRO PROJECT | ||
|
||
# ignore all local configuration | ||
conf/local/** | ||
!conf/local/.gitkeep | ||
|
||
# ignore potentially sensitive credentials files | ||
conf/**/*credentials* | ||
|
||
# ignore everything in the following folders | ||
data/** | ||
|
||
# except their sub-folders | ||
!data/**/ | ||
|
||
# also keep all .gitkeep files | ||
!.gitkeep | ||
|
||
# keep also the example dataset | ||
!data/01_raw/* | ||
|
||
|
||
########################## | ||
# Common files | ||
|
||
# IntelliJ | ||
.idea/ | ||
*.iml | ||
out/ | ||
.idea_modules/ | ||
|
||
### macOS | ||
*.DS_Store | ||
.AppleDouble | ||
.LSOverride | ||
.Trashes | ||
|
||
# Vim | ||
*~ | ||
.*.swo | ||
.*.swp | ||
|
||
# emacs | ||
*~ | ||
\#*\# | ||
/.emacs.desktop | ||
/.emacs.desktop.lock | ||
*.elc | ||
|
||
# JIRA plugin | ||
atlassian-ide-plugin.xml | ||
|
||
# C extensions | ||
*.so | ||
|
||
### Python template | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
.hypothesis/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
.static_storage/ | ||
.media/ | ||
local_settings.py | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# pyenv | ||
.python-version | ||
|
||
# celery beat schedule file | ||
celerybeat-schedule | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
repos: | ||
  - repo: https://github.com/pre-commit/pre-commit-hooks | ||
    rev: v2.3.0 | ||
    hooks: | ||
      - id: check-yaml | ||
      - id: end-of-file-fixer | ||
      - id: trailing-whitespace | ||
|
||
  - repo: https://github.com/astral-sh/ruff-pre-commit | ||
    rev: v0.1.4 | ||
    hooks: | ||
      - id: ruff | ||
        args: [ --fix ] | ||
        types_or: [ python, pyi, jupyter ] | ||
|
||
      - id: ruff-format | ||
        types_or: [ python, pyi, jupyter ] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
consent: false |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# Digital Twins visual communication data processing | ||
|
||
## Overview | ||
|
||
This is your new Kedro project with Kedro-Viz setup, which was generated using `kedro 0.19.6`. | ||
|
||
Take a look at the [Kedro documentation](https://docs.kedro.org) to get started. | ||
|
||
## Pipelines | ||
|
||
The project contains one pipeline for now: `globe` | ||
|
||
### `lowvshigh` | ||
|
||
Pipeline to generate the comparison between low- and high-resolution simulations. Currently it has: | ||
|
||
- a step that splits the nextGEMS global datasets into a set of TIFFs (one per timestep) to use in Blender to render a rotating globe. | ||
- a video generation pipeline for the regions defined in `conf/parameters.yml` | ||
|
||
|
||
## How to install dependencies | ||
|
||
Declare any dependencies in `requirements.txt` for `pip` installation. | ||
|
||
To install them, run: | ||
|
||
``` | ||
pip install -r requirements.txt | ||
``` | ||
|
||
## How to run your Kedro pipeline | ||
|
||
You can run your Kedro project with: | ||
|
||
``` | ||
kedro run | ||
``` | ||
I recommend using the `ParallelRunner` to run the nodes in parallel: | ||
|
||
``` | ||
kedro run --runner=ParallelRunner | ||
``` | ||
|
||
### Run a subset of the pipeline | ||
|
||
Kedro allows you to run a subset of the pipeline by selecting only specific nodes, pipelines, or tags. Check the tags in the pipeline code or in Kedro-Viz. | ||
For example, to run only the detailed video pipelines, use: | ||
|
||
``` | ||
kedro run --runner=ParallelRunner --tags zoomin | ||
``` | ||
|
||
|
||
## Kedro-Viz | ||
|
||
Visualize the pipeline with: | ||
|
||
``` | ||
kedro viz | ||
``` | ||
|
||
|
||
## Rules and guidelines | ||
|
||
In order to get the best out of the template: | ||
|
||
* Don't remove any lines from the `.gitignore` file we provide | ||
* Make sure your results can be reproduced by following a [data engineering convention](https://docs.kedro.org/en/stable/faq/faq.html#what-is-data-engineering-convention) | ||
* Don't commit data to your repository | ||
* Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in `conf/local/` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# What is this for? | ||
|
||
This folder should be used to store configuration files used by Kedro or by separate tools. | ||
|
||
This file can be used to provide users with instructions for how to reproduce local configuration with their own credentials. You can edit the file however you like, but you may wish to retain the information below and add your own section in the section titled **Instructions**. | ||
|
||
## Local configuration | ||
|
||
The `local` folder should be used for configuration that is either user-specific (e.g. IDE configuration) or protected (e.g. security keys). | ||
|
||
> *Note:* Please do not check in any local configuration to version control. | ||
## Base configuration | ||
|
||
The `base` folder is for shared configuration, such as non-sensitive and project-related configuration that may be shared across team members. | ||
|
||
WARNING: Please do not put access credentials in the base configuration folder. | ||
|
||
## Find out more | ||
You can find out more about configuration from the [user guide documentation](https://docs.kedro.org/en/stable/configuration/configuration_basics.html). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
# ============== WINDSPEED ================ | ||
|
||
wind_speed_global_100km.raw: | ||
  type: kedro_datasets_experimental.netcdf.NetCDFDataset | ||
  filepath: data/01_raw/nextgems/ws_global_100km.nc | ||
|
||
wind_speed_global_10km.raw: | ||
  type: kedro_datasets_experimental.netcdf.NetCDFDataset | ||
  filepath: data/01_raw/nextgems/ws_global_10km.nc | ||
|
||
wind_speed_global_100km.parts: | ||
  type: partitions.PartitionedDataset | ||
  path: data/03_primary/ws-100-parts | ||
  dataset: | ||
    type: kedro_datasets_experimental.rioxarray.GeoTIFFDataset | ||
    save_args: | ||
      compress: zstd | ||
  filename_suffix: ".tif" | ||
|
||
wind_speed_global_10km.parts: | ||
  type: partitions.PartitionedDataset | ||
  path: data/03_primary/ws-10-parts | ||
  dataset: | ||
    type: kedro_datasets_experimental.rioxarray.GeoTIFFDataset | ||
    save_args: | ||
      compress: zstd | ||
  filename_suffix: ".tif" | ||
|
||
|
||
|
||
|
||
# ============== CLOUD COVER ================ | ||
|
||
|
||
cloud_cover_10km.raw: | ||
  type: kedro_datasets_experimental.netcdf.NetCDFDataset | ||
  filepath: data/01_raw/nextgems/lcc_global_10km.nc | ||
|
||
cloud_cover_100km.raw: | ||
  type: kedro_datasets_experimental.netcdf.NetCDFDataset | ||
  filepath: data/01_raw/nextgems/lcc_global_100km.nc | ||
|
||
|
||
cloud_cover_10km.parts: | ||
  type: partitions.PartitionedDataset | ||
  path: data/02_intermediate/amazonia-10-parts | ||
  dataset: | ||
    type: kedro_datasets_experimental.rioxarray.GeoTIFFDataset | ||
    save_args: | ||
      compress: zstd | ||
  filename_suffix: ".tif" | ||
|
||
cloud_cover_100km.parts: | ||
  type: partitions.PartitionedDataset | ||
  path: data/02_intermediate/amazonia-100-parts | ||
  dataset: | ||
    type: kedro_datasets_experimental.rioxarray.GeoTIFFDataset | ||
    save_args: | ||
      compress: zstd | ||
  filename_suffix: ".tif" | ||
|
||
cloud_cover_10km.video: | ||
  type: video.VideoDataset | ||
  filepath: data/03_primary/cloud_cover_10km.mp4 | ||
|
||
cloud_cover_100km.video: | ||
  type: video.VideoDataset | ||
  filepath: data/03_primary/cloud_cover_100km.mp4 | ||
|
||
# ============== PRECIPITATION ================ | ||
|
||
|
||
total_precipitation_10km.raw: | ||
  type: kedro_datasets_experimental.netcdf.NetCDFDataset | ||
  filepath: data/01_raw/nextgems/tp_global_10km.nc | ||
|
||
total_precipitation_100km.raw: | ||
  type: kedro_datasets_experimental.netcdf.NetCDFDataset | ||
  filepath: data/01_raw/nextgems/tp_global_100km.nc | ||
|
||
total_precipitation_10km.parts: | ||
  type: partitions.PartitionedDataset | ||
  path: data/02_intermediate/hurricane-10-parts | ||
  dataset: | ||
    type: kedro_datasets_experimental.rioxarray.GeoTIFFDataset | ||
    save_args: | ||
      compress: zstd | ||
  filename_suffix: ".tif" | ||
|
||
total_precipitation_100km.parts: | ||
  type: partitions.PartitionedDataset | ||
  path: data/02_intermediate/hurricane-100-parts | ||
  dataset: | ||
    type: kedro_datasets_experimental.rioxarray.GeoTIFFDataset | ||
    save_args: | ||
      compress: zstd | ||
  filename_suffix: ".tif" | ||
|
||
|
||
total_precipitation_10km.video: | ||
  type: video.VideoDataset | ||
  filepath: data/03_primary/tp_global_10km.mp4 | ||
|
||
total_precipitation_100km.video: | ||
  type: video.VideoDataset | ||
  filepath: data/03_primary/tp_global_100km.mp4 |
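
The `.parts` entries in this catalog are `PartitionedDataset`s, so a node that writes to one of them is expected to return a dictionary mapping partition IDs to data. Kedro then saves one file per key and appends the configured `filename_suffix` (here `".tif"`). The sketch below illustrates that contract only; the project's actual node is not shown in this diff. The helper name `split_by_timestep`, the timestep labels, and the nested lists standing in for real rasters are all assumptions made for illustration.

```python
from typing import Any, Callable, Iterable


def split_by_timestep(
    frames: Iterable[tuple[str, Any]],
) -> dict[str, Callable[[], Any]]:
    """Map each timestep label to a zero-argument callable returning its frame.

    Hypothetical helper: returning callables instead of the data itself lets
    a PartitionedDataset defer producing each partition until it is saved.
    """
    partitions: dict[str, Callable[[], Any]] = {}
    for label, frame in frames:
        # Bind `frame` as a default argument so every callable keeps its own
        # frame rather than the last one seen by the loop.
        partitions[label] = lambda frame=frame: frame
    return partitions


# Stand-in data: two timesteps, each a tiny 1x2 "raster" (assumed values).
frames = [
    ("2020-01-01T00", [[1.0, 2.0]]),
    ("2020-01-01T03", [[3.0, 4.0]]),
]
partitions = split_by_timestep(frames)
```

With a catalog entry such as `wind_speed_global_100km.parts`, each key would become one file under the entry's `path`. In a real node the values would be objects that the configured `GeoTIFFDataset` can save, rather than plain lists.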