Update data folder structure and improve README

Vizzuality · Jul 26, 2024 · 4bc7e3d
1 parent 6b98fd5
commit 4bc7e3d

Showing 12 changed files with 662 additions and 62 deletions.
diff --git a/science/README.md b/science/README.md
@@ -10,20 +10,13 @@ Take a look at the [Kedro documentation](https://docs.kedro.org) to get started.
 
 The project contains one pipeline for now: `globe`
 
-### `global`
+### `lowvshigh`
 
-Pipeline to split nextgems global datasets (low and high resolution) into a set of tiffs (one per timestep) to use in blender to render a rotating globe.
+Pipeline to generate the comparison between low and high resolution simulations. It currently includes:
 
-### 
+- a step that splits the nextgems global datasets into a set of tiffs (one per timestep) to use in blender to render a rotating globe.
+- a video generation pipeline for the regions defined in `conf/parameters.yml`
 
-## Rules and guidelines
-
-In order to get the best out of the template:
-
-* Don't remove any lines from the `.gitignore` file we provide
-* Make sure your results can be reproduced by following a [data engineering convention](https://docs.kedro.org/en/stable/faq/faq.html#what-is-data-engineering-convention)
-* Don't commit data to your repository
-* Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in `conf/local/`
 
 ## How to install dependencies
 
@@ -42,51 +35,36 @@ You can run your Kedro project with:
 ```
 kedro run
 ```
-
-## How to test your Kedro project
-
-Have a look at the files `src/tests/test_run.py` and `src/tests/pipelines/data_science/test_pipeline.py` for instructions on how to write your tests. Run the tests as follows:
+I recommend using the `ParallelRunner` to run the nodes in parallel:
 
 ```
-pytest
+kedro run --runner=ParallelRunner
 ```
 
-To configure the coverage threshold, look at the `.coveragerc` file.
-
-## Project dependencies
-
-To see and update the dependency requirements for your project use `requirements.txt`. Install the project requirements with `pip install -r requirements.txt`.
+### Run a subset of the pipeline
 
-[Further information about project dependencies](https://docs.kedro.org/en/stable/kedro_project_setup/dependencies.html#project-specific-dependencies)
-
-## How to work with Kedro and notebooks
-
-> Note: Using `kedro jupyter` or `kedro ipython` to run your notebook provides these variables in scope: `catalog`, `context`, `pipelines` and `session`.
->
-> Jupyter, JupyterLab, and IPython are already included in the project requirements by default, so once you have run `pip install -r requirements.txt` you will not need to take any extra steps before you use them.
-
-### Jupyter
+Kedro allows running a subset of the pipeline by selecting only specific nodes, pipelines or tags. Check the tags in the pipeline code or in Kedro viz.
+For example, to run only the detailed video pipelines, use:
 
 ```
-kedro jupyter notebook
+kedro run --runner=ParallelRunner --tags zoomin
 ```
 
-You can also start JupyterLab:
 
-```
-kedro jupyter lab
-```
+## Kedro viz
 
-### IPython
-And if you want to run an IPython session:
+Visualize the pipeline with
 
 ```
-kedro ipython
+kedro viz
 ```
 
-### How to ignore notebook output cells in `git`
-To automatically strip out all output cell contents before committing to `git`, you can use tools like [`nbstripout`](https://github.com/kynan/nbstripout). For example, you can add a hook in `.git/config` with `nbstripout --install`. This will run `nbstripout` before anything is committed to `git`.
 
-> *Note:* Your output cells will be retained locally.
+## Rules and guidelines
 
-[Further information about using notebooks for experiments within Kedro projects](https://docs.kedro.org/en/develop/notebooks_and_ipython/kedro_and_notebooks.html).
+In order to get the best out of the template:
+
+* Don't remove any lines from the `.gitignore` file we provide
+* Make sure your results can be reproduced by following a [data engineering convention](https://docs.kedro.org/en/stable/faq/faq.html#what-is-data-engineering-convention)
+* Don't commit data to your repository
+* Don't commit any credentials or your local configuration to your repository. Keep all your credentials and local configuration in `conf/local/`
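
Note on the subset-run command added to the README above: when the same run has to be launched from a script for different tags, the command line can be assembled programmatically. The following is a minimal sketch, not part of this commit. The helper name `build_kedro_run_command` is hypothetical, and joining several tags with commas is an assumption about how the installed Kedro version parses `--tags`; only the `--runner` and `--tags` options and the `zoomin` tag come from the README.

```python
from __future__ import annotations

import shlex
from typing import Optional, Sequence


def build_kedro_run_command(
    tags: Sequence[str] = (),
    runner: Optional[str] = "ParallelRunner",
) -> list[str]:
    """Assemble the argument list for a `kedro run` invocation.

    Hypothetical helper for illustration; it only builds the command
    and does not execute it.
    """
    command = ["kedro", "run"]
    if runner:
        # Same form as in the README: --runner=ParallelRunner
        command.append(f"--runner={runner}")
    if tags:
        # Assumption: several tags are passed as one comma-separated value.
        command += ["--tags", ",".join(tags)]
    return command


if __name__ == "__main__":
    # Print the command as a shell-safe string instead of running it.
    print(shlex.join(build_kedro_run_command(tags=["zoomin"])))
```

The returned list can be handed to `subprocess.run(..., check=True)` from the `science/` directory if the run should actually be started.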
diff --git a/science/data/02_intermediate/amazonia-10-parts/.gitkeep b/science/data/02_intermediate/amazonia-10-parts/.gitkeep
diff --git a/science/data/02_intermediate/amazonia-100-parts/.gitkeep b/science/data/02_intermediate/amazonia-100-parts/.gitkeep
diff --git a/science/data/02_intermediate/hurricane-10-parts/.gitkeep b/science/data/02_intermediate/hurricane-10-parts/.gitkeep
diff --git a/science/data/02_intermediate/hurricane-100-parts/.gitkeep b/science/data/02_intermediate/hurricane-100-parts/.gitkeep
diff --git a/science/data/03_primary/ws-10-parts/.gitkeep b/science/data/03_primary/ws-10-parts/.gitkeep
diff --git a/science/data/03_primary/ws-100-parts/.gitkeep b/science/data/03_primary/ws-100-parts/.gitkeep
diff --git a/science/requirements.in b/science/requirements.in
@@ -0,0 +1,18 @@
+ipython>=8.10
+jupyterlab>=3.0
+kedro~=0.19.6
+kedro-datasets>=3.0; python_version >= "3.9"
+kedro-datasets>=1.0; python_version < "3.9"
+kedro-datasets[netcdf, rioxarray]
+kedro-telemetry>=0.3.1
+kedro-viz>=6.7.0
+notebook
+pytest~=7.2
+pytest-cov~=3.0
+pytest-mock>=1.7.1, <2.0
+ruff~=0.1.8
+matplotlib
+cartopy
+scikit-image
+pillow
+opencv-python