Road map (#330)

Adds a roadmap to the documentation
janelia-cellmap · Nov 12, 2024 · 975b8b8
2 parents 460cf2a + 637859a
commit 975b8b8

Showing 2 changed files with 79 additions and 1 deletion.
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -15,8 +15,9 @@
    docker
    aws
    cosem_starter
+   roadmap
    autoapi/index
    cli
 
 .. include:: ../../README.md
-   :parser: myst_parser.sphinx_
+   :parser: myst_parser.sphinx_
diff --git a/docs/source/roadmap.rst b/docs/source/roadmap.rst
@@ -0,0 +1,77 @@
+.. _sec_roadmap:
+
+Road Map
+========
+
+Overview
+--------
+
++-----------------------------------+------------------+-------------------------------+
+| Task                              | Priority         | Current State                 |
++===================================+==================+===============================+
+| Write Documentation               | High             | Started with a long way to go |
++-----------------------------------+------------------+-------------------------------+
+| Simplify configurations           | High             | First draft complete          |
++-----------------------------------+------------------+-------------------------------+
+| Develop Data Conventions          | High             | First draft complete          |
++-----------------------------------+------------------+-------------------------------+
+| Improve Blockwise Post-Processing | Low              | Not Started                   |
++-----------------------------------+------------------+-------------------------------+
+| Simplify Array handling           | High             | Almost done (Up/Down sampling)|
++-----------------------------------+------------------+-------------------------------+
+
+Detailed Road Map
+-----------------
+
+ - [ ] Write Documentation
+     - [ ] tutorials: not more than three, simple and continuously tested (with GitHub Actions; a small U-Net on CPU could work)
+         - [x] Basic tutorial: train a U-Net on a toy dataset
+           - [ ] Parametrize the basic tutorial across tasks (instance/semantic segmentation).
+           - [ ] Improve visualizations. Move some simple plotting functions to DaCapo.
+           - [ ] Add a pure PyTorch implementation to show benefits side-by-side
+           - [ ] Track performance metrics (e.g., loss, accuracy) so we can make sure we aren't regressing
+         - [ ] semantic segmentation (LM and EM)
+         - [ ] instance segmentation (LM or EM, can be simulated)
+     - [ ] general documentation of CLI, also API for developers (curate docstrings)
+ - [x] Simplify configurations
+     - [x] Deprecate old configs
+     - [x] Add simplified config for simple cases
+     - [x] can still get rid of `*Config` classes
+ - [x] Develop Data Conventions
+     - [x] document conventions
+     - [ ] convenience scripts to convert dataset into our convention (even starting from directories of PNG files)
+ - [ ] Improve Blockwise Post-Processing
+     - [ ] De-duplicate code between “in-memory” and “block-wise” processing
+         - [ ] have only block-wise algorithms, use those also for “in-memory”
+         - [ ] no more “in-memory”, this is just a run with a different Compute Context
+     - [ ] Incorporate `volara` into DaCapo (embargo until January)
+     - [ ] Improve debugging support (logging of chain of commands for reproducible runs)
+     - [ ] Split long post-processing steps into several smaller ones for composability (e.g., support running each step independently if we want to support choosing between `waterz` and `mutex_watershed` for fragment generation or agglomeration)
+ - [x] Incorporate `funlib.persistence` adapters.
+     - [x] all of those can be adapters:
+         - [x] Binarize Labels into Mask
+         - [x] Scale/Shift intensities
+         - [ ] Up/Down sample (if easily possible)
+         - [ ] DVID source
+         - [x] Datatype conversions
+         - [x] everything else
+     - [x] simplify array configs accordingly
+
+Can Have
+--------
+
+ - [ ] Support other stats stores. Too much time, effort, and code went into the stats, and the result still lacks a clean interface for:
+     - [ ] defining variables to store
+     - [ ] efficiently batch-writing, storing, and reading stats in both files and MongoDB
+     - [ ] visualizing stats.
+     - [ ] Jeff and Marwan suggest MLFlow instead of WandB
+ - [ ] Support for Slurm clusters
+ - [ ] Support for cloud computing (AWS)
+ - [ ] Lazy loading of dependencies (import takes too long)
+ - [ ] Support bioimage model spec for model dissemination
+
+Non-Goals (for v1.0)
+--------------------
+
+- custom dashboard
+- GUI to run experiments