todo.org

Random thoughts

  • How can the data pipeline be more disciplined?
    • Set up best practices
  • Have one pipeline which is always running, and perhaps another script to send messages to the main pipeline; this idea was regarding scheduling.
  • The pipeline is already implementation-agnostic, but how does that translate to a scalable system? Perhaps add another layer which is specific to the library on top of which a model is to be executed?
  • Online and offline pipelines? The current system is designed for offline training.
  • How can the deployment process be automated using the pipeline?
  • Better version control for the models and data/metadata.
  • Group a set of files as an experiment.

Add a function that will be executed at the end of the loop, where I can add stuff like moving files, etc.
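
A rough sketch of what such an end-of-loop hook could look like; the `post_loop` argument, `run_pipeline` function, and version fields are placeholders, not the actual API:

#+BEGIN_SRC python
import shutil


def archive_outputs(experiment_dir, version_name):
    # Placeholder hook: move or copy files produced by the run once the loop finishes.
    shutil.copytree(experiment_dir, f"archive/{version_name}", dirs_exist_ok=True)


def run_pipeline(versions, train_fn, eval_fn, post_loop=None):
    for version in versions:
        train_fn(version)
        eval_fn(version)
        if post_loop is not None:
            # User-supplied function executed at the end of each loop iteration.
            post_loop(version["experiment_dir"], version["name"])
#+END_SRC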

Commit following each train and eval loop.

Is there a separate need for MODEL_DIR_SUFFIX? yes!

Rethink allow_delete_model_dir

Rethink how the training time is recorded to ensure a model that ended up failing to train can be relaunched. Or is this even a good behaviour to have?

Let the user override the checking-modified-time behaviour

mlflow integration [8/8]

Will be using the mlflow tracking interface
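
A minimal sketch of how the tracking interface could be used, assuming the experiment name is the running script's name and each version maps to a run (all values below are placeholders):

#+BEGIN_SRC python
import os
import sys

import mlflow

# Experiment name = name of the script being run.
mlflow.set_experiment(os.path.basename(sys.argv[0]))

# Each version of the experiment becomes a run.
with mlflow.start_run(run_name="version-1"):
    mlflow.log_params({"lr": 0.001, "batch_size": 32})    # version values as run parameters
    mlflow.log_metrics({"eval_loss": 0.42, "eval_acc": 0.9})
    mlflow.log_artifact("config.yaml")                     # copied files as artifacts
#+END_SRC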

The experiment name will be the name of the script running [4/4]

A script being launched by the subprocess is to be considered an experiment

Each experiment can have a different set of versions, each of which will be represented by a run.

Can an experiment have versions with the same name? Preferably not

  • Old versions will be marked as deleted in mlflow. That is based on the assumption that the version is being overwritten.
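
A sketch of how overwritten versions could be marked deleted through the tracking client, assuming the run name carries the version name (not necessarily how it is actually done):

#+BEGIN_SRC python
from mlflow.tracking import MlflowClient


def delete_old_versions(experiment_name, version_name):
    client = MlflowClient()
    experiment = client.get_experiment_by_name(experiment_name)
    if experiment is None:
        return
    # Find existing runs with the same version name and mark them deleted.
    runs = client.search_runs(
        [experiment.experiment_id],
        filter_string=f"tags.mlflow.runName = '{version_name}'",
    )
    for run in runs:
        client.delete_run(run.info.run_id)
#+END_SRC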

Are the mlruns files to be committed to git? YES, for now

The metric_container will record everything that is logged when log_metrics is called

The train and eval functions are expected to return a metric_container which will be logged both normally and to mlflow

  • The training output will not be logged by mlflow. Only the eval outputs will be logged, since that is what we want to look at. If someone wants the train output, they'd have to log it during the run.
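
A hypothetical sketch of logging the returned metric_container to both outputs; the `MetricContainer` shape here is an assumption, not the actual class:

#+BEGIN_SRC python
import logging

import mlflow

log = logging.getLogger("mlpipeline")


class MetricContainer:
    """Hypothetical container accumulating metrics during a loop."""

    def __init__(self):
        self.metrics = {}

    def log_metrics(self, **metrics):
        self.metrics.update(metrics)


def log_eval_output(metric_container, step):
    # Logged "normally" (console/file) ...
    log.info("eval metrics at step %s: %s", step, metric_container.metrics)
    # ... and to mlflow. Train output is intentionally skipped.
    mlflow.log_metrics(metric_container.metrics, step=step)
#+END_SRC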

The UI will be launched alongside the pipeline. Also allow launching the UI separately through mlpipeline.

  • CLOSING NOTE [2019-07-15 Mon 18:38]
    This makes no sense if the tracking uri is set to a remote server
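
A sketch of launching the UI as a subprocess next to the pipeline; as the note above says, this only makes sense for a local tracking store (port and invocation details are assumptions):

#+BEGIN_SRC python
import subprocess


def launch_mlflow_ui(port=5000):
    # Launch the mlflow UI in the background; return the process handle
    # so the pipeline can terminate it on shutdown.
    return subprocess.Popen(["mlflow", "ui", "--port", str(port)])
#+END_SRC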

The files copied will also be logged as mlflow artifacts

The version will log the values passed through it as parameters of the run

Tag whether a model is being trained or has finished training.

  • CLOSING NOTE [2019-07-15 Mon 18:39]
    mlflow runs already have a runstatus

tensorboardx integration

Refactors [3/3]

Rename model to experiment

Versions use easydict

Reduce the dependencies on Versions.

  • CLOSING NOTE [2019-07-15 Mon 18:39]

Add a separate export mode

  • CLOSING NOTE [2019-07-12 Fri 16:53]
  • When in this mode, it will execute the `export_model` method for all the experiments for all versions.
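
A rough sketch of what the export-mode loop could look like (the `experiments` objects and attribute names here are placeholders, not the actual mlpipeline API):

#+BEGIN_SRC python
def run_export_mode(experiments):
    # In export mode, every version of every experiment gets exported.
    for experiment in experiments:
        for version in experiment.versions:
            experiment.export_model(version)
#+END_SRC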

test mode and export mode try out all the versions instead of one [2/2]

in export mode

  • CLOSING NOTE [2019-03-27 Wed 14:45]

in test mode

  • CLOSING NOTE [2019-07-28 Sun 14:43]

mlpipeline sponsored breaks

  • CLOSING NOTE [2019-07-28 Sun 14:44]
  • The idea is to take away the need to comment and uncomment break statements.
  • Additionally, this can work on multiple levels: do you want to run a whole epoch? Pass a setting; if not, it'll break where it says so (a sketch follows this list).
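
A rough sketch of the idea, assuming a settings object the training loop consults; all names here are placeholders:

#+BEGIN_SRC python
class BreakSettings:
    """Hypothetical settings object replacing hand-edited break statements."""

    def __init__(self, max_epochs=None, max_steps=None):
        self.max_epochs = max_epochs  # None means "do not break at this level"
        self.max_steps = max_steps


def train(break_settings, num_epochs, steps_per_epoch):
    for epoch in range(num_epochs):
        # Epoch-level break: run whole epochs up to the configured limit.
        if break_settings.max_epochs is not None and epoch >= break_settings.max_epochs:
            break
        for step in range(steps_per_epoch):
            # Step-level break: useful for quick smoke runs.
            if break_settings.max_steps is not None and step >= break_settings.max_steps:
                break
            pass  # training step would go here


train(BreakSettings(max_epochs=1), num_epochs=100, steps_per_epoch=1000)
#+END_SRC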

git support

  • git is used not to track development, but to track the experiments.
  • The steps
    1. Check out an experiment branch. Is this necessary?
    2. Before a run, stage everything, commit the repo and store the hash. This is assuming the related files are all tracked.
      • Is it good practice to stage everything, or should we find a way to check all the files that are being loaded? I know I can track the files being imported; how about other files being accessed?
  • One way is to provide an API for opening files which will track them by itself (see the sketch after this list).
  • stackoverflow: check what files are open in Python
  • Another option is to look into using mlflow's version control interface
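
A possible sketch of both pieces: a tracked open() wrapper that records which files a run touches, and a helper that stages everything, commits, and returns the hash to store with the run (none of this is existing mlpipeline API):

#+BEGIN_SRC python
import subprocess

ACCESSED_FILES = set()


def tracked_open(path, mode="r", **kwargs):
    # Wrapper around open() that records every file a run touches.
    ACCESSED_FILES.add(path)
    return open(path, mode, **kwargs)


def commit_and_get_hash(message="mlpipeline: pre-run snapshot"):
    # Stage everything, commit, and return the commit hash to store with the run.
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "--allow-empty", "-m", message], check=True)
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()
#+END_SRC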

Move the pytorch base to this repo

  • CLOSING NOTE [2019-07-28 Sun 14:44]

Rethink the steps approach; think sklearn.pipeline or https://github.com/Neuraxio/Neuraxle

or maybe add a module for sklearn.pipeline?

Improve testing approaches?