Roadmap

This document defines a high-level roadmap for arena development.

2018

Features

Enhance Training Job
- Move to MPI-operator
- Set default CPU/memory limits according to the type of training: tf-operator, MPI-operator
- Support Gang Scheduler
- Support pytorch-operator
Training History Management
- Use CRD to manage the training history
Integrate with data
Multi-tenancy
Easy install

Stability/Reliability

End-to-end testing
Unit tests
Build arena Docker images automatically