This repository contains the source code for our SIGCOMM'22 paper "Multi-Resource Interleaving for Deep Learning Training".
- simulator/ contains the simulation code, adapted from Tiresias. Please refer to <repo>/simulator/README.md for detailed information.
- cluster_exp/ contains the code for the real-cluster experiments. Please refer to <repo>/cluster_exp/README.md for detailed information.
Note: Because the execution scripts for the testbed experiments are tightly coupled to an intracompany platform, we only demonstrate their functionality and provide pseudocode for the related scripts (e.g., run.sh, prepare_env.sh). Please adapt them to your platform if you would like to run the testbed experiments.
For any questions, please contact zhaoyh98 at pku dot edu dot cn.