DeepDriveMD: Coupling streaming AI and HPC ensembles to achieve 100-1000× faster biomolecular simulations
DeepDriveMD implemented using Colmena.
This implementation of DeepDriveMD enables ML/AI-coupled simulations using three primary components:
Simulation: Simulations are used to explore possible trajectories of a protein or other biomolecular system.
Training: Aggregated trajectories are used to train one or more ML models.
Inference: Trained ML models are used to identify conformations for subsequent iterations of simulations.
A Thinker process orchestrates these components to advance the workflow toward an optimization objective.
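To make the interplay of the three components concrete, here is a toy, purely illustrative sketch of the loop. It is not the DeepDriveMD implementation: the "simulation" is a 1-D random walk, the "model" is just a mean, and the steps run sequentially, whereas the real workflow runs them as coupled tasks. All function names here are hypothetical.

```python
import random
import statistics
from typing import List


def simulate(start: float, steps: int, rng: random.Random) -> List[float]:
    """Toy 'simulation': a 1-D random walk standing in for an MD trajectory."""
    frames, x = [], start
    for _ in range(steps):
        x += rng.gauss(0.0, 1.0)
        frames.append(x)
    return frames


def train(frames: List[float]) -> float:
    """Toy 'training': the model is simply the mean of all frames seen so far."""
    return statistics.fmean(frames)


def infer(model: float, frames: List[float], n_outliers: int) -> List[float]:
    """Toy 'inference': select the frames farthest from the model as outliers."""
    return sorted(frames, key=lambda f: abs(f - model), reverse=True)[:n_outliers]


def run_loop(iterations: int, n_sims: int = 4, steps: int = 50, seed: int = 0) -> List[float]:
    """Toy 'Thinker': restart simulations from the outliers of the previous round."""
    rng = random.Random(seed)
    starts = [0.0] * n_sims
    all_frames: List[float] = []
    for _ in range(iterations):
        for start in starts:
            all_frames.extend(simulate(start, steps, rng))
        model = train(all_frames)
        starts = infer(model, all_frames, n_sims)
    return all_frames
```

Calling `run_loop(3)` runs three rounds of four walks each; restarting from outliers pushes the walkers toward less-visited regions, which is the intuition behind the adaptive sampling loop.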
Create a conda environment:
conda create -n deepdrivemd python=3.9 -y
conda activate deepdrivemd
To install OpenMM for simulations:
conda install -c conda-forge gcc=12.1.0 -y
conda install -c conda-forge openmm -y
To install deepdrivemd:
git clone https://github.com/ramanathanlab/deepdrivemd.git
cd deepdrivemd
make install
The workflow can be tested on a workstation (a system with a few GPUs) via:
python -m deepdrivemd.workflows.openmm_cvae -c tests/apps-enabled-workstation/test.yaml
This will generate an output directory for the run with logs, results, and task-specific output folders.
Each test writes a timestamped experiment output directory to the runs/ directory.
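When several tests have been run, it can be handy to locate the most recent experiment directory programmatically. The sketch below is a convenience helper, not part of DeepDriveMD. It assumes that the suffix in a name such as experiment-170323-091525 is a DDMMYY-HHMMSS timestamp; that format is inferred from the example name and is not documented here.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

# Assumption: the directory suffix is DDMMYY-HHMMSS (inferred, not documented).
TIMESTAMP_FORMAT = "%d%m%y-%H%M%S"
PREFIX = "experiment-"


def experiment_time(run_dir: Path) -> Optional[datetime]:
    """Parse the timestamp from an experiment directory name, or return None."""
    if not run_dir.name.startswith(PREFIX):
        return None
    try:
        return datetime.strptime(run_dir.name[len(PREFIX):], TIMESTAMP_FORMAT)
    except ValueError:
        return None


def latest_experiment(runs_dir: Path) -> Optional[Path]:
    """Return the most recent experiment directory under runs_dir, if any."""
    latest: Optional[Path] = None
    latest_time: Optional[datetime] = None
    for path in runs_dir.iterdir():
        stamp = experiment_time(path) if path.is_dir() else None
        if stamp is not None and (latest_time is None or stamp > latest_time):
            latest, latest_time = path, stamp
    return latest
```

Parsing the timestamp matters under this assumption: with a day-first format, sorting the directory names alphabetically would not give chronological order.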
Inside the output directory, you will find:
$ ls runs/experiment-170323-091525/
inference params.yaml result run-info runtime.log simulation train
params.yaml: the full configuration file (default parameters included)
runtime.log: the workflow log
result: a directory containing the JSON files simulation.json, train.json, and inference.json, which log task results including success or failure, potential error messages, and runtime statistics. This can be helpful for debugging application-level failures.
simulation, train, inference: output directories, each containing a run-<uuid> subdirectory for each submitted task. This is where the output files of your simulations, preprocessed data, model weights, etc. will be written by your applications (it corresponds to the application workdir).
run-info: Parsl logs
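As a debugging aid, the result logs can be tallied with a few lines of Python. The following is a minimal sketch under two assumptions that are not confirmed by this README: that each result file holds one JSON object per line, and that each object carries a boolean field named `success`. Adjust `success_key` or the parsing if the files are laid out differently.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_results(result_file: Path, success_key: str = "success") -> Counter:
    """Count succeeded, failed, and unknown task records in one result log.

    Assumptions (hypothetical schema): one JSON object per line, each with a
    boolean field named by success_key.
    """
    counts: Counter = Counter()
    with result_file.open() as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # Skip blank lines
            status = json.loads(line).get(success_key)
            if status is True:
                counts["succeeded"] += 1
            elif status is False:
                counts["failed"] += 1
            else:
                counts["unknown"] += 1
    return counts
```

For example, `summarize_results(Path("runs/experiment-170323-091525/result/simulation.json"))` would return the counts for the simulation tasks of that run.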
For example, a simulation run directory may look like:
$ ls runs/experiment-170323-091525/simulation/run-08843adb-65e1-47f0-b0f8-34821aa45923
1FME-unfolded.pdb contact_map.npy input.yaml output.yaml rmsd.npy sim.dcd sim.log
1FME-unfolded.pdb: the PDB file used to start the simulation
contact_map.npy, rmsd.npy: the preprocessed data files that are input to the train and inference tasks
input.yaml, output.yaml: logs of the task function's input and return values; they are helpful for debugging but not strictly necessary
sim.dcd: the simulation trajectory file containing all the coordinate frames
sim.log: a simulation log detailing the energy, steps taken, ns/day, etc.
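A quick way to spot simulation tasks that stopped early is to check each run directory for the files listed above. This is a hypothetical helper, not part of DeepDriveMD; the set of expected file names is taken from the example listing and may differ for other applications.

```python
from pathlib import Path
from typing import Dict, List

# File names taken from the example run directory; adjust for other applications.
EXPECTED_OUTPUTS = ("contact_map.npy", "rmsd.npy", "sim.dcd", "sim.log")


def find_incomplete_runs(simulation_dir: Path) -> Dict[str, List[str]]:
    """Map each run-* subdirectory to the expected output files it lacks."""
    incomplete: Dict[str, List[str]] = {}
    for run_dir in sorted(simulation_dir.glob("run-*")):
        if not run_dir.is_dir():
            continue
        missing = [name for name in EXPECTED_OUTPUTS if not (run_dir / name).exists()]
        if missing:
            incomplete[run_dir.name] = missing
    return incomplete
```

An empty dictionary from `find_incomplete_runs(Path("runs/experiment-170323-091525/simulation"))` would mean every run directory contains all four files.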
By default, the runs/ directory is ignored by git.
Production runs can be configured and run analogously. See examples/bba-folding-workstation/ for a detailed example of folding the 1FME protein. The YAML files document the configuration settings and explain the use case.
Implement a DeepDriveMD workflow with custom MD simulation engines and AI training/inference methods by inheriting from the DeepDriveMDWorkflow interface. This workflow implements the examples/bba-folding-workstation/ example:
import time
from pathlib import Path
from queue import Queue
from typing import Any

from deepdrivemd.api import DeepDriveMDWorkflow

# The application input/output types used below (MDSimulationInput,
# MDSimulationOutput, CVAETrainInput, CVAETrainOutput, CVAEInferenceInput,
# CVAEInferenceOutput) also come from the deepdrivemd package; their
# imports are omitted from this excerpt.


class DeepDriveMD_OpenMM_CVAE(DeepDriveMDWorkflow):
    def __init__(
        self, simulations_per_train: int, simulations_per_inference: int, **kwargs: Any
    ) -> None:
        super().__init__(**kwargs)
        self.simulations_per_train = simulations_per_train
        self.simulations_per_inference = simulations_per_inference

        # Make sure there has been at least one training task
        # complete before running inference
        self.model_weights_available: bool = False

        # For batching training/inference inputs
        self.train_input = CVAETrainInput(contact_map_paths=[], rmsd_paths=[])
        self.inference_input = CVAEInferenceInput(
            contact_map_paths=[], rmsd_paths=[], model_weight_path=Path()
        )

        # Communicate results between agents
        self.simulation_input_queue: Queue[MDSimulationInput] = Queue()

    def simulate(self) -> None:
        """Submit either a new outlier to simulate, or a starting conformer."""
        with self.simulation_govenor:
            if not self.simulation_input_queue.empty():
                inputs = self.simulation_input_queue.get()
            else:
                inputs = MDSimulationInput(sim_dir=next(self.simulation_input_dirs))

        self.submit_task("simulation", inputs)

    def train(self) -> None:
        """Submit a new training task."""
        self.submit_task("train", self.train_input)

    def inference(self) -> None:
        """Submit a new inference task once model weights are available."""
        while not self.model_weights_available:
            time.sleep(1)

        self.submit_task("inference", self.inference_input)

    def handle_simulation_output(self, output: MDSimulationOutput) -> None:
        """When a simulation finishes, decide to train a new model or infer outliers."""
        # Collect simulation results
        self.train_input.append(output.contact_map_path, output.rmsd_path)
        self.inference_input.append(output.contact_map_path, output.rmsd_path)

        # Signal train/inference tasks
        num_sims = len(self.train_input)
        if num_sims % self.simulations_per_train == 0:
            self.run_training.set()

        if num_sims % self.simulations_per_inference == 0:
            self.run_inference.set()

    def handle_train_output(self, output: CVAETrainOutput) -> None:
        """When training finishes, update the model weights to use for inference."""
        self.inference_input.model_weight_path = output.model_weight_path
        self.model_weights_available = True

    def handle_inference_output(self, output: CVAEInferenceOutput) -> None:
        """When inference finishes, update the simulation queue with the latest outliers."""
        with self.simulation_govenor:
            self.simulation_input_queue.queue.clear()  # Remove old outliers
            for sim_dir, sim_frame in zip(output.sim_dirs, output.sim_frames):
                self.simulation_input_queue.put(
                    MDSimulationInput(sim_dir=sim_dir, sim_frame=sim_frame)
                )
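The signaling logic in handle_simulation_output can be tried in isolation. The toy class below mirrors only the modulo checks and uses no DeepDriveMD code. It assumes that run_training and run_inference behave like `threading.Event` objects, which matches the `.set()` calls above but is otherwise an assumption; `BatchSignaler` itself is a hypothetical name.

```python
import threading
from typing import List


class BatchSignaler:
    """Toy stand-in for the modulo-based signaling in handle_simulation_output."""

    def __init__(self, simulations_per_train: int, simulations_per_inference: int) -> None:
        self.simulations_per_train = simulations_per_train
        self.simulations_per_inference = simulations_per_inference
        self.completed: List[str] = []
        # Assumption: the workflow's signals behave like threading.Event
        self.run_training = threading.Event()
        self.run_inference = threading.Event()

    def handle_simulation_output(self, result: str) -> None:
        """Record one finished simulation and signal training/inference if due."""
        self.completed.append(result)
        num_sims = len(self.completed)
        if num_sims % self.simulations_per_train == 0:
            self.run_training.set()
        if num_sims % self.simulations_per_inference == 0:
            self.run_inference.set()
```

With `simulations_per_train=4` and `simulations_per_inference=2`, inference is signaled after every second finished simulation and training after every fourth, so inference refreshes the outlier queue more often than the model is retrained.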
Please report bugs, enhancement requests, or questions through the Issue Tracker.
If you are looking to contribute, please see CONTRIBUTING.md.
DeepDriveMD has an MIT license, as seen in the LICENSE.md file.
If you use DeepDriveMD in your research, please cite this paper:
@inproceedings{brace2022coupling,
title={Coupling streaming ai and hpc ensembles to achieve 100--1000$\times$ faster biomolecular simulations},
author={Brace, Alexander and Yakushin, Igor and Ma, Heng and Trifan, Anda and Munson, Todd and Foster, Ian and Ramanathan, Arvind and Lee, Hyungro and Turilli, Matteo and Jha, Shantenu},
booktitle={2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)},
pages={806--816},
year={2022},
organization={IEEE}
}