This repository is the official implementation of the Distilled Graph Attention Policy Network (DGAPN) from the paper Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery (matching branch iclr). The implementation of the Spatial Graph Attention Network (sGAT) submodule can be found here.
git clone https://github.com/yulun-rayn/DGAPN
cd DGAPN
git submodule update --init --recursive
conda config --append channels conda-forge
conda create -n dgapn-env --file requirements.txt
conda activate dgapn-env
pip install -e sGAT
pip install crem==0.2.5
* make sure to install the right versions for your toolkit
To evaluate molecular docking scores, the docking program AutoDock-GPU 1.5.3 (guideline) and Open Babel 3.1.1 (guideline) need to be installed. After installation, change ADT_PATH and OBABEL_PATH in the score function to the corresponding executable paths on your system.
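Before launching a docking run, it can help to confirm that both executables are actually reachable. The following is a minimal sketch, not part of this repository: the helper names `resolve_executable` and `check_docking_tools` are hypothetical, and the paths passed in are placeholders for whatever you set as ADT_PATH and OBABEL_PATH.

```python
import os
import shutil


def resolve_executable(path_or_name):
    """Return an absolute path to an executable, or None if it cannot be run."""
    # shutil.which accepts a bare command name (searched on PATH) or an
    # explicit path, and only returns entries that are executable files.
    found = shutil.which(path_or_name)
    return os.path.abspath(found) if found else None


def check_docking_tools(adt_path, obabel_path):
    """Map each setting name to its resolved path (None when missing)."""
    return {
        "ADT_PATH": resolve_executable(adt_path),
        "OBABEL_PATH": resolve_executable(obabel_path),
    }


if __name__ == "__main__":
    # Placeholder values (assumptions): replace with the paths on your system.
    status = check_docking_tools("/path/to/autodock_gpu_executable", "obabel")
    for name, resolved in status.items():
        print(f"{name}: {resolved or 'NOT FOUND'}")
```

Any entry reported as NOT FOUND points to a path that still needs to be corrected in the score function.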
The provided resources are for docking in the catalytic site of NSP15. To dock against a different protein, several input receptor files need to be generated; see the sub-directory for more details.
Once the conda environment is set up, train DGAPN with:
./main_train.sh &
A list of flags can be found in main_train.sh and main_train.py for experimenting with different network and training parameters. Use --reward_type dock only if the docking software has been set up, and use a different --run_id for each task if multiple docking tasks are running at the same time. The run log, models and generated molecules are saved under *artifact_path*/saves; the TensorBoard log is saved under *artifact_path*/runs.
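The requirement of a distinct --run_id per concurrent docking task can be sketched as follows. This is an illustration under stated assumptions: it assumes main_train.py can be invoked directly with `python` and that integer run ids are accepted, neither of which is confirmed here. The helper `build_train_commands` is hypothetical and only builds the command lines without running them.

```python
import shlex


def build_train_commands(n_tasks, reward_type="dock", first_run_id=0):
    """Build one command per task, each with its own --run_id."""
    commands = []
    for offset in range(n_tasks):
        commands.append([
            "python", "main_train.py",            # assumed entry point
            "--reward_type", reward_type,          # flag named in this README
            "--run_id", str(first_run_id + offset),  # unique id per task
        ])
    return commands


# Print the commands so they can be reviewed before being launched.
for cmd in build_train_commands(3):
    print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run on a machine where the docking software is not yet configured.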
If you wish to produce a pre-trained graph embedding model for DGAPN training, or just want to try out supervised learning with the spatial graph attention network, check out sGAT for the submodule instructions (installation steps can be skipped if a DGAPN environment is already established).
After training a model, use main_evaluate.sh to produce and evaluate molecules. The flag --model_path should be modified to point to a trained DGAPN model.
./main_evaluate.sh
Generated molecules are saved under *artifact_path*/*name* as a CSV file, where each line contains a molecule's SMILES string and its associated score.
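To inspect the generated molecules, the CSV can be loaded and ranked by score. The sketch below is illustrative only: it assumes the SMILES string is the first column and the score is the last column, and it skips any row whose score is not numeric (such as a possible header). The function names and the sample data are made up for the example, and whether lower or higher scores are better depends on how the score is defined for your reward type.

```python
import csv
import io


def load_scored_molecules(handle):
    """Parse rows of (SMILES, score), skipping rows without a numeric score."""
    molecules = []
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # blank or malformed line
        try:
            score = float(row[-1])
        except ValueError:
            continue  # e.g. a header row
        molecules.append((row[0], score))
    return molecules


def top_molecules(molecules, k=5, lower_is_better=True):
    """Return the k best molecules under the chosen score direction."""
    ranked = sorted(molecules, key=lambda m: m[1], reverse=not lower_is_better)
    return ranked[:k]


# Made-up sample standing in for a generated file.
sample = io.StringIO("smiles,score\nCCO,-3.2\nc1ccccc1,-4.8\nCC(=O)O,-2.9\n")
best = top_molecules(load_scored_molecules(sample), k=2)
print(best)  # [('c1ccccc1', -4.8), ('CCO', -3.2)]
```

For a real file, replace the `io.StringIO` sample with `open(path, newline="")`, where `path` is the CSV under your artifact path.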
A DGAPN model trained on docking reward and samples of molecules generated in evaluation can be found here.
Contributions are welcome! All content here is licensed under the MIT license.