Grammar Variational Autoencoder

This repository contains training and sampling code for the paper: Grammar Variational Autoencoder.

Creating datasets

To create the molecule datasets, call in this directory:

unzip ../rockyou-processed.zip -d data/
python make_zinc_dataset_grammar.py

Training

To train the models, call:

python train_zinc.py % the grammar model
python train_zinc.py --latent_dim=2 --epochs=50` % train a model with a 2D latent space and 50 epochs

Sampling

The file molecule_vae.py can be used to encode and decode SMILES strings. For a demo run:

python encode_decode_zinc.py
python encode_decode_zinc.py --latent_dim=2 --epochs=50 --num_samples=200 % sample 200 random strings from the model trained with a 2D latent space and 50 epochs

Other details

Model is defined in models/model_zinc.py. Modify _buildEncoder and _encoderMeanVar in a similar manner to change the model i.e. add same additonal layers to both functions.
The folder results/ contains some trained models which can be samples using the above commands. Note that the arguments need to be fed appropriately to the files.
Some samples already generated from the modles are in samples/.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Grammar Variational Autoencoder

Creating datasets

Training

Sampling

Other details

Files

README.md

Latest commit

History

README.md

File metadata and controls

Grammar Variational Autoencoder

Creating datasets

Training

Sampling

Other details