Official repository for the paper *Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization* by Timofei Gritsaev, Nikita Morozov, Sergey Samsonov, and Daniil Tiapkin.
Generative Flow Networks (GFlowNets) are a family of generative models that learn to sample objects with probabilities proportional to a given reward function. The key concept behind GFlowNets is the use of two stochastic policies: a forward policy, which incrementally constructs compositional objects, and a backward policy, which sequentially deconstructs them. Recent results show a close relationship between GFlowNet training and entropy-regularized reinforcement learning (RL) problems with a particular reward design. However, this connection applies only in the setting of a fixed backward policy, which might be a significant limitation. As a remedy to this problem, we introduce a simple backward policy optimization algorithm that involves direct maximization of the value function in an entropy-regularized Markov Decision Process (MDP) over intermediate rewards. We provide an extensive experimental evaluation of the proposed approach across various benchmarks in combination with both RL and GFlowNet algorithms and demonstrate its faster convergence and mode discovery in complex environments.
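In brief, and in our own schematic notation rather than the paper's: given a terminating trajectory $\tau = (s_0 \to s_1 \to \dots \to s_n = x)$ sampled from the forward policy $P_F$, the backward policy $P_B$ is trained by maximizing the trajectory log-likelihood,

$$
\max_{P_B}\;\mathbb{E}_{\tau \sim P_F}\left[\sum_{t=1}^{n} \log P_B(s_{t-1} \mid s_t)\right],
$$

which can be read as maximizing, with respect to the intermediate rewards, the value function of the entropy-regularized MDP in which those rewards are the backward transition log-probabilities. See the paper for the precise statement.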
- Create the conda environment:
  ```bash
  conda create -n gflownet-tlm python=3.10
  conda activate gflownet-tlm
  ```
- Install dependencies for the bitseq and hypergrid experiments:
  ```bash
  pip install -r requirements.txt
  ```
- Install dependencies for the molecular experiments:
  ```bash
  pip install -e . --find-links https://data.pyg.org/whl/torch-2.1.2+cu121.html
  ```
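As an optional sanity check (our suggestion, not part of the original instructions; `torch_geometric` is only expected after the molecular dependencies are installed), verify that the core packages import in the activated environment:

```bash
# Optional: confirm that PyTorch imports in the activated environment,
# and (after the molecular setup) that torch_geometric does as well.
python -c "import torch; print(torch.__version__)"
python -c "import torch_geometric; print(torch_geometric.__version__)"
```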
The code for the hypergrid experiments is based on the open repository https://github.com/d-tiapkin/gflownet-rl and extensively uses the torchgfn library (https://github.com/GFNOrg/torchgfn).

Paths to configuration files (the configs use the ml-collections library):
- General configuration: `hypergrid/experiments/config/general.py`
- Algorithm: `hypergrid/experiments/config/algo.py`
- Environment: `hypergrid/experiments/config/hypergrid.py`
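These configs follow the standard ml-collections `get_config` pattern. The sketch below is only illustrative (the field names and the body of `get_config` are our assumptions, not code from the repository); it shows how the `:3`, `:standard`, and `:munchausen_dqn` suffixes and the `--algo.backward_approach` override in the run commands below are typically consumed.

```python
# Illustrative ml-collections config; field names here are hypothetical.
from ml_collections import ConfigDict


def get_config(config_string: str) -> ConfigDict:
    config = ConfigDict()
    # The part after ':' in a flag such as
    # `--algo experiments/config/algo.py:munchausen_dqn`
    # arrives here as `config_string`.
    config.algo_name = config_string
    config.backward_approach = "uniform"  # overridable as --algo.backward_approach
    return config
```

Any `ConfigDict` field can then be overridden from the command line in dotted form, which is how `--algo.backward_approach tlm` works in the examples below.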
Available options:
- List of available algorithms: `db`, `tb`, `subtb`, `soft_dqn`, and `munchausen_dqn`;
- List of available backward approaches: `uniform`, `naive`, `maxent`, and `tlm`.
Run the experiment from the `hypergrid/` directory with `standard` rewards, seed `3`, algorithm `munchausen_dqn`, and backward approach `tlm`:
```bash
python run_hypergrid_exp.py --general experiments/config/general.py:3 --env experiments/config/hypergrid.py:standard --algo experiments/config/algo.py:munchausen_dqn --algo.backward_approach tlm
```
To train with the `uniform` backward policy in the same setting:
```bash
python run_hypergrid_exp.py --general experiments/config/general.py:3 --env experiments/config/hypergrid.py:standard --algo experiments/config/algo.py:munchausen_dqn --algo.backward_approach uniform
```
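The other algorithms and backward approaches listed above drop into the same command. For example, a small sweep over backward approaches (a convenience sketch using only the flags and values shown in this README; it does not reproduce any particular experiment from the paper):

```bash
# Sweep all listed backward approaches for munchausen_dqn on the standard reward.
for approach in uniform naive maxent tlm; do
  python run_hypergrid_exp.py \
    --general experiments/config/general.py:3 \
    --env experiments/config/hypergrid.py:standard \
    --algo experiments/config/algo.py:munchausen_dqn \
    --algo.backward_approach "$approach"
done
```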
The code for the bit sequence environment is based on the open repository https://github.com/d-tiapkin/gflownet-rl. To control hyperparameters, see `bitseq/run.py`.

Examples of running the `db` objective from the `bitseq/` directory with learning rate `0.002` and varying backward approaches:
```bash
python bitseq/run.py --objective db --learning_rate 0.002 --backward_approach tlm
python bitseq/run.py --objective db --learning_rate 0.002 --backward_approach uniform
python bitseq/run.py --objective db --learning_rate 0.002 --backward_approach naive
```
The experiments with molecular environments leverage the existing codebase for molecule generation using GFlowNets (https://github.com/recursionpharma/gflownet), which is distributed under the MIT license.
To control hyperparameters, see `mols/tasks/qm9.py` and `mols/tasks/seh_frag.py`.
Available options:
- List of available algorithms: `db`, `tb`, `subtb`, and `dqn`;
- List of available backward approaches: `uniform`, `naive`, `maxent`, and `tlm`.
Run from the root directory to reproduce the QM9 experiment with Munchausen DQN (`dqn`), learning rate `5e-4`, and backward approach `tlm`:
```bash
python -m mols.tasks.qm9 --seed 1 --algo dqn --lr 5e-4 --backward_approach tlm
```
To run the sEH experiment with the same hyperparameters:
```bash
python -m mols.tasks.seh_frag --seed 1 --algo dqn --lr 5e-4 --backward_approach tlm
```
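The GFlowNet objectives from the algorithm list above run through the same entry points. For instance, an illustrative command built from the flags shown above (the hyperparameter values here are not claimed to match the paper's settings):

```bash
# QM9 with the trajectory balance objective and the TLM backward approach.
python -m mols.tasks.qm9 --seed 1 --algo tb --lr 5e-4 --backward_approach tlm
```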
Please cite our article if you find it helpful in your work:
```bibtex
@article{gritsaev2024optimizing,
  title={Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization},
  author={Gritsaev, Timofei and Morozov, Nikita and Samsonov, Sergey and Tiapkin, Daniil},
  journal={arXiv preprint arXiv:2410.15474},
  year={2024}
}
```