Non-exhaustive overview of classic Model-Free Deep Reinforcement Learning algorithms.

eliotwalt/drl

drl

This repository contains implementations of several classic Model-Free Deep Reinforcement Learning algorithms, built with PyTorch and OpenAI Gym environments. Currently, only environments with vector-valued states, such as LunarLander-v2, are supported.

Structure

The repository is structured as follows:

  • executor.py: Training execution script
  • utils.py: Helper functions for execution
  • config.py: Helper functions for configuration
  • defaults/configs/: Default JSON configuration files for each algorithm
  • drl/: Main package's directory
    • models.py: PyTorch models
    • trainers.py: Training execution classes
    • off_policy/: Off-policy algorithms
      • agents.py: Off-policy agents
      • replay_buffer.py: Replay buffer implementation
    • on_policy/: On-policy algorithms
      • agents.py: On-policy agents
      • envs.py: Multiprocessing environments
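
To illustrate what a replay buffer such as the one in drl/off_policy/replay_buffer.py typically does, here is a minimal sketch. It is not the repository's implementation: the class name, method names, and transition layout are assumptions chosen for illustration, and it uses only the standard library.

```python
import random
from collections import deque


class ReplayBuffer:
    """Hypothetical fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition once full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition observed by the agent.
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw transitions uniformly at random, without replacement,
        # then regroup them field by field.
        batch = random.sample(list(self.memory), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.memory)
```

Off-policy agents usually sample from such a buffer rather than learning from consecutive steps, which reduces the correlation between the transitions in a training batch.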

Algorithms

Currently implemented algorithms are:

  • Deep Q-Learning: drl.off_policy.agents.DQLAgent [paper]
  • Deep Q-Network: drl.off_policy.agents.DQNAgent [paper]
  • Double Deep Q-Network: drl.off_policy.agents.DDQNAgent [paper]
  • Reinforce: drl.on_policy.agents.ReinforceAgent [paper]
  • Actor Critic: drl.on_policy.agents.ActorCriticAgent [book]
  • Advantage Actor Critic (A2C) with parallel environments: drl.on_policy.agents.A2CAgent [paper]
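
As a reminder of how two of the listed value-based methods differ, the sketch below computes the bootstrapped target for a single transition in the style of a Deep Q-Network and of a Double Deep Q-Network. It follows the standard formulation from the literature, not the code of DQNAgent or DDQNAgent. The function names and the use of plain Python lists for Q-values are assumptions made for illustration.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """DQN-style target: the target network both selects and evaluates the next action."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN-style target: the online network selects the action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Q-values of the next state under each network (made-up numbers).
q_online = [3.0, 1.0]
q_target = [0.5, 2.0]
print(dqn_target(1.0, q_target, done=False, gamma=0.5))                   # 2.0
print(double_dqn_target(1.0, q_online, q_target, done=False, gamma=0.5))  # 1.25
```

Decoupling action selection from action evaluation is what lets Double DQN reduce the overestimation that comes from taking a maximum over noisy Q-values.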

Usage

todo: requirements

Execution is handled by executor.py.

usage: executor.py [-h] -c CFG

optional arguments:
  -h, --help         show this help message and exit
  -c CFG, --cfg CFG  path to config file
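
An interface with this shape can be produced with the standard argparse module. The following sketch reproduces the flag shown above. It is an assumption about how such a parser could be written, not a copy of executor.py, and build_parser is a hypothetical helper name.

```python
import argparse


def build_parser():
    # Hypothetical helper: a parser with a single required -c/--cfg option.
    parser = argparse.ArgumentParser(prog="executor.py")
    parser.add_argument("-c", "--cfg", required=True, help="path to config file")
    return parser


args = build_parser().parse_args(["-c", "defaults/configs/lunarlander-v2.json"])
print(args.cfg)
```

Marking the option as required is what makes it appear without square brackets in the usage line. On Python 3.10 and later, argparse prints the section header as "options:" instead of "optional arguments:".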

The agent, trainer and environments are created from a JSON configuration file passed as an argument. For reference, see defaults/configs/. For instance, training an A2C agent using defaults/configs/lunarlander-v2.json is achieved by:

python executor.py -c defaults/configs/lunarlander-v2.json
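
The general idea of building objects from a JSON configuration can be sketched as follows. The schema used here (an "agent" entry with "name" and "params" keys), the registry, and the stand-in agent class are all hypothetical. The actual layout of the files in defaults/configs/ may differ and should be checked there.

```python
import json


class A2CAgent:
    """Hypothetical stand-in for an agent class; it only stores its keyword arguments."""

    def __init__(self, **params):
        self.params = params


# Hypothetical mapping from names used in the config to classes.
AGENT_REGISTRY = {"A2CAgent": A2CAgent}


def load_config(path):
    # Read the JSON configuration file into a plain dict.
    with open(path) as f:
        return json.load(f)


def build_agent(cfg):
    # Look up the class by name and pass the remaining settings as keyword arguments.
    agent_cfg = cfg["agent"]
    agent_cls = AGENT_REGISTRY[agent_cfg["name"]]
    return agent_cls(**agent_cfg.get("params", {}))


# Assumed schema, for illustration only.
example_cfg = {"agent": {"name": "A2CAgent", "params": {"gamma": 0.99}}}
agent = build_agent(example_cfg)
print(type(agent).__name__, agent.params)
```

Keeping the name-to-class lookup in one place means a new algorithm only needs a registry entry and a configuration file, with no change to the execution script.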

Todo

  • Fix Dueling architectures
  • Continuous action space algorithms: ddpg, ppo, sac, ...
  • Streamline API
