Named after the RL "advantage" function, advantage is a TensorFlow-based Reinforcement Learning framework. It allows various RL algorithms, with both discrete (e.g. Atari games) and continuous (e.g. robotics) action-space models, to be deployed easily and with a small amount of code. advantage is compatible with OpenAI Gym: users can develop simulators with Gym and then, using nothing more than configuration files, train their models in those simulators. Trained models can then be easily deployed as TensorFlow protobufs. advantage's goal is to implement the common paradigms of Reinforcement Learning so that code is reused across model implementations, allowing new models to be added to the framework with ease.
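Since simulators are plain Gym environments, any environment following the classic Gym API works. The class below is a hypothetical minimal example (its name and dynamics are made up, and wiring it into advantage, e.g. via the configuration file, is not shown):

```python
import gym
import numpy as np
from gym import spaces

class ToySimulator(gym.Env):
    """Hypothetical minimal simulator using the classic Gym API."""

    def __init__(self):
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(
            low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        self._state = np.zeros(4, dtype=np.float32)

    def reset(self):
        self._state = np.zeros(4, dtype=np.float32)
        return self._state

    def step(self, action):
        # Toy dynamics: shift the state and penalize distance from zero.
        delta = 0.1 if action == 1 else -0.1
        self._state = np.clip(self._state + delta, -1.0, 1.0)
        reward = -float(np.abs(self._state).sum())
        done = bool(np.abs(self._state).max() >= 1.0)
        return self._state, reward, done, {}
```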
Will be added to PyPI shortly
Currently supports:
- Deep Q-Network (DQN)
In the Works:
- Moving from protobufs to gin-config
- PPO, A3C, EPG, N-step Q, Soft-AC, and distributed training
Planned additions:
- Value-based:
  - C51, Implicit Quantile Agents
- Multi-agent
These are the tested dependencies, although higher versions will probably work:
```
tensorflow==1.10.0
gym
python 3.5.2
```
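Assuming pip is available, the two Python dependencies can be installed directly (the package itself is not on PyPI yet, so only its dependencies are installed this way):

```sh
pip install tensorflow==1.10.0 gym
```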
```sh
git clone https://github.com/oneTimePad/advantage.git
export PYTHONPATH=$PYTHONPATH:{path_to_advantage_package}

# Build protobufs
{path_to_advantage}/scripts/build_protos.sh
```
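A quick sanity check that the package is importable after setting `PYTHONPATH` and building the protobufs:

```sh
python -c "import advantage"
```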
```python
import advantage as adv

agent = adv.make("{path_to}/samples/dqn.config")
agent.train()
```
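As noted above, trained models can be deployed as TensorFlow protobufs. The sketch below uses standard TensorFlow 1.x graph loading; the file path and tensor names are hypothetical, and advantage's actual export layout may differ:

```python
import numpy as np
import tensorflow as tf

# Hypothetical exported graph and tensor names; inspect the real
# exported graph to find the actual ones.
GRAPH_PB = "{path_to}/exported_model.pb"

graph_def = tf.GraphDef()
with tf.gfile.GFile(GRAPH_PB, "rb") as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    state = graph.get_tensor_by_name("state:0")
    q_values = graph.get_tensor_by_name("q_values:0")
    print(sess.run(q_values, feed_dict={state: np.zeros((1, 4))}))
```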
For inference, the context manager `infer` opens an inference session:
```python
with agent.infer() as infer:
    env = infer.env
    for _ in infer.run_trajectory(run_through=False):
        env.render()
```
Open with `.reuse()` to get a reusable inference session that isn't closed on exit:
```python
infer_session = agent.infer()

with infer_session.reuse() as infer:
    env = infer.env
    for _ in infer.run_trajectory(run_through=False):
        env.render()
```
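Because the reusable session stays open after the `with` block exits, the same handle can be entered repeatedly, e.g. to render several trajectories. A sketch built only on the calls shown above:

```python
infer_session = agent.infer()

# The session survives each `with` block, so it is opened only once.
for _ in range(3):
    with infer_session.reuse() as infer:
        for _ in infer.run_trajectory(run_through=False):
            infer.env.render()
```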
If there are any problems with the learning algorithms, please open an issue.