Named after the RL "advantage" function, advantage is a TensorFlow-based Reinforcement Learning framework. It allows various RL algorithms, with both discrete (e.g. Atari games) and continuous (e.g. robotics) action-space models, to be deployed easily and with a small amount of code. advantage is compatible with OpenAI Gym: users can develop simulators with Gym and then, using nothing more than configuration files, train their models in those simulators. Trained models can then be easily deployed as TensorFlow protobufs. advantage's goal is to implement the common paradigms of Reinforcement Learning so that code is reused across model implementations, allowing new models to be added to the framework with ease.
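Since simulators are plain Gym environments, any environment following the classic Gym API works. The class below is a hypothetical minimal example (its name and dynamics are made up, and wiring it into advantage, e.g. via the configuration file, is not shown):

```python
import gym
import numpy as np
from gym import spaces

class ToySimulator(gym.Env):
    """Hypothetical minimal simulator using the classic Gym API."""

    def __init__(self):
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(
            low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        self._state = np.zeros(4, dtype=np.float32)

    def reset(self):
        self._state = np.zeros(4, dtype=np.float32)
        return self._state

    def step(self, action):
        # Toy dynamics: shift the state and penalize distance from zero.
        delta = 0.1 if action == 1 else -0.1
        self._state = np.clip(self._state + delta, -1.0, 1.0)
        reward = -float(np.abs(self._state).sum())
        done = bool(np.abs(self._state).max() >= 1.0)
        return self._state, reward, done, {}
```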
Will be added to PyPI shortly
Currently supports:
- Deep Q-Network (DQN)
In the Works:
- Moving from protobufs to gin-config
- PPO, A3C, EPG, N-step Q, Soft-AC, and distributed training
Planned additions:
- Value-based:
  - C51, Implicit Quantile Agents
- Multi-agent
These are the tested dependencies, although higher versions will probably work:
```
tensorflow==1.10.0
gym
python 3.5.2
```
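Assuming pip is available, the two Python dependencies can be installed directly (the package itself is not on PyPI yet, so only its dependencies are installed this way):

```sh
pip install tensorflow==1.10.0 gym
```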
```sh
git clone https://github.com/oneTimePad/advantage.git
export PYTHONPATH=$PYTHONPATH:{path_to_advantage_package}

# Build protobufs
{path_to_advantage}/scripts/build_protos.sh
```
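A quick sanity check that the package is importable after setting `PYTHONPATH` and building the protobufs:

```sh
python -c "import advantage"
```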
```python
import advantage as adv

agent = adv.make("{path_to}/samples/dqn.config")
agent.train()
```
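As noted above, trained models can be deployed as TensorFlow protobufs. The sketch below uses standard TensorFlow 1.x graph loading; the file path and tensor names are hypothetical, and advantage's actual export layout may differ:

```python
import numpy as np
import tensorflow as tf

# Hypothetical exported graph and tensor names; inspect the real
# exported graph to find the actual ones.
GRAPH_PB = "{path_to}/exported_model.pb"

graph_def = tf.GraphDef()
with tf.gfile.GFile(GRAPH_PB, "rb") as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    state = graph.get_tensor_by_name("state:0")
    q_values = graph.get_tensor_by_name("q_values:0")
    print(sess.run(q_values, feed_dict={state: np.zeros((1, 4))}))
```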
For inference, the context manager `infer` opens an inference session:
```python
with agent.infer() as infer:
    env = infer.env
    for _ in infer.run_trajectory(run_through=False):
        env.render()
```
Open with `.reuse()` to get a reusable inference session that isn't closed on exit:
```python
infer_session = agent.infer()

with infer_session.reuse() as infer:
    env = infer.env
    for _ in infer.run_trajectory(run_through=False):
        env.render()
```
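Because the reusable session stays open after the `with` block exits, the same handle can be entered repeatedly, e.g. to render several trajectories. A sketch built only on the calls shown above:

```python
infer_session = agent.infer()

# The session survives each `with` block, so it is opened only once.
for _ in range(3):
    with infer_session.reuse() as infer:
        for _ in infer.run_trajectory(run_through=False):
            infer.env.render()
```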
If there are any problems with the learning algorithms, please open an issue.