jun-bun/rab-tf-agents

 
 

Diagram

A distributed system for Reinforcement Learning / Training

Supports multiple environments, each of which writes Trajectory records into a replay buffer.
The base model is a DQN.
Environments and policies are stored in object storage, while the replay buffer lives in Bigtable.
DQN is an off-policy algorithm, which suits this design: in a distributed setup the collection policy is sometimes stale compared to the training policy, so the trainer learns from data generated by an older policy.
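
To make the collector/trainer split concrete, here is a minimal sketch of the data flow. It is not the repository's code and does not use the Bigtable client: a plain dict stands in for the table, and the names `Transition`, `InMemoryReplayBuffer`, `policy_version`, and the `collector_id#step` row-key layout are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Transition:
    """One step of experience. Field names are illustrative assumptions."""
    observation: Tuple[float, ...]
    action: int
    reward: float
    next_observation: Tuple[float, ...]
    done: bool
    policy_version: int  # version of the policy that chose `action`


class InMemoryReplayBuffer:
    """Dict-backed stand-in for a row-keyed table (not the Bigtable API)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Transition] = {}
        self._next_step: Dict[str, int] = {}

    def write(self, collector_id: str, transition: Transition) -> str:
        """Store one transition and return the row key it was written under."""
        step = self._next_step.get(collector_id, 0)
        self._next_step[collector_id] = step + 1
        # Zero-padding keeps one collector's keys in lexicographic step order.
        row_key = f"{collector_id}#{step:010d}"
        self._rows[row_key] = transition
        return row_key

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        """Uniformly sample up to `batch_size` transitions without replacement."""
        keys = sorted(self._rows)
        chosen = rng.sample(keys, min(batch_size, len(keys)))
        return [self._rows[key] for key in chosen]

    def __len__(self) -> int:
        return len(self._rows)


def staleness(batch: List[Transition], trainer_version: int) -> List[int]:
    """How many policy versions behind the trainer each sampled step is."""
    return [trainer_version - t.policy_version for t in batch]
```

Recording the policy version with each step is one way to measure how stale the collected data is at training time; whether the repository tracks this is not stated in the README.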

The code requires the following configuration and resources in order to run.
We used Google Cloud Platform, but it may be possible to use other services.
- A cloud service with:
  - object storage (GCP blob storage)
  - a query-based database (Bigtable)
  - Docker orchestration (Kubernetes)
  - TPU / GPU allocation
  - service authentication
- An environment that outputs (State Observations, Available Actions, previous_state_Reward)
  - Two open-source environments, /cartpole and /breakout, are included for testing.
  - /crane contains the ML training code but lacks the environment, because the environment is currently closed source.
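
The environment contract in the list above can be sketched as follows. This is an illustration under stated assumptions, not the interface used in /cartpole or /breakout: `StepOutput`, `CountdownEnv`, and `select_action` are hypothetical names, and the toy environment exists only to show a step returning the (observation, available actions, previous reward) triple and an epsilon-greedy choice restricted to the available actions.

```python
import random
from typing import NamedTuple, Sequence, Tuple


class StepOutput(NamedTuple):
    """The triple an environment emits on every step (hypothetical names)."""
    observation: Tuple[float, ...]
    available_actions: Tuple[int, ...]
    previous_reward: float  # reward earned by the action that led here


def select_action(q_values: Sequence[float],
                  available_actions: Sequence[int],
                  epsilon: float,
                  rng: random.Random) -> int:
    """Epsilon-greedy choice limited to the actions the environment allows."""
    if not available_actions:
        raise ValueError("no available actions")
    if rng.random() < epsilon:
        return rng.choice(list(available_actions))
    return max(available_actions, key=lambda a: q_values[a])


class CountdownEnv:
    """Toy environment: a counter that must reach zero.

    Action 0 subtracts 1; action 1 subtracts 2 and is only available while
    the counter is at least 2. Every step yields a reward of -1.
    """

    def __init__(self, start: int = 5) -> None:
        self._start = start
        self._counter = start
        self._last_reward = 0.0

    def reset(self) -> StepOutput:
        self._counter = self._start
        self._last_reward = 0.0
        return self._output()

    def step(self, action: int) -> StepOutput:
        if action not in self._available():
            raise ValueError(f"action {action} is not available")
        self._counter -= action + 1
        self._last_reward = -1.0
        return self._output()

    def done(self) -> bool:
        return self._counter == 0

    def _available(self) -> Tuple[int, ...]:
        if self._counter >= 2:
            return (0, 1)
        if self._counter == 1:
            return (0,)
        return ()

    def _output(self) -> StepOutput:
        return StepOutput((float(self._counter),),
                          self._available(),
                          self._last_reward)
```

Passing the available actions alongside the observation lets the collector avoid choosing an action the environment would reject in the current state.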

Multi-Environment Deployment:

Deploy with Docker orchestration:
https://github.com/jun-bun/rab-tf-agents/blob/master/deploy/Dockerfile
Note that we use a specific Docker build that supports Unity on Linux. The base container is hosted here: https://hub.docker.com/r/tenserflow/gpu-unity-ubuntu-xfce-novnc

Single Environment Test:

python3 -m breakout.collect_to_bigtable

Training:

python3 -m breakout.train_from_bigtable
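
For orientation, the quantity a DQN trainer regresses toward is the one-step TD target. The sketch below shows the standard formula in plain Python; it is not taken from `breakout.train_from_bigtable`, and the function name `td_target`, the masking by available actions, and the default discount `gamma` are assumptions made for illustration.

```python
from typing import Sequence


def td_target(reward: float,
              next_q_values: Sequence[float],
              next_available_actions: Sequence[int],
              done: bool,
              gamma: float = 0.99) -> float:
    """One-step DQN target: r + gamma * max over allowed a' of Q(s', a').

    Terminal transitions (or states with no allowed actions) bootstrap
    nothing, so the target is just the reward.
    """
    if done or not next_available_actions:
        return reward
    best_next = max(next_q_values[a] for a in next_available_actions)
    return reward + gamma * best_next
```

Because the maximum is taken over the next state's actions rather than the action the collector actually took, the target does not depend on which policy version produced the transition.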

Demo
