This repo contains an implementation of the Proximal Policy Optimization (PPO) algorithm using the Keras library on a custom environment built with the Unity 3D engine.
Important details about this repository:
- Unity engine version used to build the environment: 2019.3.15f1
- ML-Agents branch: release_1
- Environment binaries:
  - Windows: Learning-Agents--r1 (.exe)
  - Linux (headless/server build): RL-agent (.x86_64)
  - Linux (normal build): RL-agent (.x86_64)
This repo uses the Windows environment binary. If you want to use a Linux environment binary instead, change the ENV_NAME in the train.py & test.py scripts to the correct path pointing to the binaries stored over here.
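As an illustration, the ENV_NAME switch could be made platform-aware; a minimal sketch, assuming the binaries sit in an `./envs` folder (the folder name is my assumption, not the repo's actual layout):

```python
import sys

# Hypothetical sketch: pick the environment binary for the current OS.
# The file names match the binaries listed above; the "./envs" folder
# is an assumption -- point this at wherever you stored the binaries.
if sys.platform.startswith("win"):
    ENV_NAME = "./envs/Learning-Agents--r1"   # Windows build (.exe)
else:
    ENV_NAME = "./envs/RL-agent"              # Linux build (.x86_64)
```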
- Introduction
- Environment Specific Details
- Setup Instructions
- Getting Started
- Motivation and Learning
- License
- Acknowledgements
- Check out this video to see the trained agent using its learned navigation skills to find the flag in a closed environment divided into nine segments.
- And if you want to see the training process of this agent, check out this video.
Here are some details you should know beforehand; without them, you might get confused, because parts of the Keras implementation are environment-dependent.
Check this doc for detailed information.
A small overview of the environment:
- Observation/state space: vectorized (not image-based)
- Action space: continuous (not discrete)
- Action shape: (number of agents, 2). One agent is alive at every environment step, so the shape is (1, 2).
- Reward system:
  - (1.0/MaxStep) per step (MaxStep resets the environment regardless of whether the goal state is reached); the same reward is given if the agent crashes into a wall.
  - +2 if the agent reaches the goal state.
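To make the action layout above concrete, a single continuous-action batch has shape (1, 2): one row per alive agent, two action components per row. A small NumPy sketch (the Gaussian sampling and the [-1, 1] clipping range are my assumptions, not taken from the repo):

```python
import numpy as np

num_agents = 1       # agents alive at this environment step
action_size = 2      # continuous action components per agent

# Sample a raw action from a Gaussian policy head and clip it.
# The [-1, 1] range is an assumption; check the environment's action spec.
raw_action = np.random.randn(num_agents, action_size)
action = np.clip(raw_action, -1.0, 1.0)

print(action.shape)  # (1, 2)
```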
Install the release_1 branch of the ML-Agents GitHub repo. If you want to use a different branch, modify the Python API calls used to interact with the environment.
1. Clone the repos:

   ```shell
   $ git clone --branch release_1 https://github.com/Unity-Technologies/ml-agents.git
   $ git clone https://github.com/Dhyeythumar/PPO-algo-with-custom-Unity-environment.git
   ```

2. Create and activate a Python virtual environment (Python version used: 3.8.x):

   ```shell
   $ python -m venv myvenv
   $ myvenv\Scripts\activate
   ```

3. Install the dependencies (check the exact dependency versions in the requirements.txt file):

   ```shell
   (myvenv) $ pip install -e ./ml-agents/ml-agents-envs
   (myvenv) $ pip install tensorflow
   (myvenv) $ pip install keras
   (myvenv) $ pip install tensorboardX
   ```

4. Start the training process:

   ```shell
   (myvenv) $ cd PPO-algo-with-custom-Unity-environment
   (myvenv) $ python train.py
   ```

5. Activate TensorBoard:

   ```shell
   $ tensorboard --logdir=./training_data/summaries --port 6006
   ```
This video by OpenAI inspired me to develop something in the field of reinforcement learning. So for the first phase, I decided to create a simple RL agent that can learn navigation skills.
After completing the first phase, I gained a much deeper knowledge of the RL domain and got the following questions answered:
- How to create custom 3D environments using the Unity engine?
- How to use ML-Agents (Unity's toolkit for reinforcement learning) to train the RL agents?
- And I also learned to implement the PPO algorithm using the Keras library. 😃
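The core of the PPO algorithm mentioned above is the clipped surrogate objective. A framework-agnostic NumPy sketch (the repo's Keras version differs in details such as entropy and value-loss terms):

```python
import numpy as np

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate loss from the PPO paper (negated, to be minimized)."""
    ratio = np.exp(new_log_probs - old_log_probs)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective, then negate for minimization.
    return -np.mean(np.minimum(unclipped, clipped))
```

With identical old and new log-probabilities the ratio is 1, so the loss reduces to the negative mean advantage; large ratios are clipped to 1 ± clip_eps, which keeps each policy update small.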
What's next? 🤔
I have started working on the next phase of this project, which will include a multi-agent environment setup, and I am also planning to increase the difficulty level. For more updates, stay tuned for the next video on my YouTube channel.
Licensed under the MIT License.