This repo contains an implementation of the Proximal Policy Optimization (PPO) algorithm using the Keras library on a custom environment built with the Unity 3D engine.
Important details about this repository:
- Unity engine version used to build the environment: 2019.3.15f1
- ML-Agents branch: release_1
- Environment binaries:
  - Windows: Learning-Agents--r1 (.exe)
  - Linux (headless/server build): RL-agent (.x86_64)
  - Linux (normal build): RL-agent (.x86_64)
This repo uses the Windows environment binary. If you want to use a Linux environment binary instead, change the ENV_NAME in the train.py & test.py scripts to the correct path pointing to the binaries stored over here.
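As an illustration, the ENV_NAME switch could be made platform-aware; a minimal sketch, assuming the binaries sit in an `./envs` folder (the folder name is my assumption, not the repo's actual layout):

```python
import sys

# Hypothetical sketch: pick the environment binary for the current OS.
# The file names match the binaries listed above; the "./envs" folder
# is an assumption -- point this at wherever you stored the binaries.
if sys.platform.startswith("win"):
    ENV_NAME = "./envs/Learning-Agents--r1"   # Windows build (.exe)
else:
    ENV_NAME = "./envs/RL-agent"              # Linux build (.x86_64)
```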
- Introduction
- Environment Specific Details
- Setup Instructions
- Getting Started
- Motivation and Learning
- License
- Acknowledgements
- Check out this video to see the trained agent using its learned navigation skills to find the flag in a closed environment divided into nine segments.
- And if you want to see the training process of this agent, check out this video.
Here are some details you should know beforehand; without them, you might get confused, because parts of the Keras implementation are environment-dependent.
Check this doc for detailed information.
A small overview of the environment:
- Observation/state space: vectorized (not image-based)
- Action space: continuous (not discrete)
- Action shape: (number of agents, 2). One agent is alive at every environment step, so the shape is (1, 2).
- Reward system:
  - (1.0/MaxStep) per step (MaxStep resets the environment regardless of whether the goal state is reached); the same reward is given if the agent crashes into a wall.
  - +2 if the agent reaches the goal state.
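To make the action layout above concrete, a single continuous-action batch has shape (1, 2): one row per alive agent, two action components per row. A small NumPy sketch (the Gaussian sampling and the [-1, 1] clipping range are my assumptions, not taken from the repo):

```python
import numpy as np

num_agents = 1       # agents alive at this environment step
action_size = 2      # continuous action components per agent

# Sample a raw action from a Gaussian policy head and clip it.
# The [-1, 1] range is an assumption; check the environment's action spec.
raw_action = np.random.randn(num_agents, action_size)
action = np.clip(raw_action, -1.0, 1.0)

print(action.shape)  # (1, 2)
```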
Install the release_1 branch of the ML-Agents GitHub repo. If you want to use a different branch, modify the Python API calls used to interact with the environment.
1. Clone the repos:

   ```shell
   $ git clone --branch release_1 https://github.com/Unity-Technologies/ml-agents.git
   $ git clone https://github.com/Dhyeythumar/PPO-algo-with-custom-Unity-environment.git
   ```

2. Create and activate a Python virtual environment (Python version used: 3.8.x):

   ```shell
   $ python -m venv myvenv
   $ myvenv\Scripts\activate
   ```

3. Install the dependencies (check the exact dependency versions in the requirements.txt file):

   ```shell
   (myvenv) $ pip install -e ./ml-agents/ml-agents-envs
   (myvenv) $ pip install tensorflow
   (myvenv) $ pip install keras
   (myvenv) $ pip install tensorboardX
   ```

4. Start the training process:

   ```shell
   (myvenv) $ cd PPO-algo-with-custom-Unity-environment
   (myvenv) $ python train.py
   ```

5. Activate TensorBoard:

   ```shell
   $ tensorboard --logdir=./training_data/summaries --port 6006
   ```
This video by OpenAI inspired me to develop something in the field of reinforcement learning. So for the first phase, I decided to create a simple RL agent that can learn navigation skills.
After completing the first phase, I gained a much deeper knowledge of the RL domain and got the following questions answered:
- How to create custom 3D environments using the Unity engine?
- How to use ML-Agents (Unity's toolkit for reinforcement learning) to train the RL agents?
- And I also learned to implement the PPO algorithm using the Keras library. 😃
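The core of the PPO algorithm mentioned above is the clipped surrogate objective. A framework-agnostic NumPy sketch (the repo's Keras version differs in details such as entropy and value-loss terms):

```python
import numpy as np

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate loss from the PPO paper (negated, to be minimized)."""
    ratio = np.exp(new_log_probs - old_log_probs)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (smaller) objective, then negate for minimization.
    return -np.mean(np.minimum(unclipped, clipped))
```

With identical old and new log-probabilities the ratio is 1, so the loss reduces to the negative mean advantage; large ratios are clipped to 1 ± clip_eps, which keeps each policy update small.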
What's next? 🤔
I have started working on the next phase of this project, which will include a multi-agent environment setup, and I am also planning to increase the difficulty level. For more updates, stay tuned for the next video on my YouTube channel.
Licensed under the MIT License.