This project was conducted for the University of Toronto School of Continuing Studies (SCS) as part of the Intelligent Agents & Reinforcement Learning (3547) course.
Submitted By:
- Adnan Lanewala
- Nareshkumar Patel
- Nisarg Patel
Rocket trajectory optimization is a classic topic in Optimal Control.
According to Pontryagin's maximum principle, it is optimal to either fire the engine at full throttle or turn it off, which is why this environment is well suited to discrete actions (engine on or off).
The landing pad is always at coordinates (0,0), and the coordinates are the first two numbers in the state vector. The reward for moving from the top of the screen to the landing pad with zero speed is about 100..140 points. If the lander moves away from the landing pad, it loses that reward. The episode finishes if the lander crashes or comes to rest, receiving an additional -100 or +100 points respectively. Each leg-ground contact is +10 points. Firing the main engine costs -0.3 points each frame, and firing a side engine costs -0.03 points each frame. The environment is considered solved at 200 points. Landing outside the landing pad is possible. Fuel is infinite, so an agent can learn to fly and then land on its first attempt. Four discrete actions are available: do nothing, fire the left orientation engine, fire the main engine, and fire the right orientation engine.
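For reference, here is a minimal sketch of interacting with the environment using a random policy (assuming Gym v0.15.x, where `step` returns a four-tuple):

```python
import gym

# Create the environment described above and run one episode with a random policy.
env = gym.make("LunarLander-v2")
print(env.observation_space.shape)  # (8,): x, y, vx, vy, angle, angular velocity, left/right leg contact
print(env.action_space.n)           # 4: do nothing, fire left, fire main, fire right

state = env.reset()
done = False
total_reward = 0.0
while not done:
    state, reward, done, info = env.step(env.action_space.sample())
    total_reward += reward
print("Episode reward:", total_reward)
```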
The objective is to navigate the lander to its landing pad and touch down safely. A sample heuristic landing is shown below.
We used the Q-Learning algorithm with a deep neural network built in the Keras library, an approach known as the DQN algorithm. For more information on the algorithm, please see the paper *Playing Atari with Deep Reinforcement Learning*.
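At its core, the algorithm fits the network toward the Q-learning target on minibatches sampled from a replay buffer. The sketch below illustrates one such training step; the names (`model`, a compiled Keras model; `memory`, a list of transition tuples; `gamma`; `batch_size`) and the default hyperparameter values are illustrative, not necessarily those used in our notebook:

```python
import random
import numpy as np

# One DQN training step: sample a minibatch of transitions
# (state, action, reward, next_state, done) from the replay buffer and
# fit the network toward the Q-learning target.
def replay(model, memory, gamma=0.99, batch_size=64):
    if len(memory) < batch_size:
        return
    batch = random.sample(memory, batch_size)
    states      = np.array([t[0] for t in batch])
    actions     = np.array([t[1] for t in batch])
    rewards     = np.array([t[2] for t in batch])
    next_states = np.array([t[3] for t in batch])
    dones       = np.array([t[4] for t in batch], dtype=float)

    # Q-learning target: r + gamma * max_a' Q(s', a'); no bootstrapping on terminal states.
    targets = rewards + gamma * np.max(model.predict(next_states), axis=1) * (1.0 - dones)

    # Only the taken action's Q-value is moved toward the target.
    target_q = model.predict(states)
    target_q[np.arange(batch_size), actions] = targets
    model.fit(states, target_q, epochs=1, verbose=0)
```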
We utilized the Keras library to create a neural network that takes the state vector as its input layer and outputs a value for each action available to the agent. The input layer has 8 nodes, followed by a hidden layer with 100 nodes and another hidden layer with 50 nodes. All hidden layers use the ReLU activation function. The output layer uses a linear activation function and has 4 nodes representing the 4 discrete actions.
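A minimal Keras sketch of this architecture (the MSE loss and the Adam optimizer with its learning rate are assumptions for illustration, not confirmed hyperparameters):

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# The network described above: 8 state inputs -> 100 ReLU -> 50 ReLU ->
# 4 linear outputs (one Q-value per discrete action).
model = Sequential()
model.add(Dense(100, input_dim=8, activation="relu"))
model.add(Dense(50, activation="relu"))
model.add(Dense(4, activation="linear"))
model.compile(loss="mse", optimizer=Adam(lr=0.001))
model.summary()
```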
Training progress snapshots at EPISODE = 10, 50, 100, 150, and 200
Reward vs. episode plot (training)
Once the agent was fully trained to land safely, the weights, along with the Keras model configuration, were stored in the `modelweights` directory so that they can be retrieved later and the agent won't need to be re-trained from scratch. The agent was then tested for 100 episodes. Below is a sample landing carried out by the fully trained agent.
Reward vs. episode plot for the 100 test episodes
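For completeness, a minimal sketch of this evaluation loop (assuming Gym v0.15.x and the `modelweights/LunarLander.h5` file listed in the directory structure below):

```python
import gym
import numpy as np
from keras.models import load_model

# Load the saved model and run the greedy policy (no exploration) for 100 episodes.
model = load_model("modelweights/LunarLander.h5")
env = gym.make("LunarLander-v2")

for episode in range(100):
    state = env.reset().reshape(1, 8)
    done = False
    total_reward = 0.0
    while not done:
        action = np.argmax(model.predict(state)[0])  # greedy action
        next_state, reward, done, _ = env.step(action)
        state = next_state.reshape(1, 8)
        total_reward += reward
    print("Episode {}: reward = {:.1f}".format(episode, total_reward))
```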
SCS-RL-3547-Final-Project
│ assets (Git README images store directory)
│ gym (Open AI Gym environment)
│ modelweights (model history)
│ │ LunarLander.h5 (keras model file)
│ presentation
│ │ Safe_Landings_In_Deep_Space_Presentation.ppsx (Presentation show file)
│ │ Safe_Landings_In_Deep_Space_Presentation.pptx (Powerpoint file)
│ Lunar_Lander_Keyboard_Play.ipynb (Human Keyboard input game play)
│ README.md (readme file)
│ Safe_Landings_Implementation.ipynb (Deep Q-Learning algorithm implementation for AI Agent)
│ test_setup.ipynb (Jupyter notebook to test and check the gym environment along with LunarLander-v2)
Libraries Used:
- Open AI Gym (v0.15.4) - https://github.com/openai/gym
- Open AI Box2D-py (v2.3.8) (comes with Gym) - https://github.com/openai/box2d-py
- Keras (v2.2.5) - https://pypi.org/project/Keras/

Anaconda Packages:
- Swig (v3.0.12) - https://anaconda.org/anaconda/swig
- pystan (v2.19.1.1) - https://anaconda.org/conda-forge/pystan
- pyglet (v1.4.8) - https://anaconda.org/conda-forge/pyglet
Note: If you see errors during installation regarding `mujoco-py`, kindly ignore them, as we will not be using the `MuJoCo` environment. If you are a Windows user, please check out the Windows Instructions below to make sure you have the necessary setup before installing any of the above libraries, as you may otherwise run into errors.
Install & Setup SWIG
- Go to http://www.swig.org/download.html to get the latest version of the SWIG library, or download v4.0.1 directly from https://sourceforge.net/projects/swig/files/swigwin/swigwin-4.0.1/swigwin-4.0.1.zip/download?use_mirror=iweb.
- Make sure to download `swigwin-x.x.x` for Windows, as indicated by the `win` in the file name (`x.x.x` indicates the version number).
- Open and extract the `swigwin-4.0.1.zip` archive.
- Go to the extracted `swigwin-4.0.1` folder.
- Make sure the `swig.exe` file is inside the `swigwin-4.0.1` folder.
- Copy the entire `swigwin-4.0.1` folder to the `C:\` drive.
Install & Setup Microsoft Visual C++ Build Tools for Visual Studio 2019
- Go to https://visualstudio.microsoft.com/downloads/.
- Scroll down to the bottom of the page and click on `Tools for Visual Studio 2019`.
- Browse to `Build Tools for Visual Studio 2019` and click on the Download button to download the file.
- Once downloaded, run the executable file to begin the installation.
- Once the application opens, click on the `Workloads` tab.
- Select `C++ build tools` by clicking on its checkbox and begin the installation.
Setup Windows Environment Variable
- Open the Windows Environment Variables window by running `rundll32 sysdm.cpl,EditEnvironmentVariables` in the Run window. Alternatively, it can be opened by going to Start -> right-click on `My Computer` -> select `Properties` -> click on `Advanced system settings` -> click on the `Advanced` tab -> click on the `Environment Variables` button.
- In a new Explorer window, go to `C:\Program Files (x86)\Windows Kits\10\Include`, then open the folder inside it (its name may be similar to `10.0.18362.0`). Underneath that folder, open `ucrt`. Your path in Explorer should be similar to `C:\Program Files (x86)\Windows Kits\10\Include\10.0.18362.0\ucrt`. Copy the path to the `ucrt` folder.
- In the `Environment Variables` window, underneath `User variables for Admin`, select `Path` and click on the `Edit` button. Note: if the `Path` variable doesn't exist under `User variables for Admin`, create it by clicking on the `New` button.
- Browse to the end of `Variable value` and ensure there is a `;` present; if not, add it by typing the `;` key on the keyboard. Paste the copied path to the `ucrt` folder, i.e. paste `C:\Program Files (x86)\Windows Kits\10\Include\10.0.18362.0\ucrt` and add `;` to indicate the end of the entry.
- Add `C:\swigwin-4.0.1\;` and `C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build;` to the `Variable value` as well.
- Restart your computer; you should now be ready to install the Open AI Gym and the other required packages, and can verify the setup with the check below.
- Shiva Verma, Train Your Lunar-Lander | Reinforcement Learning | OpenAIGYM, https://towardsdatascience.com/solving-lunar-lander-openaigym-reinforcement-learning-785675066197
- Shiva Verma, OpenAIGym, https://github.com/shivaverma/OpenAIGym/tree/master/lunar-lander/discrete
- OpenAI, LunarLander-v2, https://gym.openai.com/envs/LunarLander-v2/