This project was conducted for the University of Toronto School of Continuing Studies (SCS) as part of the Intelligent Agents & Reinforcement Learning (3547) course.
Submitted By:
- Adnan Lanewala
- Nareshkumar Patel
- Nisarg Patel
Rocket trajectory optimization is a classic topic in Optimal Control.
According to Pontryagin's maximum principle, it is optimal to either fire the engine at full throttle or turn it off, which is why this environment is well suited to discrete actions (engine on or off).
The landing pad is always at coordinates (0,0), and the coordinates are the first two numbers in the state vector. The reward for moving from the top of the screen to the landing pad with zero speed is about 100..140 points. If the lander moves away from the landing pad, it loses that reward. The episode finishes if the lander crashes or comes to rest, receiving an additional -100 or +100 points respectively. Each leg-ground contact is +10 points. Firing the main engine costs -0.3 points each frame, and firing a side engine costs -0.03 points each frame. The environment is considered solved at 200 points. Landing outside the landing pad is possible. Fuel is infinite, so an agent can learn to fly and then land on its first attempt. Four discrete actions are available: do nothing, fire the left orientation engine, fire the main engine, and fire the right orientation engine.
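For reference, here is a minimal sketch of interacting with the environment using a random policy (assuming Gym v0.15.x, where `step` returns a four-tuple):

```python
import gym

# Create the environment described above and run one episode with a random policy.
env = gym.make("LunarLander-v2")
print(env.observation_space.shape)  # (8,): x, y, vx, vy, angle, angular velocity, left/right leg contact
print(env.action_space.n)           # 4: do nothing, fire left, fire main, fire right

state = env.reset()
done = False
total_reward = 0.0
while not done:
    state, reward, done, info = env.step(env.action_space.sample())
    total_reward += reward
print("Episode reward:", total_reward)
```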
The objective is to navigate the lander to its landing pad and touch down safely. A sample heuristic landing is shown below.
We used the Q-Learning algorithm with a deep neural network built in the Keras library, an approach known as the DQN algorithm. For more information on the algorithm, please see the paper *Playing Atari with Deep Reinforcement Learning*.
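At its core, the algorithm fits the network toward the Q-learning target on minibatches sampled from a replay buffer. The sketch below illustrates one such training step; the names (`model`, a compiled Keras model; `memory`, a list of transition tuples; `gamma`; `batch_size`) and the default hyperparameter values are illustrative, not necessarily those used in our notebook:

```python
import random
import numpy as np

# One DQN training step: sample a minibatch of transitions
# (state, action, reward, next_state, done) from the replay buffer and
# fit the network toward the Q-learning target.
def replay(model, memory, gamma=0.99, batch_size=64):
    if len(memory) < batch_size:
        return
    batch = random.sample(memory, batch_size)
    states      = np.array([t[0] for t in batch])
    actions     = np.array([t[1] for t in batch])
    rewards     = np.array([t[2] for t in batch])
    next_states = np.array([t[3] for t in batch])
    dones       = np.array([t[4] for t in batch], dtype=float)

    # Q-learning target: r + gamma * max_a' Q(s', a'); no bootstrapping on terminal states.
    targets = rewards + gamma * np.max(model.predict(next_states), axis=1) * (1.0 - dones)

    # Only the taken action's Q-value is moved toward the target.
    target_q = model.predict(states)
    target_q[np.arange(batch_size), actions] = targets
    model.fit(states, target_q, epochs=1, verbose=0)
```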
We utilized the Keras library to create a neural network that takes the state vector as its input layer and outputs a value for each action available to the agent. The input layer has 8 nodes, followed by a hidden layer with 100 nodes and another hidden layer with 50 nodes. All hidden layers use the ReLU activation function. The output layer uses a linear activation function and has 4 nodes representing the 4 discrete actions.
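A minimal Keras sketch of this architecture (the MSE loss and the Adam optimizer with its learning rate are assumptions for illustration, not confirmed hyperparameters):

```python
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# The network described above: 8 state inputs -> 100 ReLU -> 50 ReLU ->
# 4 linear outputs (one Q-value per discrete action).
model = Sequential()
model.add(Dense(100, input_dim=8, activation="relu"))
model.add(Dense(50, activation="relu"))
model.add(Dense(4, activation="linear"))
model.compile(loss="mse", optimizer=Adam(lr=0.001))
model.summary()
```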
Training progress snapshots at EPISODE = 10, 50, 100, 150, and 200
Reward vs. episode plot (training)
Once the agent was fully trained to land safely, the weights, along with the Keras model configuration, were stored in the `modelweights` directory so that they can be retrieved later and the agent won't need to be re-trained from scratch. The agent was then tested for 100 episodes. Below is a sample landing carried out by the fully trained agent.
Reward vs. episode plot for the 100 test episodes
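For completeness, a minimal sketch of this evaluation loop (assuming Gym v0.15.x and the `modelweights/LunarLander.h5` file listed in the directory structure below):

```python
import gym
import numpy as np
from keras.models import load_model

# Load the saved model and run the greedy policy (no exploration) for 100 episodes.
model = load_model("modelweights/LunarLander.h5")
env = gym.make("LunarLander-v2")

for episode in range(100):
    state = env.reset().reshape(1, 8)
    done = False
    total_reward = 0.0
    while not done:
        action = np.argmax(model.predict(state)[0])  # greedy action
        next_state, reward, done, _ = env.step(action)
        state = next_state.reshape(1, 8)
        total_reward += reward
    print("Episode {}: reward = {:.1f}".format(episode, total_reward))
```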
SCS-RL-3547-Final-Project
│ assets (Git README images store directory)
│ gym (Open AI Gym environment)
│ modelweights (model history)
│ │ LunarLander.h5 (keras model file)
│ presentation
│ │ Safe_Landings_In_Deep_Space_Presentation.ppsx (Presentation show file)
│ │ Safe_Landings_In_Deep_Space_Presentation.pptx (Powerpoint file)
│ Lunar_Lander_Keyboard_Play.ipynb (Human Keyboard input game play)
│ README.md (readme file)
│ Safe_Landings_Implementation.ipynb (Deep Q-Learning algorithm implementation for AI Agent)
│ test_setup.ipynb (Jupyter notebook to test and check the gym environment along with LunarLander-v2)
Libraries Used:
- Open AI Gym (v0.15.4) - https://github.com/openai/gym
- Open AI Box2D-py (v2.3.8) (comes with Gym) - https://github.com/openai/box2d-py
- Keras (v2.2.5) - https://pypi.org/project/Keras/

Anaconda Packages:
- Swig (v3.0.12) - https://anaconda.org/anaconda/swig
- pystan (v2.19.1.1) - https://anaconda.org/conda-forge/pystan
- pyglet (v1.4.8) - https://anaconda.org/conda-forge/pyglet
Note: If you see errors during installation regarding `mujoco-py`, kindly ignore them, as we will not be using the `MuJoCo` environment. If you are a Windows user, please check out the Windows Instructions below to make sure you have the necessary setup before installing any of the above libraries, as you may otherwise run into errors.
Install & Setup SWIG
- Go to http://www.swig.org/download.html to get the latest version of the SWIG library, or download v4.0.1 directly from https://sourceforge.net/projects/swig/files/swigwin/swigwin-4.0.1/swigwin-4.0.1.zip/download?use_mirror=iweb.
- Make sure to download `swigwin-x.x.x` for Windows, as indicated by the `win` in the file name (`x.x.x` indicates the version number).
- Open and extract the `swigwin-4.0.1.zip` archive.
- Go to the extracted `swigwin-4.0.1` folder.
- Make sure the `swig.exe` file is inside the `swigwin-4.0.1` folder.
- Copy the entire `swigwin-4.0.1` folder to the `C:\` drive.
Install & Setup Microsoft Visual C++ Build Tools for Visual Studio 2019
- Go to https://visualstudio.microsoft.com/downloads/.
- Scroll down to the bottom of the page and click on `Tools for Visual Studio 2019`.
- Browse to `Build Tools for Visual Studio 2019` and click on the Download button to download the file.
- Once downloaded, run the executable file to begin the installation.
- Once the application opens, click on the `Workloads` tab.
- Select `C++ build tools` by clicking on its checkbox and begin the installation.
Setup Windows Environment Variable
- Open the Windows Environment Variables window by running `rundll32 sysdm.cpl,EditEnvironmentVariables` in the Run window. Alternatively, it can be opened by going to Start -> right-click on `My Computer` -> select `Properties` -> click on `Advanced system settings` -> click on the `Advanced` tab -> click on the `Environment Variables` button.
- In a new Explorer window, go to `C:\Program Files (x86)\Windows Kits\10\Include`, then open the folder inside it (its name may be similar to `10.0.18362.0`). Underneath that folder, open `ucrt`. Your path in Explorer should be similar to `C:\Program Files (x86)\Windows Kits\10\Include\10.0.18362.0\ucrt`. Copy the path to the `ucrt` folder.
- In the `Environment Variables` window, underneath `User variables for Admin`, select `Path` and click on the `Edit` button. Note: if the `Path` variable doesn't exist under `User variables for Admin`, create it by clicking on the `New` button.
- Browse to the end of `Variable value` and ensure there is a `;` present; if not, add it by typing the `;` key on the keyboard. Paste the copied path to the `ucrt` folder, i.e. paste `C:\Program Files (x86)\Windows Kits\10\Include\10.0.18362.0\ucrt` and add `;` to indicate the end of the entry.
- Add `C:\swigwin-4.0.1\;` and `C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build;` to the `Variable value` as well.
- Restart your computer; you should now be ready to install the Open AI Gym and the other required packages, and can verify the setup with the check below.
- Shiva Verma, Train Your Lunar-Lander | Reinforcement Learning | OpenAIGYM, https://towardsdatascience.com/solving-lunar-lander-openaigym-reinforcement-learning-785675066197
- Shiva Verma, OpenAIGym, https://github.com/shivaverma/OpenAIGym/tree/master/lunar-lander/discrete
- OpenAI, LunarLander-v2, https://gym.openai.com/envs/LunarLander-v2/