In addition to the dependencies of the gym environment itself, running these example scripts requires Python 3.5–3.7, stable-baselines and TensorFlow 1. These additional requirements can be installed by running one of the commands below. Using the CNN-MLP policy described in the paper requires my fork of the stable-baselines repository, which is specified in the requirements_cnn.txt file. Please note that the included MLP controller from the models folder is incompatible with this fork. For any installation errors from stable-baselines, please refer to its documentation.
```shell
pip install -r requirements.txt
```
OR
```shell
pip install -r requirements_cnn.txt
```
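To quickly verify the installation, the snippet below (a minimal check, not part of the repository) confirms that a TensorFlow 1.x build and stable-baselines are importable:

```python
# Minimal sanity check (not part of the repository): confirm that the
# installed TensorFlow is a 1.x release and that stable-baselines imports.
import tensorflow as tf
import stable_baselines

assert tf.__version__.startswith("1."), "These scripts expect TensorFlow 1.x"
print("TensorFlow", tf.__version__, "/ stable-baselines", stable_baselines.__version__)
```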
The test_sets folder contains the four test sets used in the paper. Controllers can be evaluated on these sets by running, e.g.:
```shell
python evaluate_controller.py test_sets/test_set_wind_none_step20-20-3.npy --num-envs 4 --PID --env-config-path fixed_wing_config.json --turbulence-intensity "none"
python evaluate_controller.py test_sets/test_set_wind_moderate_step20-20-3.npy --num-envs 4 --model-path models/mlp_controller/model.pkl --turbulence-intensity "moderate"
```
These commands evaluate the PID controller on the test set with no wind or turbulence and the MLP controller on the moderate turbulence set, respectively. The models folder contains the CNN RL controller used in the paper, as well as an MLP RL controller usable with the default version of stable-baselines.
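To run a controller against all four turbulence settings in one go, a small driver script can shell out to evaluate_controller.py once per test set. The sketch below is illustrative only: it assumes the light and severe test sets follow the same test_set_wind_&lt;intensity&gt;_step20-20-3.npy naming pattern as the two files shown above.

```python
# Illustrative helper (not part of the repository): evaluate the PID controller
# on every turbulence setting by invoking evaluate_controller.py per test set.
# The file names for "light" and "severe" are assumed to follow the same
# naming pattern as the two test sets shown above.
import subprocess

for intensity in ["none", "light", "moderate", "severe"]:
    test_set = "test_sets/test_set_wind_{}_step20-20-3.npy".format(intensity)
    subprocess.run(
        ["python", "evaluate_controller.py", test_set,
         "--num-envs", "4", "--PID",
         "--env-config-path", "fixed_wing_config.json",
         "--turbulence-intensity", intensity],
        check=True,
    )
```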
The included RL controllers and the PID controller were evaluated on these sets with PyFly v0.1.2 (commit #21f5b5c812330e1d5356d4b6b5fc774753839892), producing the results shown in the table below. To reproduce the results shown here, make sure to use this PyFly version.
| | | Success | | | | Rise time | | | Settling time | | | Overshoot | | | Control variation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Setting | Controller | φ | θ | Va | All | φ | θ | Va | φ | θ | Va | φ | θ | Va | |
| No turbulence | RL (CNN) | 100 | 100 | 100 | 100 | 0.253 | 0.614 | 0.803 | 1.594 | 1.580 | 2.704 | 25 | 34 | 31 | 0.638 |
| | RL (MLP) | 100 | 100 | 100 | 100 | 1.395 | 0.336 | 0.959 | 2.085 | 1.675 | 2.308 | 5 | 25 | 20 | 0.410 |
| | PID | 100 | 100 | 100 | 100 | 1.337 | 0.226 | 1.016 | 2.018 | 1.294 | 2.203 | 3 | 9 | 29 | 0.291 |
| Light turbulence | RL (CNN) | 100 | 100 | 100 | 100 | 0.201 | 0.657 | 0.654 | 1.652 | 1.699 | 2.521 | 32 | 50 | 52 | 0.779 |
| | RL (MLP) | 100 | 100 | 100 | 100 | 1.238 | 0.423 | 0.884 | 2.062 | 1.845 | 2.419 | 6 | 28 | 37 | 0.851 |
| | PID | 100 | 100 | 100 | 100 | 1.132 | 0.291 | 0.967 | 2.008 | 1.364 | 2.225 | 7 | 11 | 38 | 0.476 |
| Moderate turbulence | RL (CNN) | 100 | 100 | 100 | 100 | 0.185 | 0.861 | 0.557 | 2.000 | 2.117 | 3.748 | 53 | 87 | 105 | 0.823 |
| | RL (MLP) | 98 | 97 | 97 | 97 | 0.890 | 0.680 | 0.643 | 2.799 | 2.927 | 3.660 | 89 | 68 | 91 | 1.279 |
| | PID | 100 | 98 | 97 | 93 | 0.835 | 0.406 | 0.739 | 2.131 | 1.674 | 2.920 | 22 | 22 | 82 | 0.702 |
| Severe turbulence | RL (CNN) | 98 | 98 | 97 | 97 | 0.148 | 1.492 | 0.349 | 2.232 | 2.458 | 6.146 | 90 | 152 | 226 | 0.885 |
| | RL (MLP) | 93 | 92 | 91 | 91 | 0.740 | 1.002 | 0.538 | 3.477 | 4.028 | 4.975 | 108 | 92 | 215 | 1.698 |
| | PID | 98 | 98 | 94 | 83 | 0.683 | 0.557 | 0.642 | 2.463 | 2.560 | 4.280 | 64 | 51 | 126 | 0.826 |
The outputs of the evaluation scripts can be found in the evaluations folder. They are in NumPy format and contain the result dictionary; to process them in Python, do:
```python
import numpy as np

res = np.load("eval_res.npy", allow_pickle=True).item()
```
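The exact contents of the dictionary depend on the evaluation settings; a quick way to see what is available is simply to iterate over its entries (illustrative snippet, continuing from the line above):

```python
# Illustrative: list the metrics stored in the loaded result dictionary.
# The key/value layout depends on the evaluation settings used.
for metric, values in res.items():
    print(metric, values)
```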
Evaluation results can also be printed by supplying the path of an evaluation results file in place of the test set path, along with the --print-results flag:
```shell
python evaluate_controller.py evaluations/eval_res_RL_CNN_severe.npy --print-results
```
Due to refactoring of the code base, non-determinism that has since been eliminated, and various other changes in the gym environment, these are not the exact values reported in the paper; however, they support the same trends highlighted there.
Note that the RL MLP controller does not represent a best effort to produce an optimal controller; it is simply the controller obtained by running the example training script below once.
To train a reinforcement learning controller, run the train_rl_controller.py script. For example, to train an agent using 4 processes for 5 million time steps and evaluate it on the no-turbulence test set, do:
```shell
python train_rl_controller.py "ppo_example" 4 --test-set-path test_sets/test_set_wind_none_step20-20-3.npy
```
This script trains a PPO agent to do attitude control of a fixed-wing aircraft. It saves model checkpoints, renders episodes during training so that the agent's behavior can be inspected, runs periodic test set evaluations if a test set path is supplied, and logs all training information to TensorBoard so that progress can be monitored:

```shell
tensorboard --logdir models/ppo_example/tb
```
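Once training has produced a model (or using the bundled MLP controller), the saved policy can be loaded with stable-baselines and stepped through the gym environment directly. The sketch below is a minimal example, not part of the repository; it assumes the environment class FixedWingAircraft is importable from the gym_fixed_wing package and can be constructed with its default configuration, so check the actual package layout and constructor arguments before using it:

```python
# Minimal sketch (not part of the repository). Assumptions: the environment
# class FixedWingAircraft lives in gym_fixed_wing.fixed_wing, can be created
# with its default configuration, and the bundled MLP controller matches your
# stable-baselines version. Loads a saved PPO policy and runs one episode.
from stable_baselines import PPO2
from gym_fixed_wing.fixed_wing import FixedWingAircraft  # assumed import path

env = FixedWingAircraft()  # default configuration assumed
model = PPO2.load("models/mlp_controller/model.pkl")

obs = env.reset()
done = False
total_reward = 0.0
while not done:
    # Query the trained policy for an action and advance the simulation.
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    total_reward += reward
print("Episode return:", total_reward)
```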
If you use this software in your work, please consider citing:
```
@inproceedings{bohn2019deep,
  title={Deep Reinforcement Learning Attitude Control of Fixed-Wing UAVs Using Proximal Policy Optimization},
  author={B{\o}hn, Eivind and Coates, Erlend M and Moe, Signe and Johansen, Tor Arne},
  booktitle={2019 International Conference on Unmanned Aircraft Systems (ICUAS)},
  pages={523--533},
  year={2019},
  organization={IEEE}
}
```