Source code of "Safety Filters for Black-Box Dynamical Systems by Learning Discriminating Hyperplanes"

Will Lavanakul*, Jason J. Choi*, Koushil Sreenath, Claire J. Tomlin

University of California, Berkeley


To import the conda environment, run: conda env create -f environment.yml


All code for SL-DH is found in SL-DH/ and all code for RL-DH can be found in RL-DH/.


Modules for SL-DH:

controllers contains the controllers, including the point-following controller for the car and the QP safe controller.

envs provides implementations of cartpole, kinematic car, and inverted pendulum.

filters provides the classes that supply the hyperplane constraint to the QP controller.

inv_set contains all control invariant sets used for the experiments.
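
As a rough illustration of what the hyperplane constraint does inside the QP controller, the sketch below assumes the filter outputs a single half-space constraint a^T u <= b on the control input and that there are no other constraints (for example, no input bounds). Under that assumption, the QP min ||u - u_ref||^2 has a closed-form solution: the projection of u_ref onto the half-space. The function name filter_control and its arguments are hypothetical and are not the repository's API.

```python
import numpy as np


def filter_control(u_ref, a, b):
    """Minimally modify u_ref so that a^T u <= b holds.

    Solves min ||u - u_ref||^2 subject to a^T u <= b in closed form.
    Assumption: one half-space constraint and no input bounds.
    """
    u_ref = np.asarray(u_ref, dtype=float)
    a = np.asarray(a, dtype=float)

    violation = a @ u_ref - b  # positive when u_ref breaks the constraint
    norm_sq = a @ a

    # Already safe: return the reference control unchanged.
    if violation <= 0.0:
        return u_ref.copy()

    # Degenerate hyperplane (a = 0) with b < 0: no control can satisfy it.
    if norm_sq == 0.0:
        raise ValueError("constraint is infeasible: a is zero and b < 0")

    # Project u_ref onto the boundary a^T u = b.
    return u_ref - (violation / norm_sq) * a


if __name__ == "__main__":
    # Hypothetical numbers: the reference control violates u[0] <= 1.
    print(filter_control([2.0, 0.0], a=[1.0, 0.0], b=1.0))
```

With more constraints (input limits, several hyperplanes), the closed form no longer applies and a QP solver is needed, which is the role of the QP safe controller in controllers.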

The following commands run each SL-DH experiment from the paper. All hyperparameters are set to the values used in the paper:

  1. Inverted Pendulum: Assign all hyperparameters in the script inv_ped_train.py. Run python inv_ped_train.py
  2. Kinematic Car: Assign all hyperparameters in the script four_car_train.py. Run python four_car_train.py. To visualize different lookahead times and the animation, set the plot values in plot_four_car.py. Run python plot_four_car.py.
  3. Jet Engine: Assign all hyperparameters in the script jet_train.py. Run python jet_train.py

Modules for RL-DH:

ppo.py contains code for training agents for CartPole and HalfCheetah. It currently supports PPO, PPO Lagrangian, PPO with SL-DH, and PPO with RL-DH. Note that PPO with SL-DH works only for CartPole, not HalfCheetah. This codebase is built on top of the following repository: https://github.com/akjayant/PPO_Lagrangian_PyTorch?tab=readme-ov-file.

See the training scripts to create SL-DH controllers. The main algorithm is in the training loop of each training script.

The following commands run each RL-DH experiment from the paper. Refer to the paper for hyperparameters. For env

  1. PPO: python ppo.py --env {env} --exp_name ppo --steps {steps} --seed {seed}
  2. PPO Lagrangian: python ppo.py --env {env} --exp_name ppo_lag --steps {steps} --seed {seed}
  3. PPO with SL-DH: python ppo.py --env CartPole --exp_name ppo_nn_qp_ --steps {steps} --seed {seed}
  4. PPO with RL-DH: python ppo.py --env {env} --exp_name pret_ppo --seed {seed} --steps {steps} --pret_dir {pret_dir}. Here pret_dir is the directory containing a pretrained RL-DH safety filter (see the pretraining step below). It is usually saved in data/fppo/exp_name/pyt_save.
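
When sweeping over seeds, it can help to generate the four command lines above programmatically. The following is a minimal sketch and not part of the repository: the helper build_ppo_command and the method keys are hypothetical, the flags are only the ones listed above, and the step count in the demo is a placeholder (refer to the paper for the actual hyperparameters).

```python
import shlex

# Experiment names copied from the commands listed above.
EXP_NAMES = {
    "ppo": "ppo",
    "ppo_lag": "ppo_lag",
    "sl_dh": "ppo_nn_qp_",
    "rl_dh": "pret_ppo",
}


def build_ppo_command(method, env, steps, seed, pret_dir=None):
    """Return the argv list for one ppo.py run (hypothetical helper)."""
    if method not in EXP_NAMES:
        raise ValueError(f"unknown method: {method}")
    # The README states that PPO with SL-DH only works for CartPole.
    if method == "sl_dh" and env != "CartPole":
        raise ValueError("PPO with SL-DH is only supported for CartPole")
    # PPO with RL-DH needs a pretrained safety filter directory.
    if method == "rl_dh" and pret_dir is None:
        raise ValueError("PPO with RL-DH requires pret_dir")

    cmd = [
        "python", "ppo.py",
        "--env", env,
        "--exp_name", EXP_NAMES[method],
        "--steps", str(steps),
        "--seed", str(seed),
    ]
    if method == "rl_dh":
        cmd += ["--pret_dir", pret_dir]
    return cmd


if __name__ == "__main__":
    # Placeholder step count and seeds, chosen only for illustration.
    for seed in range(3):
        cmd = build_ppo_command("ppo_lag", "CartPole", steps=10000, seed=seed)
        print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run to launch the run; printing the commands first makes it easy to check them before starting a long job.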

Pretraining RL-DH safety filter:

  1. Run python fppo.py --env {env} --seed {seed} --exp_name {exp} --steps {steps}
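
The two-stage workflow (pretrain the RL-DH safety filter with fppo.py, then train PPO with RL-DH using that filter) can be sketched as below. This is an illustration under assumptions, not the repository's tooling: the helper names are hypothetical, the save location data/fppo/<exp_name>/pyt_save is inferred from the "usually saved in" note above and may differ on your machine, and the experiment name, step counts, and seed in the demo are placeholders.

```python
from pathlib import PurePosixPath


def build_pretrain_command(env, seed, exp_name, steps):
    """argv for pretraining the RL-DH safety filter with fppo.py."""
    return [
        "python", "fppo.py",
        "--env", env,
        "--seed", str(seed),
        "--exp_name", exp_name,
        "--steps", str(steps),
    ]


def guess_pret_dir(exp_name, root="data/fppo"):
    """Assumed save location of the pretrained filter; verify after training."""
    return str(PurePosixPath(root) / exp_name / "pyt_save")


def build_rl_dh_command(env, seed, steps, pret_dir):
    """argv for PPO with RL-DH, using the pretrained filter in pret_dir."""
    return [
        "python", "ppo.py",
        "--env", env,
        "--exp_name", "pret_ppo",
        "--seed", str(seed),
        "--steps", str(steps),
        "--pret_dir", pret_dir,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    exp_name = "my_filter"
    stage1 = build_pretrain_command("CartPole", seed=0, exp_name=exp_name, steps=10000)
    stage2 = build_rl_dh_command("CartPole", seed=0, steps=10000,
                                 pret_dir=guess_pret_dir(exp_name))
    print(" ".join(stage1))
    print(" ".join(stage2))
```

Run the first command to completion and confirm that the pyt_save directory exists before launching the second.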