[ 📺 Website | 🏗 Github Repo | 🎓 Paper ]
- Gym >= 0.8.1
- mujoco-py >= 0.5.7
- TensorFlow >= 1.0.1
- MuJoCo

  Add the following lines to `~/.bashrc`:

  ```bash
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco200/bin
  export LD_LIBRARY_PATH=$HOME/.mujoco/mjpro150/bin:$LD_LIBRARY_PATH
  ```

  Follow the instructions to install mujoco_py v1.5 here. A quick smoke test appears after this list.
- SenseAct (optional)

  SenseAct requires Python 3 (>= 3.5); all other requirements are installed automatically via pip. On Linux and macOS, run the following:

  ```bash
  git clone https://github.com/kindredresearch/SenseAct.git
  cd SenseAct
  pip install -e .
  ```
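Before collecting data or training, it can help to confirm that the Gym + MuJoCo installation works end to end. The snippet below is a minimal sketch, not part of this repo: it assumes the classic pre-0.26 Gym step API (which matches the versions listed above), and `Hopper-v2` is just an example environment ID.

```python
# Smoke test for the Gym + mujoco-py installation (illustrative only).
# Assumes the classic Gym API: reset() returns obs, step() returns 4 values.
import gym

env = gym.make('Hopper-v2')  # any MuJoCo task ID available in your Gym works
obs = env.reset()
for _ in range(5):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
env.close()
print('MuJoCo environment OK; observation shape:', obs.shape)
```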
- Collect demonstration data and save it to the `expert_data` directory.

  The expert data should be a Python pickle file with a `.bin` suffix (not `.pkl`). It must carry `batch_size`, `action`, and `states` fields (required by `set_er_stats()`); see `expert_data/hopper_er.bin` for an example. A minimal sketch of producing such a file follows.
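The sketch below only illustrates that format: the field names come from this README, while the `ExpertData` class, array shapes, and file name are hypothetical. In practice the repo pickles its own experience-replay object, so mirror `expert_data/hopper_er.bin` and whatever `set_er_stats()` expects.

```python
# Hypothetical sketch: writing expert demonstrations as a .bin pickle.
# Only the batch_size / action / states fields are documented in this README;
# the shapes, dtypes, and the ExpertData class itself are assumptions.
import pickle
import numpy as np

class ExpertData:
    def __init__(self, states, action, batch_size):
        self.states = states          # (N, obs_dim) observations
        self.action = action          # (N, act_dim) expert actions
        self.batch_size = batch_size  # minibatch size used at training time

N, obs_dim, act_dim = 10000, 11, 3    # Hopper-like sizes (assumed)
data = ExpertData(
    states=np.zeros((N, obs_dim), dtype=np.float32),   # replace with real rollouts
    action=np.zeros((N, act_dim), dtype=np.float32),
    batch_size=64,
)

# Assumes the expert_data/ directory already exists.
with open('expert_data/my_env_er.bin', 'wb') as f:
    pickle.dump(data, f)
```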
- Training

  ```bash
  COUNTER=1
  ENVS+=('Ant-v2')  # bash array of environment IDs; append more entries to sweep several tasks
  for ENV_ID in "${ENVS[@]}"
  do
      # Round-robin jobs over 4 GPUs and launch each run in the background
      CUDA_VISIBLE_DEVICES=$((COUNTER % 4)) python main.py --env_name $ENV_ID --alg mairlImit --obs_mode state &
      COUNTER=$((COUNTER+1))
  done
  ```
- Evaluation

  ```bash
  CUDA_VISIBLE_DEVICES=0 python main.py --env_name Ant-v2 --train_mode False
  ```
Our code is based on itaicaspi/mgail, HumanCompatibleAI/imitation, and huggingface/transformers.
Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model
Jiankai Sun, Lantao Yu, Pinqian Dong, Bo Lu, and Bolei Zhou
In IEEE Robotics and Automation Letters (RA-L), 2021
[Paper] [Project Page]
```bibtex
@ARTICLE{sun2021adversarial,
  author={J. {Sun} and L. {Yu} and P. {Dong} and B. {Lu} and B. {Zhou}},
  journal={IEEE Robotics and Automation Letters},
  title={Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model},
  year={2021},
}
```