We train all the algorithms on NYU Greene HPC.
sbatch scripts/ppo_train.sh ppo_lr05_ent15_cp2 0
For example, the command above trains the PPO algorithm on NYU HPC. The first argument, ppo_lr05_ent15_cp2, identifies the training configuration, including the learning rate. The second argument, 0, is the seed.
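
To repeat a run over several seeds, the same command can be wrapped in a loop. The following is a minimal dry-run sketch, not part of the repository: the seed list `0 1 2` and the helper name `submit_cmds` are assumptions made for illustration (only seed 0 appears above), and the loop prints each command instead of submitting it.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one sbatch command per seed instead of submitting it.
config="ppo_lr05_ent15_cp2"   # configuration name taken from the example above
seeds="0 1 2"                 # hypothetical seed list; only seed 0 is in the source

# Hypothetical helper: emits the submission command for each seed.
submit_cmds() {
  for seed in $seeds; do
    echo "sbatch scripts/ppo_train.sh $config $seed"
  done
}

submit_cmds
```

Once the printed commands look right, dropping the `echo` and the surrounding quotes inside the loop would submit the jobs on a machine where `sbatch` is available.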