Comparison between PPO, SAC and TD3 #83
SuhanNShetty
started this conversation in
General
You can find some answers in the following Reddit post: Isaac Gym with Off-policy Algorithms. The ddpg_td3_sac.zip file contains the code for training DDPG, TD3 and SAC in the NVIDIA Omniverse Isaac Gym Ant environment. Note that, whereas PPO is trained in 4096 parallel environments, DDPG, TD3 and SAC are configured to train in only 64 environments. Call the scripts as follows:

PATH_TO_ISAAC_SIM/python.sh ant_ddpg.py headless=True num_envs=64
PATH_TO_ISAAC_SIM/python.sh ant_td3.py headless=True num_envs=64
PATH_TO_ISAAC_SIM/python.sh ant_sac.py headless=True num_envs=64

Regarding execution time, PPO training (in 4096 parallel environments) runs about 16 times faster than DDPG/TD3/SAC training (in 64 parallel environments). You should see results similar to the following:

[Figure: training results]
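If you prefer to launch the three commands above from one place, a small Python helper could assemble them. This is only a minimal sketch and is not part of the zip file: `build_commands` and `run_all` are hypothetical names, and `PATH_TO_ISAAC_SIM` stays a placeholder that you need to replace with your own installation path.

```python
import subprocess

# Script names taken from the commands above.
SCRIPTS = ["ant_ddpg.py", "ant_td3.py", "ant_sac.py"]


def build_commands(isaac_sim_path, num_envs=64, headless=True):
    """Return one argument list per training script (hypothetical helper)."""
    python_sh = f"{isaac_sim_path}/python.sh"
    return [
        [python_sh, script, f"headless={headless}", f"num_envs={num_envs}"]
        for script in SCRIPTS
    ]


def run_all(isaac_sim_path, num_envs=64):
    """Run the scripts one after another, stopping at the first failure."""
    for cmd in build_commands(isaac_sim_path, num_envs=num_envs):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder path: only prints the commands, it does not start training.
    for cmd in build_commands("PATH_TO_ISAAC_SIM"):
        print(" ".join(cmd))
```

Calling `run_all` with your real Isaac Sim path would start the three trainings sequentially; the `__main__` block only prints the command lines so you can check them first.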
Hi,
First of all, thanks for what looks like a fantastic library for RL. I am looking forward to testing it.
I have a question related to comparing the performance of SAC and TD3 with PPO.
In the examples section (https://skrl.readthedocs.io/en/latest/intro/examples.html), all the examples for the Isaac Gym environments use only PPO as the RL algorithm, together with SequentialTrainer. I am curious to know why only PPO is used there. Why not SAC or TD3? Is there a comparison of how other state-of-the-art algorithms such as SAC or TD3 perform on different complicated tasks in your implementation? Do these off-policy algorithms work as well as PPO (w.r.t. reward and wall-clock time) in your implementation?
One of the main reasons I was interested in using this library is that I could use different RL algorithms (other than PPO) with NVIDIA Isaac Gym environments (unlike other libraries such as rl_games). So I am looking forward to your answer to the above question.
Thanks,
Suhan