Multi_car_racing and Domain Randomization Progress #18

Open · wants to merge 5 commits into base: nmmo

Conversation

@RPegoud commented Feb 29, 2024

Here's the current state of my work on multi_car_racing and Domain Randomization:

Installation:

  • Running the script still requires Docker for now. I also copied the multi_car_racing repository into this repo for convenience (for some reason, installing and importing it as usual didn't work); this will be cleaned up once we're ready to move on to later stages of the project

  • I upgraded PettingZoo to 1.23 and SuperSuit to 3.7.2, since PettingZoo < 1.23 had a typo that broke an import (BaseParallelWraper was later renamed to BaseParallelWrapper)

  • There was a circular import in curriculum_sync_wrapper.py, fixed by importing decorate_all_functions from .utils directly:

# Before (caused the circular import):
from syllabus.core import Curriculum, decorate_all_functions

# After (fixed):
from syllabus.core import Curriculum
from .utils import decorate_all_functions

Script:

  • The task wrapper for multi_car_racing seems to work as expected

  • The curriculum setup runs without errors:

env = MultiCarRacingParallelWrapper(env=env, n_agents=n_agents)
curriculum = DomainRandomization(env.task_space)
curriculum, task_queue, update_queue = make_multiprocessing_curriculum(curriculum)
  • However, I'm still unsure how to update the DR curriculum compared to PLR:
# TODO: adapt to DR (this is the PLR-style "on_demand" update taken from the pistonball_plr script)
if global_cycles % num_steps == 0:
    update = {
        "update_type": "on_demand",
        "metrics": {
            "action_log_dist": logprobs,
            "value": values,
            "next_value": (
                agent.get_value(next_obs)
                if step == num_steps - 1
                else None
            ),
            "rew": rb_rewards[step],
            "masks": torch.Tensor(1 - np.array(list(dones.values()))),
            "tasks": [env.unwrapped.task],
        },
    }
    curriculum.update_curriculum(update)

I'm still solving bugs and progressing through the script. For now, it seems that the end_step variable prevents the loop containing the backward pass from running: end_step is equal to 0, so b_obs = torch.flatten(rb_obs[:end_step], start_dim=0, end_dim=1) is empty and for start in range(0, len(b_obs), batch_size) never iterates.
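
A minimal, self-contained illustration of that failure mode (the buffer shapes below are hypothetical, just to show why the minibatch loop never iterates):

import torch

num_steps, n_agents, obs_dim = 128, 2, 4   # hypothetical rollout buffer shape
batch_size = 32
rb_obs = torch.zeros((num_steps, n_agents, obs_dim))

end_step = 0  # never advanced before the learn phase
b_obs = torch.flatten(rb_obs[:end_step], start_dim=0, end_dim=1)
print(len(b_obs))  # 0
for start in range(0, len(b_obs), batch_size):
    pass  # range(0, 0, batch_size) is empty, so the backward pass never runs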

Could this be due to the fact that the pistonball_plr script is unfinished? (In hindsight, I should've chosen a training loop that wasn't in the experimental folder.)

Review comment on the curriculum setup code:

""" CURRICULUM SETUP """
env = MultiCarRacingParallelWrapper(env=env, n_agents=n_agents)

Owner:
You'll need to wrap the environment in a PettingZooMultiProcessingSyncWrapper and then you should be done setting up Syllabus.
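
A minimal sketch of what that might look like, continuing the snippet above (the import paths and the sync wrapper's keyword arguments here are assumptions on my part, not the confirmed Syllabus API):

# Sketch only: check the Syllabus source for the exact module paths and signature.
from syllabus.core import make_multiprocessing_curriculum, PettingZooMultiProcessingSyncWrapper
from syllabus.curricula import DomainRandomization

env = MultiCarRacingParallelWrapper(env=env, n_agents=n_agents)
curriculum = DomainRandomization(env.task_space)
curriculum, task_queue, update_queue = make_multiprocessing_curriculum(curriculum)

# The sync wrapper is what connects the worker env to the curriculum process:
# it pulls new tasks from task_queue and reports progress through update_queue.
env = PettingZooMultiProcessingSyncWrapper(
    env,
    task_queue,
    update_queue,
    task_space=env.task_space,
)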

"tasks": [env.unwrapped.task],
},
}
curriculum.update_curriculum(update)

Owner:
You don't need any of this curriculum code for Domain Randomization. You do need it for PLR, but I'm working on a new version of PLR that will cut this out, hopefully done in a few days.
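
To make the point concrete: domain randomization samples tasks uniformly and never consumes rollout metrics, so there is nothing for the training loop to push back into the curriculum. A toy illustration of the idea (not the Syllabus implementation):

import random

class ToyDomainRandomization:
    """Uniform task sampling: no per-task scores, nothing to update."""

    def __init__(self, tasks):
        self.tasks = list(tasks)

    def sample(self):
        # Every task is equally likely, regardless of the agent's performance.
        return random.choice(self.tasks)

curriculum = ToyDomainRandomization(range(10))
print([curriculum.sample() for _ in range(3)])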
