Add a characteristic for solvers using action masks and make use of it in rollout #445

Merged
merged 3 commits into from
Dec 16, 2024

Conversation

nhuet
Contributor

@nhuet nhuet commented Nov 29, 2024

  • Use the new characteristic in rollout so that such solvers are aware of the current action mask, by calling their retrieve_applicable_actions() method.
  • Add a get_action_mask() method to domains which, by default, converts the applicable action space into a 0-1 numpy array, provided that the action space of each agent is an EnumerableSpace.
  • Use these new features to simplify how the RayRLlib solver handles action masking:
    • inherit from Maskable
    • no longer require FullObservable from the domain to use action
      masking, as get_action_mask() can be called without the solver knowing
      the current state (and since the actual domain is now used in
      rollout)
    • decide whether to use action masking directly in __init__() so that
      using_applicable_actions() can be overridden properly
    • use common functions for unwrap_obs and wrap_action in the solver and
      the wrapper environment to avoid code duplication
    • use domain.get_action_mask() to convert applicable actions into a mask
      (the method is more efficient as it does not call get_applicable_actions()
      for each action)
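
The idea behind the bullets above can be sketched roughly as follows. This is a toy illustration, not the scikit-decide implementation: `ToyMaskableSolver`, `set_action_mask`, `build_action_mask`, `applicable_actions` and `rollout` are hypothetical names invented for this example, and the domain is a made-up walk on a line of 4 cells. The sketch assumes the action space is a finite, enumerable list, computes the set of applicable actions once per step, and turns it into a 0-1 numpy array that the solver consults before sampling.

```python
import numpy as np

ALL_ACTIONS = ["left", "right"]  # hypothetical enumerable action space


def build_action_mask(all_actions, applicable_actions):
    """Return a 0-1 array with 1 at the index of each applicable action."""
    applicable = set(applicable_actions)  # computed once, not once per action
    return np.array([int(a in applicable) for a in all_actions], dtype=np.int8)


class ToyMaskableSolver:
    """Hypothetical stand-in for a solver that consumes action masks."""

    def __init__(self, n_actions, seed=0):
        self.rng = np.random.default_rng(seed)
        # Until told otherwise, every action is considered allowed.
        self.mask = np.ones(n_actions, dtype=np.int8)

    def set_action_mask(self, mask):
        self.mask = mask

    def sample_action(self):
        # Sample uniformly among the indices whose mask entry is 1.
        allowed = np.flatnonzero(self.mask)
        if allowed.size == 0:
            raise ValueError("no applicable action in the current mask")
        return int(self.rng.choice(allowed))


def applicable_actions(position, size=4):
    """Toy domain rule: the agent cannot step off either end of the line."""
    actions = []
    if position > 0:
        actions.append("left")
    if position < size - 1:
        actions.append("right")
    return actions


def rollout(solver, steps=20):
    """Toy rollout: the domain, not the solver, provides the mask each step."""
    position = 0
    visited = [position]
    for _ in range(steps):
        mask = build_action_mask(ALL_ACTIONS, applicable_actions(position))
        solver.set_action_mask(mask)
        action = ALL_ACTIONS[solver.sample_action()]
        position += 1 if action == "right" else -1
        visited.append(position)
    return visited


visited = rollout(ToyMaskableSolver(len(ALL_ACTIONS)))
```

Because the mask comes from the domain at each step, the toy solver never needs to observe the full state itself, which mirrors the motivation given above for dropping the FullObservable requirement.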

@nhuet nhuet marked this pull request as draft December 10, 2024 16:31
@nhuet nhuet force-pushed the rollout-action-mask branch from f8826b1 to 3c73d1d Compare December 12, 2024 16:24
@nhuet nhuet changed the title Add option in rollout for sample_action kwargs (e.g. action masking) Add a characteristic for solvers using action masks and make use of it in rollout Dec 12, 2024
@nhuet nhuet marked this pull request as ready for review December 12, 2024 16:30
- Use the new characteristic in rollout so that such solvers are aware of
  the current action mask.
- Add a `get_action_mask()` method to domains which, by default, converts
  the applicable action space into a 0-1 numpy array, provided that the
  action space of each agent is an EnumerableSpace.
- inherit from Maskable
- no longer require FullObservable from the domain to use action
  masking, as get_action_mask() can be called without the solver knowing
  the current state (and since the actual domain is now used in
  rollout)
- decide whether to use action masking directly in __init__() so that
  using_applicable_actions() can be overridden properly
- use common functions for unwrap_obs and wrap_action in the solver and
  the wrapper environment to avoid code duplication
- use domain.get_action_mask() to convert applicable actions into a mask
  (the method is more efficient as it does not call get_applicable_actions()
  for each action)
This is more memory efficient for values that are only 0 or 1.
It also seems to be the standard for action masks, at least for ray.rllib,
as shown in the `action_mask_key` documentation at
https://docs.ray.io/en/latest/rllib/rllib-training.html
@nhuet nhuet force-pushed the rollout-action-mask branch from 3c73d1d to 85f6c59 Compare December 13, 2024 12:36
Collaborator

@neo-alex neo-alex left a comment

Great to have proper masking implemented, thank you! LGTM

@neo-alex neo-alex merged commit 491d3a1 into airbus:master Dec 16, 2024
26 of 33 checks passed
@nhuet nhuet deleted the rollout-action-mask branch January 20, 2025 09:48