This repository has been archived by the owner on Nov 4, 2019. It is now read-only.
The current A3C-feedforward policy uses only a 4-frame history to predict the policy, so it may fail when the environment is still partially observable given those 4 frames.
It would probably be nice to implement a policy with GRU/LSTM hidden units [instead of the frame window or along with it].
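To make the request concrete, here is a minimal NumPy sketch of one step of a GRU-based actor-critic policy. It is not code from this repository: the function names (`init_params`, `gru_policy_step`), the parameter layout, and the assumption that `obs` is an already-flattened feature vector (for example the output of a convolutional encoder) are all hypothetical. The point it illustrates is that the hidden state `h` is carried from step to step, so the policy can depend on more than the last 4 frames.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(obs_dim, hidden_dim, n_actions, seed=0):
    """Small random weights. All names here are hypothetical."""
    rng = np.random.default_rng(seed)
    in_dim = obs_dim + hidden_dim
    p = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        p["W_" + gate] = rng.normal(0.0, 0.1, size=(in_dim, hidden_dim))
        p["b_" + gate] = np.zeros(hidden_dim)
    p["W_pi"] = rng.normal(0.0, 0.1, size=(hidden_dim, n_actions))
    p["b_pi"] = np.zeros(n_actions)
    p["W_v"] = rng.normal(0.0, 0.1, size=hidden_dim)
    return p


def gru_policy_step(p, obs, h_prev):
    """One time step: returns (action probabilities, state value, new hidden state)."""
    x = np.concatenate([obs, h_prev])
    z = sigmoid(x @ p["W_z"] + p["b_z"])  # how much of the state to overwrite
    r = sigmoid(x @ p["W_r"] + p["b_r"])  # how much of the old state to expose
    x_reset = np.concatenate([obs, r * h_prev])
    h_cand = np.tanh(x_reset @ p["W_h"] + p["b_h"])
    h = (1.0 - z) * h_prev + z * h_cand

    # Actor head: numerically stable softmax over action logits.
    logits = h @ p["W_pi"] + p["b_pi"]
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()

    # Critic head: scalar state-value estimate.
    value = float(h @ p["W_v"])
    return probs, value, h
```

In a rollout, `h` would be reset to zeros at the start of each episode and passed back in on every step; feeding the same observation twice then generally yields different action probabilities, because the hidden state has changed in between.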
A good proving ground would be a game with a field of view, like Doom (DefendCenter or HealthGathering); just make sure that image preprocessing works fine with it.
Bonus kudos for implementing a soft attention mechanism that actually improves results [or at least does not harm them and gives clues about what the agent looks at]. link link
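For reference, a soft attention step can be sketched in a few lines of NumPy. This is an illustrative additive (tanh-scored) attention over spatial locations, not necessarily the variant the links above describe. The function name `soft_attention`, the weight names `W_f`, `W_q`, `v`, and the assumption that the convolutional feature map has been flattened to shape `(n_locations, feat_dim)` are all hypothetical.

```python
import numpy as np


def soft_attention(features, query, W_f, W_q, v):
    """Additive soft attention over spatial locations (illustrative sketch).

    features: (n_locations, feat_dim) flattened conv feature map
    query:    (query_dim,) e.g. the recurrent hidden state
    Returns the attended context vector and the attention weights.
    """
    # One scalar score per location, conditioned on the query.
    scores = np.tanh(features @ W_f + query @ W_q) @ v  # (n_locations,)

    # Softmax: weights are non-negative and sum to 1.
    exp = np.exp(scores - scores.max())
    weights = exp / exp.sum()

    # Context is a convex combination of the per-location features.
    context = weights @ features  # (feat_dim,)
    return context, weights
```

For a feature map of height `H` and width `W`, `weights.reshape(H, W)` can be upsampled and overlaid on the input frame to see roughly which regions the agent attends to.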