LSTM/GRU agent #9

justheuristic · 2017-01-17T19:19:35Z

The current A3C feedforward policy predicts actions from a 4-frame history only, so it may fail when the environment is still partially observable beyond that window.
It would probably be nice to implement a policy with GRU/LSTM hidden units [instead of the frame window or along with it].
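
To make the idea concrete, here is a minimal sketch of a recurrent actor-critic step built on a GRU cell. It is not the repo's code: the class name `GRUPolicy`, its parameters, and the random initialization are all hypothetical, biases are omitted for brevity, and `obs` stands in for whatever feature vector the existing convolutional layers would produce. It uses only the Python standard library so the recurrence is easy to read.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUPolicy:
    """Hypothetical recurrent policy: GRU cell followed by policy and value heads."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)
        in_dim = obs_dim + hidden_dim
        self.hidden_dim = hidden_dim
        self.W_z = rand_matrix(hidden_dim, in_dim, rng)   # update gate
        self.W_r = rand_matrix(hidden_dim, in_dim, rng)   # reset gate
        self.W_h = rand_matrix(hidden_dim, in_dim, rng)   # candidate state
        self.W_pi = rand_matrix(n_actions, hidden_dim, rng)  # policy head
        self.W_v = rand_matrix(1, hidden_dim, rng)           # value head

    def initial_state(self):
        # The hidden state is reset to zeros at the start of each episode.
        return [0.0] * self.hidden_dim

    def step(self, obs, h_prev):
        x = obs + h_prev  # concatenate observation features and previous state
        z = [sigmoid(a) for a in matvec(self.W_z, x)]
        r = [sigmoid(a) for a in matvec(self.W_r, x)]
        x_reset = obs + [ri * hi for ri, hi in zip(r, h_prev)]
        h_cand = [math.tanh(a) for a in matvec(self.W_h, x_reset)]
        # Blend the previous state with the candidate state.
        h = [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]

        # Softmax over action logits (shifted by the max for numerical stability).
        logits = matvec(self.W_pi, h)
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        value = matvec(self.W_v, h)[0]
        return probs, value, h
```

The key difference from the feedforward agent is that `step` returns the new hidden state, which the caller must feed back in on the next frame and reset at episode boundaries. The same observation can therefore yield different action probabilities depending on what came before.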

A good proving ground would be a game with a field of view, like Doom (DefendCenter or HealthGathering); just make sure that image preprocessing works fine with it.

Bonus kudos for implementing a soft attention mechanism that actually improves results [or at least does not harm them and gives clues about what the agent looks at]. link link
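
For reference, a minimal sketch of soft attention over spatial features, under stated assumptions: the function name `soft_attention` is hypothetical, `features` stands for per-location vectors taken from a convolutional feature map, `query` would typically be derived from the recurrent hidden state, and dot-product scoring is just one simple choice (the linked approaches may score locations differently).

```python
import math


def soft_attention(features, query):
    """Return (context, weights) for a list of location vectors and a query vector.

    features: list of L vectors, each of length D (one per image location)
    query:    vector of length D
    """
    # Score each location by its dot product with the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]

    # Softmax over locations (shifted by the max for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted average of the location vectors.
    dim = len(features[0])
    context = [sum(w * feat[d] for w, feat in zip(weights, features)) for d in range(dim)]
    return context, weights
```

The `weights` sum to one over locations, so reshaping them back to the feature map's height and width gives a heatmap of where the agent is "looking", which is the kind of clue mentioned above even if attention does not improve the score.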

justheuristic added the help wanted label Jan 17, 2017

justheuristic assigned feygina Jan 18, 2017
