Spiky CRMDP Roadmap #30

Open
16 of 20 tasks
jvmncs opened this issue Dec 16, 2018 · 0 comments

jvmncs commented Dec 16, 2018

Main road

  • Create toy environments
  • Refactor for use with Gym API (Gym #32)
  • Improved tooling for hyperparameter tuning (e.g. Ray)
  • Estimate compute costs and finalize logistics
    • First guess for an upper bound: 1 agent × 4 environments × 3 experiments = 12 sets of hyperparameters to tune; 12 sets × ~30 training runs each = 360 runs; 360 runs × 2 hours per run ≈ 720 hours of compute
  • Do experiments (start: January 11)
    • Check if hparams tuned on Solver generalize to Cheater (vice versa too, but less important/rigorous)
  • Investigate corrupt versions of harder environments
    • Maybe bigger / more realistic boat race
    • Maybe a modified Atari env
    • Maybe a modified MuJoCo env
    • Maybe modified BipedalWalker env
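
The first-guess compute bound in the list above can be checked with a few lines of arithmetic. This is only a sketch: the constant names are made up for illustration, and "2 hours" is assumed to mean hours per training run.

```python
# Back-of-the-envelope compute budget, using the numbers from the roadmap.
AGENTS = 1
ENVIRONMENTS = 4
EXPERIMENTS_PER_ENV = 3
RUNS_PER_HPARAM_SET = 30  # "~30 training runs" per set of hyperparameters
HOURS_PER_RUN = 2         # assumption: 2 hours refers to a single run

# Each (agent, environment, experiment) combination gets its own hyperparameters.
hparam_sets = AGENTS * ENVIRONMENTS * EXPERIMENTS_PER_ENV
total_runs = hparam_sets * RUNS_PER_HPARAM_SET
total_hours = total_runs * HOURS_PER_RUN

print(f"{hparam_sets} hparam sets, {total_runs} runs, {total_hours} hours")
```

Since this is an upper bound, running trials in parallel (e.g. with Ray) would reduce wall-clock time but not the total number of compute hours.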

Finish experiments February 15

Deadline February 22

Environments:

  • TomatoWateringCRMDP
  • TransitionBoatRaceCRMDP
  • Toy environments
    • corrupt corners (satisfies our assumptions for guaranteed learnability)
    • corrupt path to goal (does not satisfy assumptions for guaranteed learnability)
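
A minimal sketch of what the "corrupt corners" toy environment could look like, written in plain Python with a Gym-style `reset`/`step` interface. Everything here is hypothetical: the class name `CorruptCornersGrid`, the grid size, the start and goal positions, and the reward values are illustrative assumptions, not the project's actual design.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]


class CorruptCornersGrid:
    """Hypothetical toy gridworld: corner cells report an inflated reward."""

    # Action id -> (row delta, column delta): up, down, left, right.
    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, size: int = 5, corrupt_reward: float = 10.0):
        self.size = size
        self.corrupt_reward = corrupt_reward
        self.start: Position = (0, 1)               # assumed start cell
        self.goal: Position = (size // 2, size // 2)  # assumed goal cell
        last = size - 1
        self.corners = {(0, 0), (0, last), (last, 0), (last, last)}
        self.pos: Position = self.start

    def reset(self) -> Position:
        self.pos = self.start
        return self.pos

    def true_reward(self, pos: Position) -> float:
        # The intended task: reach the goal.
        return 1.0 if pos == self.goal else 0.0

    def observed_reward(self, pos: Position) -> float:
        # Corner cells are the corrupt states: the agent sees a wrong reward.
        if pos in self.corners:
            return self.corrupt_reward
        return self.true_reward(pos)

    def step(self, action: int) -> Tuple[Position, float, bool, Dict[str, float]]:
        d_row, d_col = self.ACTIONS[action]
        row = min(max(self.pos[0] + d_row, 0), self.size - 1)
        col = min(max(self.pos[1] + d_col, 0), self.size - 1)
        self.pos = (row, col)
        done = self.pos == self.goal
        # The true reward is exposed only through info, for evaluation
        # or for an agent that is explicitly allowed to see it.
        info = {"true_reward": self.true_reward(self.pos)}
        return self.pos, self.observed_reward(self.pos), done, info


env = CorruptCornersGrid()
env.reset()
obs, reward, done, info = env.step(2)  # move left into the (0, 0) corner
print(obs, reward, done, info)
```

Keeping the true reward in `info` rather than in the return value means the same environment can serve agents that see only the corrupt signal and agents that are given the true one.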

Experiments per env

  • Baseline (learns corrupt reward)
  • Cheater (learns with access to true reward)
  • Solver (learns intended behavior from corrupt reward)
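
The environment list and the three experiments per environment multiply out to the 12 hyperparameter sets in the compute estimate. A rough sketch of that grid follows; the toy environment names `ToyCorruptCorners` and `ToyCorruptPath` and the `reward_signal` field are placeholders invented for illustration.

```python
from itertools import product

# Two named environments plus the two toy environments (placeholder names).
ENVIRONMENTS = [
    "TomatoWateringCRMDP",
    "TransitionBoatRaceCRMDP",
    "ToyCorruptCorners",
    "ToyCorruptPath",
]

# Which reward signal each experiment trains on, per the list above.
EXPERIMENTS = {
    "Baseline": "corrupt",  # learns the corrupt reward
    "Cheater": "true",      # learns with access to the true reward
    "Solver": "corrupt",    # must recover intended behavior from corrupt reward
}

grid = [
    {"env": env, "experiment": name, "reward_signal": EXPERIMENTS[name]}
    for env, name in product(ENVIRONMENTS, EXPERIMENTS)
]

for config in grid:
    print(config)
```

Each entry would then get its own hyperparameter search.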

Optional

@jvmncs jvmncs added the epic label Dec 16, 2018
@timorl timorl pinned this issue Dec 16, 2018
@jvmncs jvmncs mentioned this issue Dec 16, 2018
@jvmncs jvmncs closed this as completed Dec 31, 2018
@jvmncs jvmncs reopened this Dec 31, 2018