The bandit tests are flaky #5

Open
NivenT opened this issue Aug 3, 2017 · 0 comments
NivenT commented Aug 3, 2017

As a bare minimum for believing a new RL algorithm was implemented correctly, it is tested on the N-armed bandit problem. This environment is about as simple as RL environments get, so every algorithm should be able to "solve" it without issue. That is currently not the case: some algorithms (I think just CrossEntropy) do not consistently pass. More care needs to be taken in choosing hyperparameters here so the tests aren't flaky.
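For illustration, here is a minimal sketch of what a deterministic bandit test could look like. This is hypothetical code, not the repo's actual test harness: an ε-greedy agent on a Bernoulli N-armed bandit, with a fixed RNG seed so the pass/fail outcome is reproducible run to run (`solve_bandit`, `probs`, `steps`, `epsilon`, and `seed` are all names assumed here).

```python
import random

def solve_bandit(probs, steps=2000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli N-armed bandit and
    return the index of the arm it believes is best.

    Hypothetical sketch -- not the repo's actual test code.
    """
    rng = random.Random(seed)          # fixed seed makes the test deterministic
    n = len(probs)
    counts = [0] * n                   # pulls per arm
    values = [0.0] * n                 # running-average reward estimates

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)     # explore: pick a random arm
        else:
            arm = max(range(n), key=lambda a: values[a])  # exploit best estimate
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

    return max(range(n), key=lambda a: values[a])
```

The point of the sketch: with well-separated arms (say one arm at 0.9 and the rest at 0.2 or below), enough steps, and a fixed seed, the agent identifies the best arm deterministically. Flakiness creeps in when the reward gaps are small relative to the number of steps, or when the seed varies between runs — which is presumably where the hyperparameter tuning matters.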
