
Unexpected behavior of eval_every argument for ppo agent #60

Open
david-lindner opened this issue Feb 23, 2019 · 0 comments
Labels
bug Something isn't working
After changing the way episodes are tracked for the ppo agent (#58), the eval_every argument no longer works as expected. For example, if I run:

python main.py -E 10000 -V 50 -EE 10 -C -L runs/way/ppo-cnn/cheat way ppo-cnn -e 10 -r 200 -l 1e-3

I get no evaluations at all. The problem is two-fold:

  1. -EE is compared against history["episodes"], which for the ppo agent is now incremented by the number of rollouts (-r) per episode rather than by 1. In this case I run 10000 'episodes', but history["episodes"] increases by 200 for each of them, so to get the desired behavior I have to set -EE 2000. This is very unexpected and inconsistent with the other agents.

  2. Even when setting -EE 2000, I still get no evaluations. This seems to be caused by the check that is performed to determine whether we should evaluate:

if history["episode"] % args.eval_every == args.eval_every - 1:
    eval_next = True

This no longer works because history["episode"] does not increase in increments of 1 between these checks. Instead I see the following:

...
history["episode"] = 1800     history["episode"] % args.eval_every = 1800
history["episode"] = 2000     history["episode"] % args.eval_every = 0
history["episode"] = 2200     history["episode"] % args.eval_every = 200
...

and eval_next is always False. As a quick fix, I changed the check to the following:

if history["episode"] % args.eval_every == 0 and history["episode"] > 0:
    eval_next = True

This works for the experiments I want to run. However, I am not sure whether it is a good solution in general, and we should discuss what to do here.
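To make the failure mode concrete, here is a minimal, self-contained sketch of the two checks. The loop and helper names are assumptions for illustration, not the project's actual main.py; only the two modulo conditions mirror the code quoted above, with the counter stepping by 200 per episode as in the -r 200 run.

```python
ROLLOUTS = 200     # -r 200: history["episode"] advances by 200 each iteration
EVAL_EVERY = 2000  # the adjusted -EE value discussed above

def old_check(episode, eval_every):
    # Original check: fires only when episode % eval_every == eval_every - 1,
    # a residue a counter stepping in multiples of 200 never produces.
    return episode % eval_every == eval_every - 1

def new_check(episode, eval_every):
    # Quick fix from this issue: fire on exact nonzero multiples of eval_every.
    return episode % eval_every == 0 and episode > 0

episodes = range(0, 10001, ROLLOUTS)
old_hits = [e for e in episodes if old_check(e, EVAL_EVERY)]
new_hits = [e for e in episodes if new_check(e, EVAL_EVERY)]

print(old_hits)  # -> []: the original check never triggers
print(new_hits)  # -> [2000, 4000, 6000, 8000, 10000]
```

Note that the quick fix only triggers when the counter lands exactly on a multiple of eval_every; if the per-iteration increment did not divide eval_every evenly, evaluations could still be skipped, which is part of why this may not be a good solution in general.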

@david-lindner david-lindner added the bug Something isn't working label Feb 23, 2019
david-lindner added a commit that referenced this issue Feb 23, 2019
jvmncs added a commit that referenced this issue Feb 23, 2019
* fixing evaluation check (cf. #60)

* make LR required

* add boat transition env to parsing code, clean the file

* moving ppo crmdp to proper place

* fixing shapes

* assume pip > 1.18.1, rm dependency_links
@jvmncs jvmncs closed this as completed Jun 2, 2019
@jvmncs jvmncs reopened this Jun 2, 2019