
Support for MiniWorld (3D indoor environment)? #13

Open
maximecb opened this issue Nov 2, 2018 · 36 comments

@maximecb
Contributor

maximecb commented Nov 2, 2018

Hi Lucas,

I've been working on my 3D indoor environment. It's still very basic, but it works, and I just made the repository public: https://github.com/maximecb/gym-miniworld

I've tried to adjust your pytorch-a2c-ppo code to work with MiniWorld, but ran into issues. One is that MiniWorld produces observations which are not dictionary-based, and it was awkward to support this (the obs are just 60x80x3 RGB arrays). The other is that I only got something like 16 frames per second while training with 16 processes, and I have no idea why.

Would you have time to take a look? It would be great if it could work with your RL code out of the box. Right now I'm using my own fork of ikostrikov's, but there are multiple other issues with that code, one of which is that the performance when visualizing trained agents doesn't match the performance reported while training.

@lcswillems
Owner

Hi Maxime,

This is a really great project!!!

For the first issue: does it still occur now that I have modified the network to adapt automatically to the image size?

Thank you for the issues. I will try your code as soon as possible.

@maximecb
Contributor Author

maximecb commented Nov 4, 2018

Thanks Lucas. I will keep improving upon it :)

I tried the following command for training:

python3 -m scripts.train --algo ppo --env MiniWorld-Hallway-v0 --procs 1 --no-instr --no-mem --save-interval 10

It produces the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/maximecb/Desktop/pytorch-a2c-ppo/scripts/train.py", line 111, in <module>
    preprocess_obss = utils.ObssPreprocessor(model_dir, envs[0].observation_space)
  File "/home/maximecb/Desktop/pytorch-a2c-ppo/utils/format.py", line 40, in __init__
    "image": obs_space.spaces['image'].shape,
AttributeError: 'Box' object has no attribute 'spaces'

This happens because the code expects a dictionary-based observation space.

I think it would be good to support both dict and RGB observation spaces. This would allow your RL code to work not just with MiniWorld, but also with the Atari environments.
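A minimal sketch of how both shapes could be handled (the function name and the duck-typing check are assumptions for illustration, not code from either repo):

```python
def get_image_shape(obs_space):
    # MiniGrid wraps the image in a Dict space ({"image": Box(7, 7, 3)}),
    # while MiniWorld and Atari expose a raw Box such as Box(60, 80, 3).
    if hasattr(obs_space, "spaces"):            # gym.spaces.Dict
        return obs_space.spaces["image"].shape
    return obs_space.shape                      # gym.spaces.Box
```

An isinstance check against gym.spaces.Dict would work equally well; hasattr just avoids importing gym in this sketch.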

@lcswillems
Owner

lcswillems commented Nov 6, 2018

I assume that obs_space is a dictionary because MiniGrid's observation space is Dict(image: Box(7, 7, 3)). The easiest fix would be to change MiniGrid's observation space to just Box(7, 7, 3). That way, the observation spaces of MiniGrid, MiniWorld, and Atari would all be Box(...), and I would not need any case analysis.

@maximecb
Contributor Author

maximecb commented Nov 6, 2018

Yes, if I had to do it again I would make the MiniGrid obs space just a numpy array, and pass the mission string through the info dict instead. I'm kind of reluctant to do that now though because some number of people are already using MiniGrid, and this change could break their code.

@lcswillems
Owner

I see what you mean. So you will never make changes that are not backward compatible?

Are there instructions in MiniWorld?

@maximecb
Contributor Author

maximecb commented Nov 6, 2018

I will try to avoid breaking changes if I can.

Right now there are no instructions in MiniWorld, but there will be eventually. When there are instructions I will add them through the info dict instead, it seems like the better way to go with OpenAI Gym.
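One possible shape for that idea, sketched with a hypothetical environment class (the class name, mission string, and stand-in observation are illustrative only):

```python
class InstructionEnvSketch:
    # Hypothetical sketch: the instruction travels in the info dict,
    # so the observation space can stay a plain pixel Box.
    mission = "go to the red box"  # hypothetical instruction string

    def step(self, action):
        obs = [[[0, 0, 0]] * 80] * 60      # stand-in for a 60x80x3 RGB array
        reward, done = 0.0, False
        info = {"mission": self.mission}   # instruction delivered via info
        return obs, reward, done, info
```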

@lcswillems
Owner

lcswillems commented Nov 7, 2018

I would like to write clean code that defines the observation handling based on the kind of environment: if it is a MiniGrid environment, do one thing; if it is a MiniWorld environment, do another; otherwise, fall back to a default.

Do you have an idea of how to do that?

For example, is it possible to get the id of the environment? I can't find how to get it. If so, I could filter using regexes.

@lcswillems lcswillems changed the title Support for MinWorld (3D indoor environment)? Support for MiniWorld (3D indoor environment)? Nov 7, 2018
@maximecb
Contributor Author

maximecb commented Nov 7, 2018

You could check if the observation space is an instance of Box or if it's an instance of Dict

https://github.com/openai/gym/blob/master/gym/spaces/dict_space.py

@lcswillems
Owner

Thank you for the solution. But I think branching on the observation space type is not the right approach; it doesn't make the code easy to understand.

@maximecb
Contributor Author

maximecb commented Nov 7, 2018

The good thing is that if you check based on the observation space, you could handle not just MiniWorld, but also every other environment that produces an RGB pixel array as its observation (e.g. Atari, VizDoom, etc.).

@lcswillems
Owner

What is the FPS you get with the RL code you actually use?

@maximecb
Contributor Author

maximecb commented Nov 8, 2018

1600-2200 FPS on a 3.5 GHz Core i7 with a Titan Xp, using 16 processes.

@lcswillems
Owner

lcswillems commented Nov 8, 2018

Sorry, I meant, what is the FPS you get with the other RL code you use (not mine)? https://github.com/maximecb/gym-miniworld/tree/master/pytorch-a2c-ppo-acktr

@maximecb
Contributor Author

maximecb commented Nov 8, 2018

Yes that's what I meant.

@lcswillems
Owner

lcswillems commented Nov 8, 2018

Sorry, I misread your answer... I thought you wrote 16 FPS... This confused me.

I will try to increase the FPS.

@lcswillems
Owner

Last thing, you used PPO right?

@maximecb
Contributor Author

maximecb commented Nov 8, 2018

Yes, with this command to launch training:

python3 main.py --algo ppo --num-frames 5000000 --num-processes 16 --num-steps 80 --lr 0.00005 --env-name MiniWorld-Hallway-v0

@lcswillems
Owner

Maybe we can continue the discussion on this issue.

Great that you got the graphics working. I will try to come up with a cleaner fix for that.

For the second issue, thankfully, I already have a workaround. You have to set the fork method to 'forkserver':

https://github.com/maximecb/gym-miniworld/blob/master/pytorch-a2c-ppo-acktr/vec_env/subproc_vec_env.py#L2
https://github.com/maximecb/gym-miniworld/blob/master/pytorch-a2c-ppo-acktr/vec_env/subproc_vec_env.py#L39

This changes something in the way resources are shared between subprocesses I believe.

A problem with this modification is that MiniGrid no longer works.
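A sketch of the start-method handling with Python's standard multiprocessing module (the helper name is an assumption; using a context object instead of the global set_start_method avoids the "context has already been set" RuntimeError if it is reached more than once):

```python
import multiprocessing as mp

def make_forkserver_context():
    # get_context returns an isolated context object instead of mutating
    # the global start method, so it is safe to call from multiple places.
    return mp.get_context("forkserver")
```

Process, Pipe, etc. can then be taken from the returned context (ctx.Process(...)) rather than from the multiprocessing module itself.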

@maximecb
Contributor Author

I'm not sure what's happening. It seems like you may be sending data across processes that shouldn't be sent, like whole environments, or objects containing pointers? What's the error with MiniGrid?

@lcswillems
Owner

lcswillems commented Nov 13, 2018

Sorry, I forgot to add the error. This is the end of the error:

  File "/mnt/Data/Documents/Education/2017 2018 - M1/Stage/Code/torch-rl/torch_rl/torch_rl/algos/base.py", line 49, in __init__
    self.env = ParallelEnv(envs)
  File "/mnt/Data/Documents/Education/2017 2018 - M1/Stage/Code/torch-rl/torch_rl/torch_rl/utils/penv.py", line 28, in __init__
    set_start_method('forkserver')
  File "/home/lcswillems/miniconda3/lib/python3.6/multiprocessing/context.py", line 242, in set_start_method
    raise RuntimeError('context has already been set')

I am not pasting the whole error because it is very long.

I am investigating this page: pytorch/pytorch#3492

@maximecb
Contributor Author

Ok. Seems like you might want to call set_start_method earlier, and make sure you are calling it only once?

@lcswillems
Owner

But even if I put this at the very beginning of my code, I still get the same error. I don't really understand what is happening.

@maximecb
Contributor Author

I'm trying to make it work again. If it fails with multiple processes, I would say try to debug with one process.

I put the set_start_method at the beginning of the penv.py file, after the import:

from multiprocessing import Process, Pipe, set_start_method
import gym

set_start_method('forkserver')

I also just submitted a PR for a minor issue that showed up after I upgraded to PyTorch 1.0.

@maximecb
Contributor Author

I have it training on the Hallway level with a single process. There's still some issue when the number of processes is greater than 1.

@maximecb
Contributor Author

I got it to work.

The environments need to be created inside the worker processes rather than in the main process. To do this, I pass a make_env function to the processes, rather than the envs themselves. The function needs to be serialized with cloudpickle before it can be passed as an argument to the worker processes.

Also, to solve that error you were getting wrt set_start_method, the code in main.py needs to be wrapped inside a main function which is called as such:

if __name__ == "__main__":
    main()

Otherwise every process tries to run the code in main, and that's why things break down.
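The factory-function idea can be sketched like this (make_env and the dict standing in for an environment are hypothetical; the real fix serializes the factory with cloudpickle and calls gym.make, but the pattern is the same):

```python
def make_env(env_key):
    # Hypothetical factory: the real version would call gym.make(env_key).
    # Only this small function (cloudpickle-serialized in the real fix)
    # is sent to the worker, never a live environment object.
    return {"id": env_key}

def worker(make_fn, env_key):
    # Runs inside the child process: the environment, along with its
    # OpenGL context, is created here rather than in the main process.
    env = make_fn(env_key)
    return env["id"]
```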

@lcswillems
Owner

lcswillems commented Dec 21, 2018

Thank you! I will try to implement this as soon as possible. It would be great if you could do a PR.

@lcswillems
Owner

Hi Maxime,

I have some time again for this project. If you don't have the time for a PR, could you just send me the code that you modified (the pickling and the make_env parts)?

@maximecb
Contributor Author

maximecb commented Apr 7, 2019

Hi Lucas,

Glad to hear you have time for this project.

The code I had modified to work with MiniWorld is here: https://github.com/maximecb/gym-miniworld/tree/maxime-torchrl/torch-rl

There might be some way to do a diff between this and your torch-ac repo to see exactly what's been changed.

@lcswillems
Owner

lcswillems commented Apr 8, 2019

Thank you!

I cloned gym-miniworld and executed the following command (with 1 process):

python3 main.py --algo ppo --num-frames 5000000 --num-processes 1 --num-steps 80 --lr 0.00005 --env-name MiniWorld-Hallway-v0

and got 35 FPS. That seems strange, doesn't it?

I got this message:

Falling back to non-multisampled frame buffer
Falling back to num_samples=8
Falling back to non-multisampled frame buffer

Then, I executed:

python3 -m scripts.train --algo ppo --env MiniWorld-Hallway-v0 --procs 1 --save-interval 10

and also got 35 FPS.

Finally, I checked out your maxime-torchrl branch and executed:
python3 -m scripts.train --algo ppo --env MiniWorld-Hallway-v0 --procs 1 --save-interval 10

and got 100 FPS.

I don't know what to conclude from my experiments. The FPS seems abnormally low.

@maximecb
Contributor Author

maximecb commented Apr 8, 2019

Yes, these frame rates all seem very low, even for one process. The simulator is able to run much faster (you can try ./benchmark.py to see how fast it can run on your machine).

I ran into the same problem the last time I tried it. I wasn't able to figure out what the issue was. I just remember it was way too slow when recurrence was activated. Worth doing some profiling.
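A minimal profiling harness for that, using the standard-library cProfile (the helper name is an assumption; the callable profiled would be one training update):

```python
import cProfile
import pstats

def profile_call(fn, *args, **kwargs):
    # Run fn once under cProfile, print the ten most expensive functions
    # by cumulative time, and return fn's result.
    prof = cProfile.Profile()
    result = prof.runcall(fn, *args, **kwargs)
    pstats.Stats(prof).sort_stats("cumulative").print_stats(10)
    return result
```

Wrapping a single collect-and-update step with this would show whether the time goes to environment stepping, rendering, or the recurrent forward pass.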

@lcswillems
Owner

This is the benchmark I have:

load time: 4917 ms
reset time: 362.6 ms
frame time: 2.5 ms
frame rate: 397.6 FPS

@maximecb
Contributor Author

maximecb commented Apr 9, 2019

It could definitely run much faster on your machine then.

@lcswillems
Owner

If I understand correctly, it is slow because resetting takes a lot of time?

How could I fix that? I don't see the connection with cloudpickle, etc.

@maximecb
Contributor Author

maximecb commented Apr 9, 2019

I don't think there is an issue with the environment because the ikostrikov RL code manages to run at 1000+ FPS on my desktop machine.

Is it possible that your code and the ikostrikov code manage resetting of the environment differently? Maybe your code blocks all processes every time the environment resets and theirs doesn't?

@lcswillems
Owner

On my machine, the ikostrikov RL code is as slow as mine, as written in my previous message. I can't reproduce the situation where their code is faster than mine.

@maximecb
Contributor Author

maximecb commented Apr 9, 2019

With multiple processes or with just 1? I usually train with 16+.
