
Support for MiniWorld (3D indoor environment)? #13

Open
maximecb opened this issue Nov 2, 2018 · 36 comments

@maximecb
Contributor

maximecb commented Nov 2, 2018

Hi Lucas,

I've been working on my 3D indoor environment. It's still very basic, but it works, and I just made the repository public: https://github.com/maximecb/gym-miniworld

I've tried to adjust your pytorch-a2c-ppo code to work with MiniWorld, but ran into issues. One is that MiniWorld produces observations which are not dictionary-based, and it was awkward to support this (the obs are just 60x80x3 RGB arrays). The other is that I only got something like 16 frames per second while training with 16 processes, and I have no idea why.

Would you have time to take a look? It would be great if it could work with your RL code out of the box. Right now I'm using my own fork of ikostrikov's, but there are multiple other issues with that code, one of which is that the performance when visualizing trained agents doesn't match the performance reported while training.

@lcswillems
Owner

Hi Maxime,

This is a really great project!!!

For the first issue: does it still occur now that I have modified the network to adapt automatically to the image size?

Thank you for the issues. I will try your code as soon as possible.

@maximecb
Contributor Author

maximecb commented Nov 4, 2018

Thanks Lucas. I will keep improving upon it :)

I tried the following command for training:

python3 -m scripts.train --algo ppo --env MiniWorld-Hallway-v0 --procs 1 --no-instr --no-mem --save-interval 10

It produces the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/maximecb/Desktop/pytorch-a2c-ppo/scripts/train.py", line 111, in <module>
    preprocess_obss = utils.ObssPreprocessor(model_dir, envs[0].observation_space)
  File "/home/maximecb/Desktop/pytorch-a2c-ppo/utils/format.py", line 40, in __init__
    "image": obs_space.spaces['image'].shape,
AttributeError: 'Box' object has no attribute 'spaces'

This happens because the code expects a dictionary-based observation space.

I think it would be good to support both dict and RGB observation spaces. This would allow your RL code to work not just with MiniWorld, but also with the Atari environments.
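A minimal sketch of how both shapes could be handled (the function name and the duck-typing check are assumptions for illustration, not code from either repo):

```python
def get_image_shape(obs_space):
    # MiniGrid wraps the image in a Dict space ({"image": Box(7, 7, 3)}),
    # while MiniWorld and Atari expose a raw Box such as Box(60, 80, 3).
    if hasattr(obs_space, "spaces"):            # gym.spaces.Dict
        return obs_space.spaces["image"].shape
    return obs_space.shape                      # gym.spaces.Box
```

An isinstance check against gym.spaces.Dict would work equally well; hasattr just avoids importing gym in this sketch.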

@lcswillems
Owner

lcswillems commented Nov 6, 2018

I assume that obs_space is a dictionary because MiniGrid's observation space is Dict(image: Box(7, 7, 3)). The easiest fix would be to change MiniGrid's observation space to just Box(7, 7, 3). That way, the observation spaces of MiniGrid, MiniWorld, and Atari would all be Box(...), and I would not need any case analysis.

@maximecb
Contributor Author

maximecb commented Nov 6, 2018

Yes, if I had to do it again I would make the MiniGrid obs space just a numpy array, and pass the mission string through the info dict instead. I'm kind of reluctant to do that now though because some number of people are already using MiniGrid, and this change could break their code.

@lcswillems
Owner

I see what you mean. So you will never make changes that are not backward compatible?

Are there instructions in MiniWorld?

@maximecb
Contributor Author

maximecb commented Nov 6, 2018

I will try to avoid breaking changes if I can.

Right now there are no instructions in MiniWorld, but there will be eventually. When there are instructions I will add them through the info dict instead, it seems like the better way to go with OpenAI Gym.
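One possible shape for that idea, sketched with a hypothetical environment class (the class name, mission string, and stand-in observation are illustrative only):

```python
class InstructionEnvSketch:
    # Hypothetical sketch: the instruction travels in the info dict,
    # so the observation space can stay a plain pixel Box.
    mission = "go to the red box"  # hypothetical instruction string

    def step(self, action):
        obs = [[[0, 0, 0]] * 80] * 60      # stand-in for a 60x80x3 RGB array
        reward, done = 0.0, False
        info = {"mission": self.mission}   # instruction delivered via info
        return obs, reward, done, info
```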

@lcswillems
Owner

lcswillems commented Nov 7, 2018

I would like to write clean code that defines the observation handling based on the kind of environment: if it is a MiniGrid environment, do one thing; if it is a MiniWorld environment, do another; otherwise, fall back to a default.

Do you have an idea of how to do that?

For example, is it possible to get the id of the environment? I can't find how to get it. If so, I could filter using regexes.

@lcswillems lcswillems changed the title Support for MinWorld (3D indoor environment)? Support for MiniWorld (3D indoor environment)? Nov 7, 2018
@maximecb
Contributor Author

maximecb commented Nov 7, 2018

You could check if the observation space is an instance of Box or if it's an instance of Dict

https://github.com/openai/gym/blob/master/gym/spaces/dict_space.py

@lcswillems
Owner

Thank you for the solution. But I think branching on the observation space type is not the right approach; it doesn't make the code easy to understand.

@maximecb
Contributor Author

maximecb commented Nov 7, 2018

The good thing is that if you check based on the observation space, you could handle not just MiniWorld, but also every other environment that produces an RGB pixel array as its observation (e.g. Atari, VizDoom, etc.).

@lcswillems
Owner

What is the FPS you get with the RL code you actually use?

@maximecb
Contributor Author

maximecb commented Nov 8, 2018

1600-2200 FPS on a 3.5 GHz Core i7 with a Titan Xp, using 16 processes.

@lcswillems
Owner

lcswillems commented Nov 8, 2018

Sorry, I meant, what is the FPS you get with the other RL code you use (not mine)? https://github.com/maximecb/gym-miniworld/tree/master/pytorch-a2c-ppo-acktr

@maximecb
Contributor Author

maximecb commented Nov 8, 2018

Yes that's what I meant.

@lcswillems
Owner

lcswillems commented Nov 8, 2018

Sorry, I misread your answer... I thought you wrote 16 FPS... This confused me.

I will try to increase the FPS.

@lcswillems
Owner

Last thing, you used PPO right?

@maximecb
Contributor Author

maximecb commented Nov 8, 2018

Yes, with this command to launch training:

python3 main.py --algo ppo --num-frames 5000000 --num-processes 16 --num-steps 80 --lr 0.00005 --env-name MiniWorld-Hallway-v0

@lcswillems
Owner

Maybe we can continue the discussion on this issue.

Great that you got the graphics working. I will try to come up with a cleaner fix for that.

For the second issue, thankfully, I already have a workaround. You have to set the fork method to 'forkserver':

https://github.com/maximecb/gym-miniworld/blob/master/pytorch-a2c-ppo-acktr/vec_env/subproc_vec_env.py#L2
https://github.com/maximecb/gym-miniworld/blob/master/pytorch-a2c-ppo-acktr/vec_env/subproc_vec_env.py#L39

This changes something in the way resources are shared between subprocesses I believe.

A problem with this modification is that MiniGrid no longer works.
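A sketch of the start-method handling with Python's standard multiprocessing module (the helper name is an assumption; using a context object instead of the global set_start_method avoids the "context has already been set" RuntimeError if it is reached more than once):

```python
import multiprocessing as mp

def make_forkserver_context():
    # get_context returns an isolated context object instead of mutating
    # the global start method, so it is safe to call from multiple places.
    return mp.get_context("forkserver")
```

Process, Pipe, etc. can then be taken from the returned context (ctx.Process(...)) rather than from the multiprocessing module itself.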

@maximecb
Contributor Author

I'm not sure what's happening. It seems like you may be sending data across processes that shouldn't be sent, like whole environments, or objects containing pointers? What's the error with MiniGrid?

@lcswillems
Owner

lcswillems commented Nov 13, 2018

Sorry, I forgot to add the error. This is the end of the error:

  File "/mnt/Data/Documents/Education/2017 2018 - M1/Stage/Code/torch-rl/torch_rl/torch_rl/algos/base.py", line 49, in __init__
    self.env = ParallelEnv(envs)
  File "/mnt/Data/Documents/Education/2017 2018 - M1/Stage/Code/torch-rl/torch_rl/torch_rl/utils/penv.py", line 28, in __init__
    set_start_method('forkserver')
  File "/home/lcswillems/miniconda3/lib/python3.6/multiprocessing/context.py", line 242, in set_start_method
    raise RuntimeError('context has already been set')

I am not pasting the whole error because it is very long.

I am investigating this page: pytorch/pytorch#3492

@maximecb
Contributor Author

Ok. Seems like you might want to call set_start_method earlier, and make sure you are calling it only once?

@lcswillems
Owner

But even if I put this at the very beginning of my code, I still get the same error. I don't really understand what is happening.

@maximecb
Contributor Author

I'm trying to make it work again. If it fails with multiple processes, I would say try to debug with one process.

I put the set_start_method at the beginning of the penv.py file, after the import:

from multiprocessing import Process, Pipe, set_start_method
import gym

set_start_method('forkserver')

I also just submitted a PR for a minor issue that showed up after I upgraded to PyTorch 1.0.

@maximecb
Contributor Author

I have it training on the Hallway level with a single process. There's still some issue when the number of processes is greater than 1.

@maximecb
Contributor Author

I got it to work.

The environments need to be created inside the worker processes rather than in the main process. To do this, I pass a make_env function to the processes, rather than the envs themselves. The function needs to be serialized with cloudpickle before it can be passed as an argument to the worker processes.

Also, to solve that error you were getting wrt set_start_method, the code in main.py needs to be wrapped inside a main function which is called as such:

if __name__ == "__main__":
    main()

Otherwise every process tries to run the code in main, and that's why things break down.
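The factory-function idea can be sketched like this (make_env and the dict standing in for an environment are hypothetical; the real fix serializes the factory with cloudpickle and calls gym.make, but the pattern is the same):

```python
def make_env(env_key):
    # Hypothetical factory: the real version would call gym.make(env_key).
    # Only this small function (cloudpickle-serialized in the real fix)
    # is sent to the worker, never a live environment object.
    return {"id": env_key}

def worker(make_fn, env_key):
    # Runs inside the child process: the environment, along with its
    # OpenGL context, is created here rather than in the main process.
    env = make_fn(env_key)
    return env["id"]
```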

@lcswillems
Owner

lcswillems commented Dec 21, 2018

Thank you! I will try to implement this as soon as possible. It would be great if you could do a PR.

@lcswillems
Owner

Hi Maxime,

I have some time again for this project. If you don't have the time for a PR, could you just send me the code that you modified (the pickling and the make_env parts)?

@maximecb
Contributor Author

maximecb commented Apr 7, 2019

Hi Lucas,

Glad to hear you have time for this project.

The code I had modified to work with MiniWorld is here: https://github.com/maximecb/gym-miniworld/tree/maxime-torchrl/torch-rl

There might be some way to do a diff between this and your torch-ac repo to see exactly what's been changed.

@lcswillems
Owner

lcswillems commented Apr 8, 2019

Thank you!

I cloned gym-miniworld and executed the following command (with 1 process):

python3 main.py --algo ppo --num-frames 5000000 --num-processes 1 --num-steps 80 --lr 0.00005 --env-name MiniWorld-Hallway-v0

and got 35 FPS. That seems strange, doesn't it?

I got this message:

Falling back to non-multisampled frame buffer
Falling back to num_samples=8
Falling back to non-multisampled frame buffer

Then, I executed:

python3 -m scripts.train --algo ppo --env MiniWorld-Hallway-v0 --procs 1 --save-interval 10

and also got 35 FPS.

Finally, I checked out your maxime-torchrl branch and executed:
python3 -m scripts.train --algo ppo --env MiniWorld-Hallway-v0 --procs 1 --save-interval 10

and got 100 FPS.

I don't know what to conclude from my experiments. The FPS seems abnormally low.

@maximecb
Contributor Author

maximecb commented Apr 8, 2019

Yes, these frame rates all seem very low, even for one process. The simulator is able to run much faster (you can try ./benchmark.py to see how fast it can run on your machine).

I ran into the same problem the last time I tried it. I wasn't able to figure out what the issue was. I just remember it was way too slow when recurrence was activated. Worth doing some profiling.
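A minimal profiling harness for that, using the standard-library cProfile (the helper name is an assumption; the callable profiled would be one training update):

```python
import cProfile
import pstats

def profile_call(fn, *args, **kwargs):
    # Run fn once under cProfile, print the ten most expensive functions
    # by cumulative time, and return fn's result.
    prof = cProfile.Profile()
    result = prof.runcall(fn, *args, **kwargs)
    pstats.Stats(prof).sort_stats("cumulative").print_stats(10)
    return result
```

Wrapping a single collect-and-update step with this would show whether the time goes to environment stepping, rendering, or the recurrent forward pass.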

@lcswillems
Owner

This is the benchmark I have:

load time: 4917 ms
reset time: 362.6 ms
frame time: 2.5 ms
frame rate: 397.6 FPS

@maximecb
Contributor Author

maximecb commented Apr 9, 2019

It could definitely run much faster on your machine then.

@lcswillems
Owner

If I understand correctly, it is slow because resetting takes a lot of time?

How could I fix that? I don't see the connection with cloudpickle, etc.

@maximecb
Contributor Author

maximecb commented Apr 9, 2019

I don't think there is an issue with the environment because the ikostrikov RL code manages to run at 1000+ FPS on my desktop machine.

Is it possible that your code and the ikostrikov code manage resetting of the environment differently? Maybe your code blocks all processes every time the environment resets and theirs doesn't?

@lcswillems
Owner

On my machine, the ikostrikov RL code is as slow as mine, as written in my previous message. I can't reproduce the situation where their code is faster than mine.

@maximecb
Contributor Author

maximecb commented Apr 9, 2019

With multiple processes or with just 1? I usually train with 16+.
