Are there (mobile) manipulation tasks and baselines for those? #946
-
Hi all, I was wondering if OmniGibson / BEHAVIOR-1K has any manipulation tasks yet, and whether there are reinforcement learning or learning-from-demonstrations baselines for them. I also saw that there are efforts to GPU-parallelize OmniGibson environments; what is the progress on that, and when can we expect it to be released (given its importance for RL)? Thanks!
-
Hi, thanks for your interest! BEHAVIOR-1K consists of 1,000 tasks that are all mobile manipulation, i.e. you have to navigate the scene and manipulate objects to complete them successfully. We do not have any baseline policies available yet, but we are working on this - we will be collecting demonstrations for all 1,000 tasks.

Re: the GPU-parallelized environments, we recently introduced CPU-parallelized environments on our development branch (og-develop) that you can try out if you wish (they take you from ~20 fps to ~300 fps total). We are actively working on enabling GPU mode to fully benefit from Isaac Sim's GPU parallelism. We hope to include the full CPU/GPU parallel-envs feature in our next release, which should happen sometime this summer.

By the way, I was at your talk & panel on ManiSkill at the CVPR EAI workshop and really appreciate the focus on high-speed simulation; as you can see, we too are pushing on that front as much as possible - in fact there's a lot of similarity between ManiSkill's roadmap and our internal one. I'm excited to see the comparison between an off-the-shelf product like Isaac Sim, which has the advantage of more high-quality SWE developers but the disadvantages of closed, unmodifiable source and a lot of Omniverse-related overhead, versus your fully DIY open-source approach in ManiSkill (which we frankly do not have the CUDA skills to replicate). Let us know what your interests / use cases are for OmniGibson / BEHAVIOR-1K - I'd be happy to provide support, and I also think it's good to have open channels of communication.
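To make the CPU-parallel workflow described above more concrete, here is a minimal consumer-side sketch of stepping a batch of environments in parallel. It uses Gymnasium's AsyncVectorEnv with a toy CartPole factory purely as stand-ins; the actual og-develop interface and the way OmniGibson batches its environments may well differ, so treat this as the generic pattern rather than OmniGibson's API.

```python
# Generic CPU-parallel stepping pattern (NOT the og-develop API): the vector
# wrapper owns num_envs sub-environments and steps them as one batch.
import gymnasium as gym


def make_env():
    # Hypothetical factory; swap in the OmniGibson environment constructor
    # from the og-develop branch once the parallel-envs feature is released.
    return gym.make("CartPole-v1")


if __name__ == "__main__":
    num_envs = 16
    envs = gym.vector.AsyncVectorEnv([make_env for _ in range(num_envs)])
    obs, info = envs.reset(seed=0)
    for _ in range(1_000):
        actions = envs.action_space.sample()  # one batch of random actions
        obs, rewards, terms, truncs, infos = envs.step(actions)
        # Finished sub-environments are reset automatically by the wrapper.
    envs.close()
```

The throughput numbers quoted above (~20 fps vs. ~300 fps) would correspond to the aggregate steps/second across all sub-environments in a setup like this.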
-
Thanks for the honesty, and agreed on keeping open channels of communication over GitHub! I wanted to see whether there are RL / learning-from-demonstrations methods on the data BEHAVIOR-1K comes with, to get a better understanding of how machine-learning workflows (focused on manipulation) will work with BEHAVIOR-1K, so I can compare with my lab's current efforts and see what we can learn from your project. I'm aware there are tasks with success conditions, but I wasn't aware there was progress on collecting demonstration data for them. Another purpose of the inquiry was paper writing / comparison tables (to make sure I talk about BEHAVIOR-1K fairly in terms of what it has and does not have, and report correct numbers).

By the way, is the planned demo data human-collected, or a mix of human-collected + machine-generated (e.g. heuristic approaches like MimicGen, or learning from demos + policy rollouts)?

Thanks for the pointer to the CPU parallelization. I previously had some concerns about speed, as <60 FPS sounded a bit too slow for RL (it also seems ray tracing is the only visual data mode? but I last checked this with the demo scripts a few months ago). I can take a look at the og-develop branch / quote the ~300 fps numbers.

Regarding other use cases / interests, I will detail a few points below off the top of my head:
-
Hey, sorry for the delayed response.

Re: demo collection, we are making progress on human demos and heuristic-based ones at the same time. We do want to have at least one human demonstration per task to prove the tasks are feasible, but beyond that most of the demonstrations will presumably come from heuristics (skill primitives built using motion planning; a toy sketch of the idea follows this reply).

Re: the parallelization, note that the ~300 fps number is still without rendering. Until now we were unable to render multiple environments in parallel and were thus doing RL mainly with low-level observations. However, Isaac Sim's most recent release supports tiled rendering of all envs at once, which we will be supporting soon.

Re: the assets, agreed - this is where we have spent a lot of the time on the project, and we would be happy to share. The license situation is not with Omniverse but with TurboSquid: their interpretation of the law is that OmniGibson is like a video game and we're OK to ship assets with it, whereas SAPIEN is a different video game, so it would need to repurchase the same objects. That said, we could provide you with the IDs of all the objects we used for BEHAVIOR-1K so that you could repurchase them from TurboSquid; that would give you a license for SAPIEN, and you could then directly use the BEHAVIOR-1K assets.

Re: the roadmap, we just talked about your roadmap at our group meeting today - we plan to put ours up soon on our documentation, and I will let you know. Agreed that we should talk about things we could split up so that we can cross-benefit from each other's efforts instead of spending our limited research workforce & $$$ on the same things all over again.

Re: robot spawning, we do it based on the BDDL task definitions. A task starts with, for example, inroom(robot, kitchen); we then find the kitchen floor object (the floors are segmented by room), do rejection sampling to find a suitable spot the size of the robot bbox, and place the robot there (see the second sketch below). The assumption in B1K tasks is that the robot should be able to start from any configuration and take care of navigating to the workspace. For end-to-end RL this indeed is not a great idea, since you'll spend a lot of time navigating and little time handling the object.

Tongzhou was here today giving a talk and I just spoke to him too - we could definitely benefit from some conversations around the similar roadmaps, how to avoid duplicate efforts, and how to focus on complementary things and reuse. I'll get our roadmap posted and get the ball rolling.
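First, a toy sketch of what "heuristic demos from skill primitives" could look like in code, purely as an illustration of chaining primitives into a demonstration. None of the class names, primitives, or the stubbed planner below come from the actual BEHAVIOR-1K pipeline; they are hypothetical placeholders.

```python
# Hypothetical sketch: a demo is a sequence of skill-primitive steps, each
# backed by a motion-planner trajectory. All names here are illustrative.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Step:
    primitive: str                 # e.g. "navigate_to", "grasp", "place"
    target: str                    # object or location the primitive acts on
    joint_trajectory: List[Tuple]  # waypoints from a motion planner


@dataclass
class Demo:
    task: str
    steps: List[Step] = field(default_factory=list)


def plan(primitive: str, target: str) -> List[Tuple]:
    # Placeholder for a motion-planning call (e.g. sampling-based planning + IK).
    return [(0.0,), (1.0,)]


def generate_demo(task: str, skeleton: List[Tuple[str, str]]) -> Demo:
    """Roll out a hand-written sequence of primitives into one demonstration."""
    demo = Demo(task=task)
    for primitive, target in skeleton:
        demo.steps.append(Step(primitive, target, plan(primitive, target)))
    return demo


if __name__ == "__main__":
    demo = generate_demo(
        task="putting_away_groceries",
        skeleton=[
            ("navigate_to", "counter"),
            ("grasp", "cereal_box"),
            ("navigate_to", "cabinet"),
            ("place", "cabinet_shelf"),
        ],
    )
    print(f"{demo.task}: {len(demo.steps)} primitive steps")
```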
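Second, a small sketch of the rejection-sampling spawn procedure described in the last paragraph: find the floor of the room named in the BDDL inroom(robot, &lt;room&gt;) condition, then sample 2D poses until one fits the robot's bounding box. The Scene/Floor methods used here (floor_for_room, aabb_2d, fits_robot_bbox) are hypothetical stand-ins, not OmniGibson APIs; the sketch only illustrates the find-floor / sample / reject loop.

```python
# Hedged sketch of robot spawning from a BDDL inroom(robot, room) condition.
# Interfaces are placeholders; only the rejection-sampling structure matters.
import math
import random


def sample_robot_spawn(scene, room_name, robot_bbox_xy, max_tries=1000):
    floor = scene.floor_for_room(room_name)       # floors are segmented by room
    x_min, y_min, x_max, y_max = floor.aabb_2d()  # 2D extent of that room's floor
    for _ in range(max_tries):
        x = random.uniform(x_min, x_max)
        y = random.uniform(y_min, y_max)
        yaw = random.uniform(-math.pi, math.pi)
        # Reject poses where the robot's bbox would leave the floor or
        # overlap walls / furniture.
        if scene.fits_robot_bbox((x, y, yaw), robot_bbox_xy, floor):
            return x, y, yaw
    raise RuntimeError(f"no collision-free spawn found in room '{room_name}'")
```

In a real pipeline the fits_robot_bbox check would query the physics scene for collisions; here it is just a named predicate to keep the loop readable.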