2403.08321.md

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation

Language-conditioned robotic manipulation in unstructured environments is in high demand for general-purpose intelligent robots. Conventional robotic manipulation methods usually learn a semantic representation of the observation for action prediction, which ignores the scene-level spatiotemporal dynamics needed to complete human goals. In this paper, we propose ManiGaussian, a dynamic Gaussian Splatting method for multi-task robotic manipulation that mines scene dynamics via future scene reconstruction. Specifically, we first formulate a dynamic Gaussian Splatting framework that infers semantic propagation in the Gaussian embedding space, where the semantic representation is leveraged to predict the optimal robot action. Then, we build a Gaussian world model to parameterize the distribution in our dynamic Gaussian Splatting framework, which provides informative supervision in the interactive environment via future scene reconstruction. We evaluate ManiGaussian on 10 RLBench tasks with 166 variations, and the results demonstrate that our framework outperforms state-of-the-art methods by 13.1% in average success rate.
