2412.16346.md

SOUS VIDE: Cooking Visual Drone Navigation Policies in a Gaussian Splatting Vacuum

We propose a new simulator, training approach, and policy architecture, collectively called SOUS VIDE, for end-to-end visual drone navigation. Our trained policies exhibit zero-shot sim-to-real transfer with robust real-world performance using only on-board perception and computation. Our simulator, called FiGS, couples a computationally simple drone dynamics model with a high-visual-fidelity Gaussian Splatting scene reconstruction. FiGS can quickly simulate drone flights, producing photorealistic images at up to 130 fps. We use FiGS to collect 100k-300k observation-action pairs from an expert MPC with privileged state and dynamics information, randomized over dynamics parameters and spatial disturbances. We then distill this expert MPC into an end-to-end visuomotor policy with a lightweight neural architecture, called SV-Net. SV-Net processes color image, optical flow, and IMU data streams into low-level body rate and thrust commands at 20 Hz onboard a drone. Crucially, SV-Net includes a Rapid Motor Adaptation (RMA) module that adapts at runtime to variations in drone dynamics. In a campaign of 105 hardware experiments, we show SOUS VIDE policies to be robust to 30% mass variations, 40 m/s wind gusts, 60% changes in ambient brightness, shifting or removing objects from the scene, and people moving aggressively through the drone's visual field. Code, data, and experiment videos can be found on our project page.
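
The data-collection and distillation steps described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the authors' code: a 1D altitude model stands in for the drone dynamics, a PD law with gravity compensation stands in for the expert MPC (it is not an MPC), a least-squares linear fit stands in for SV-Net, and the names `sample_mass`, `expert_thrust`, `collect_rollouts`, and `fit_linear_policy` are hypothetical. Only the 30% mass range and the 20 Hz rate are taken from the text.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2


def sample_mass(rng, nominal=1.0, spread=0.3):
    """Draw a mass uniformly within +/- spread of nominal (assumed range)."""
    return nominal * rng.uniform(1.0 - spread, 1.0 + spread)


def expert_thrust(z, vz, z_ref, mass, kp=4.0, kd=3.0):
    """Privileged expert: PD law with gravity compensation using the true mass.

    A simple stand-in for the expert MPC; the gains are illustrative.
    """
    return mass * (G + kp * (z_ref - z) - kd * vz)


def collect_rollouts(n_rollouts=50, steps=100, dt=0.05, seed=0):
    """Roll out the expert under randomized mass and initial altitude.

    dt=0.05 s corresponds to a 20 Hz command rate. The recorded observation
    deliberately excludes the mass, which only the expert can see.
    """
    rng = np.random.default_rng(seed)
    obs, act = [], []
    for _ in range(n_rollouts):
        mass = sample_mass(rng)          # randomized dynamics parameter
        z = rng.uniform(-0.5, 0.5)       # randomized initial altitude
        vz, z_ref = 0.0, 1.0
        for _ in range(steps):
            u = expert_thrust(z, vz, z_ref, mass)
            obs.append([z_ref - z, vz, 1.0])  # error, velocity, bias term
            act.append(u)
            az = u / mass - G            # point-mass vertical dynamics
            vz += az * dt
            z += vz * dt
    return np.asarray(obs), np.asarray(act)


def fit_linear_policy(obs, act):
    """Behavior cloning by least squares: find w with obs @ w ~= act."""
    weights, *_ = np.linalg.lstsq(obs, act, rcond=None)
    return weights


obs, act = collect_rollouts()
weights = fit_linear_policy(obs, act)
```

Because the student never observes the mass, the fitted policy can only match the expert on average over the randomized range. That gap is what an adaptation module such as RMA is meant to close by inferring dynamics variations at runtime.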
