Modeling and manipulating 3D scenes captured from the real world are pivotal to many applications and attract growing research interest. While previous editing works achieve compelling results by manipulating 3D meshes, they typically require accurately reconstructed meshes to perform editing, which limits their applicability to 3D content generation. To address this gap, we introduce a novel single-image-driven 3D scene editing approach based on 3D Gaussian Splatting, enabling intuitive manipulation by directly editing content on a 2D image plane. Our method learns to optimize the 3D Gaussians to align with an edited version of the image rendered from a user-specified viewpoint of the original scene. To capture long-range object deformation, we introduce a positional loss into the 3D Gaussian Splatting optimization and enable gradient propagation through reparameterization. To handle 3D Gaussians occluded from the specified viewpoint, we build an anchor-based structure and employ a coarse-to-fine optimization strategy that accommodates long-range deformation while maintaining structural stability. Furthermore, we design a novel masking strategy to adaptively identify non-rigid deformation regions for fine-scale modeling. Extensive experiments show that our method effectively handles geometric details, long-range deformation, and non-rigid deformation, demonstrating greater editing flexibility and higher quality than previous approaches.
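To make the core mechanism concrete, the following is a minimal PyTorch sketch (not the authors' implementation) of how an anchor-based reparameterization can combine a positional loss with coarse-to-fine optimization. All names (`anchors`, `offsets`, `gaussian_means`, the tensor shapes, and the learning rates) are illustrative assumptions, not details taken from the paper.

```python
import torch

# Assumed setup: each 3D Gaussian center is reparameterized as anchor + offset,
# so gradients from a per-Gaussian positional loss also reach the shared anchors.
# This lets Gaussians occluded from the edited viewpoint (which receive no
# photometric gradient) move coherently with their visible neighbors.
num_anchors, per_anchor = 1024, 8
anchors = torch.randn(num_anchors, 3, requires_grad=True)               # coarse handles
offsets = torch.zeros(num_anchors, per_anchor, 3, requires_grad=True)  # fine detail

def gaussian_means() -> torch.Tensor:
    # Reparameterized Gaussian centers: (num_anchors * per_anchor, 3).
    return (anchors.unsqueeze(1) + offsets).reshape(-1, 3)

def positional_loss(means: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # L2 penalty pulling Gaussian centers toward target positions derived from
    # the edited 2D image; targets are detached to act as fixed supervision.
    return ((means - targets.detach()) ** 2).sum(dim=-1).mean()

# Coarse stage: optimize anchors only, capturing long-range deformation
# while the anchor structure keeps the scene stable.
coarse_opt = torch.optim.Adam([anchors], lr=1e-2)
# Fine stage: unfreeze per-Gaussian offsets to model non-rigid, fine-scale
# deformation, restricted in practice to the adaptively masked regions.
fine_opt = torch.optim.Adam([anchors, offsets], lr=1e-3)
```

Under this reading, the coarse stage moves whole anchor groups so that gradient signal propagates to occluded Gaussians, and the fine stage refines individual offsets only where the masking strategy flags non-rigid deformation.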