Recent advances in Gaussian Splatting have enabled both panoptic and interactive segmentation of 3D scenes. However, existing methods often overlook a critical need: reconstructing specified targets with complex structures from sparse views. To address this issue, we introduce TSGaussian, a novel framework that combines semantic constraints with depth priors to avoid geometry degradation in challenging novel view synthesis tasks. Our approach concentrates computational resources on the designated targets and minimizes those spent on the background. Bounding boxes from YOLOv9 serve as prompts for the Segment Anything Model to generate 2D mask predictions, ensuring semantic accuracy and cost efficiency. TSGaussian clusters 3D Gaussians by introducing a compact identity encoding for each Gaussian ellipsoid and incorporating a 3D spatial consistency regularization. Building on these modules, we propose a pruning strategy that reduces redundancy among the 3D Gaussians. Extensive experiments demonstrate that TSGaussian outperforms state-of-the-art methods on three standard datasets and on a new, challenging dataset that we collected, achieving superior results in novel view synthesis of specific objects.
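As an illustration only, the idea of pruning Gaussians by their identity encoding can be sketched as follows. The abstract does not state the pruning criterion, so everything here is an assumption: the function name `prune_to_targets`, the per-Gaussian class logits derived from the identity encoding, and the confidence threshold `min_prob` are all hypothetical and are not taken from TSGaussian itself. The sketch keeps a Gaussian only if its most likely identity is one of the designated targets and that identity is sufficiently confident.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prune_to_targets(identity_logits, target_ids, min_prob=0.5):
    """Return the indices of the Gaussians to keep.

    identity_logits: one list of class logits per Gaussian (assumed to be
        decoded from each Gaussian's identity encoding).
    target_ids: set of class indices that count as designated targets.
    min_prob: hypothetical confidence threshold for keeping a Gaussian.
    """
    keep = []
    for i, logits in enumerate(identity_logits):
        probs = softmax(logits)
        # Most likely identity for this Gaussian.
        best = max(range(len(probs)), key=probs.__getitem__)
        # Keep only confident Gaussians that belong to a target.
        if best in target_ids and probs[best] >= min_prob:
            keep.append(i)
    return keep


# Toy example: class 0 is background, classes 1 and 2 are targets.
toy_logits = [
    [4.0, 0.0, 0.0],  # confident background
    [0.0, 3.0, 0.0],  # confident target 1
    [0.1, 0.0, 0.2],  # ambiguous, below the threshold
    [0.0, 0.0, 5.0],  # confident target 2
]
kept = prune_to_targets(toy_logits, target_ids={1, 2})
```

In this toy setting, background and ambiguous Gaussians are discarded, which mirrors the stated goal of spending resources on the designated targets rather than on the background.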