Compared with earlier 3D reconstruction methods such as NeRF, recent Generalizable 3D Gaussian Splatting (G-3DGS) methods demonstrate impressive efficiency, even in the sparse-view setting. However, their reconstruction performance relies heavily on accurate multi-view feature matching, which is challenging. In particular, for scenes with many non-overlapping areas between views and numerous similar regions, existing methods match poorly and their reconstruction precision is limited. To address this problem, we develop a strategy that uses a predicted depth confidence map to guide accurate local feature matching. In addition, we propose using the knowledge of existing monocular depth estimation models as a prior to boost depth estimation precision in non-overlapping areas between views. Combining the proposed strategies, we present a novel G-3DGS method named TranSplat, which achieves the best performance on both the RealEstate10K and ACID benchmarks while maintaining competitive speed and exhibiting strong cross-dataset generalization ability.