Digitizing 3D static scenes and 4D dynamic events from multi-view images has long been a challenge in computer vision and graphics. Recently, 3D Gaussian Splatting (3DGS) has emerged as a practical and scalable reconstruction method, gaining popularity due to its impressive reconstruction quality, real-time rendering capabilities, and compatibility with widely used visualization tools. However, the method requires a substantial number of input views to achieve high-quality scene reconstruction, introducing a significant practical bottleneck. This challenge is especially severe when capturing dynamic scenes, where deploying an extensive camera array can be prohibitively costly. In this work, we identify the lack of spatial autocorrelation of splat features as one of the factors contributing to the suboptimal performance of the 3DGS technique in sparse reconstruction settings. To address the issue, we propose an optimization strategy that effectively regularizes splat features by modeling them as the outputs of a corresponding implicit neural field. This results in a consistent enhancement of reconstruction quality across various scenarios. Our approach handles both static and dynamic cases, as demonstrated by extensive testing across different setups and scene complexities.
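To make the core idea concrete, the sketch below shows one way splat features could be modeled as the output of an implicit neural field: a shared MLP, queried at each Gaussian's center, predicts that splat's features, so nearby splats decoded by the same smooth network acquire spatially autocorrelated features. This is a minimal illustration under assumed design choices; the names (`SplatFeatureField`, `positional_encoding`), the NeRF-style Fourier encoding, the MLP width, and the feature dimension are all assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs=6):
    # NeRF-style Fourier features of 3D positions (an assumed choice of encoding).
    freqs = (2.0 ** torch.arange(num_freqs, device=x.device)) * torch.pi
    angles = x[..., None] * freqs                       # (N, 3, F)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)  # (N, 3, 2F)
    return enc.flatten(start_dim=-2)                    # (N, 3 * 2F)

class SplatFeatureField(nn.Module):
    """Hypothetical implicit field mapping a splat center to its features
    (e.g., SH color coefficients and opacity; the dimension is an assumption).

    Because all splats share the network weights, features predicted at nearby
    centers are spatially correlated by construction.
    """
    def __init__(self, feat_dim=32, num_freqs=6, hidden=128):
        super().__init__()
        self.num_freqs = num_freqs
        in_dim = 3 * 2 * num_freqs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, centers):                         # centers: (N, 3)
        return self.mlp(positional_encoding(centers, self.num_freqs))
```

In such a setup, the field would be queried at the current Gaussian centers at every optimization step, the predicted features fed to the rasterizer, and the photometric loss backpropagated into the shared MLP weights, yielding the regularization effect described above.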