3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness for novel view synthesis (NVS). However, a 3DGS model tends to overfit when trained on sparse posed views, which limits how well it generalizes to a broader range of poses. In this paper, we alleviate the overfitting problem by introducing a self-ensembling Gaussian Splatting (SE-GS) approach. We present two Gaussian Splatting models, the Σ-model and the Δ-model. The Σ-model is the primary model and generates the novel-view images at inference time. During training, the Σ-model is guided away from specific local optima by an uncertainty-aware perturbation strategy. We dynamically perturb the Δ-model based on the uncertainties of novel-view renderings across different training steps, which yields diverse temporal models sampled from the Gaussian parameter space without additional training cost. The geometry of the Σ-model is regularized by penalizing discrepancies between the Σ-model and these temporal samples. In this way, SE-GS performs effective and efficient regularization across a large number of Gaussian Splatting models, resulting in a robust ensemble, the Σ-model. Experimental results on the LLFF, Mip-NeRF360, DTU, and MVImgNet datasets show that our approach improves NVS quality with few-shot training views, outperforming existing state-of-the-art methods.
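The three ingredients named in the abstract (rendering uncertainty across training steps, uncertainty-scaled perturbation of the Δ-model, and a discrepancy penalty on the Σ-model) can be illustrated with a toy sketch. This is not the SE-GS implementation. The function names, the `scale` parameter, the use of a per-pixel standard deviation as the uncertainty measure, and the one-parameter-per-pixel mapping are all assumptions made for illustration. Real Gaussian parameters and a real splatting renderer are replaced by plain lists of floats.

```python
import random
from statistics import pstdev


def render_uncertainty(render_buffer):
    """Per-pixel population std. dev. across renderings from different steps.

    Assumption: uncertainty is measured as disagreement between renderings
    of the same view that were saved at different training steps.
    """
    return [pstdev(pixel_values) for pixel_values in zip(*render_buffer)]


def perturb_delta(delta_params, uncertainty, rng, scale=0.1):
    """Add zero-mean Gaussian noise scaled by each entry's uncertainty.

    Assumption: one parameter per pixel, so the uncertainty list lines up
    with the parameter list. `scale` is a hypothetical hyperparameter.
    """
    return [p + scale * u * rng.gauss(0.0, 1.0)
            for p, u in zip(delta_params, uncertainty)]


def consistency_penalty(sigma_render, delta_render):
    """Mean squared discrepancy between two renderings of the same view."""
    n = len(sigma_render)
    return sum((s - d) ** 2 for s, d in zip(sigma_render, delta_render)) / n


# Toy usage: two renderings of a two-pixel view from different steps.
rng = random.Random(0)
buffer = [[0.0, 1.0], [0.0, 3.0]]
unc = render_uncertainty(buffer)              # first pixel agrees, second does not
sample = perturb_delta([0.5, 0.5], unc, rng)  # only the second entry can move
penalty = consistency_penalty([0.5, 0.5], sample)
```

In this sketch, entries whose renderings agree across steps receive zero noise, so the perturbation concentrates on the uncertain entries. The penalty is the term a Σ-model would minimize against each perturbed sample.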