Multi-view image compression is vital for 3D-related applications. To model correlations between views, existing methods typically predict the disparity between two views on a 2D plane. This works well for small disparities, such as those in stereo images, but struggles with the larger disparities caused by significant viewpoint changes. To address this, we propose learning-based multi-view image coding with 3D Gaussian geometric priors (3D-GP-LMVIC). Our method leverages 3D Gaussian Splatting to derive geometric priors of the 3D scene, enabling more accurate disparity estimation across views within the compression model. Additionally, we introduce a depth map compression model to reduce redundancy in geometric information between views. A multi-view sequence ordering method is also proposed to enhance correlations between adjacent views. Experimental results demonstrate that 3D-GP-LMVIC surpasses both traditional and learning-based methods in performance while maintaining fast encoding and decoding speeds.
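To illustrate why a geometric prior helps with disparity across views, the sketch below uses standard pinhole-camera geometry: a pixel with known depth is lifted to 3D, moved into a second camera's frame, and projected again, and the shift between the two pixel positions is the disparity. This is only an illustration under stated assumptions, not the 3D-GP-LMVIC implementation. All function names, the `(fx, fy, cx, cy)` intrinsics tuple, and the example camera values are hypothetical, and the depth is assumed to be given (for example, rendered from a fitted 3D scene representation).

```python
# Minimal sketch: depth-based disparity between two pinhole cameras.
# Assumption: intrinsics are passed as a tuple K = (fx, fy, cx, cy), and
# (R, t) maps points from the source camera frame to the target camera frame.
# None of these names come from the paper; they are illustrative only.


def backproject(u, v, depth, K):
    """Lift pixel (u, v) with known depth to a 3D point in the source camera frame."""
    fx, fy, cx, cy = K
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def transform(point, R, t):
    """Apply the rigid motion X' = R X + t (source frame to target frame)."""
    return tuple(
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )


def project(point, K):
    """Project a 3D point in a camera frame onto that camera's image plane."""
    fx, fy, cx, cy = K
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def disparity(u, v, depth, K_src, K_tgt, R, t):
    """Return the 2D pixel shift of (u, v) when viewed from the target camera."""
    u_tgt, v_tgt = project(transform(backproject(u, v, depth, K_src), R, t), K_tgt)
    return (u_tgt - u, v_tgt - v)


# Example values (hypothetical): identical cameras, target shifted 0.1 units
# along +x, so points move by -0.1 in x when expressed in the target frame.
K = (500.0, 500.0, 320.0, 240.0)
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
BASELINE_SHIFT = (-0.1, 0.0, 0.0)

near = disparity(320.0, 240.0, 2.0, K, K, IDENTITY, BASELINE_SHIFT)
far = disparity(320.0, 240.0, 20.0, K, K, IDENTITY, BASELINE_SHIFT)
print("disparity at depth 2:", near)
print("disparity at depth 20:", far)
```

In this purely translational setup the horizontal shift has magnitude `fx * baseline / depth`, so nearby points move more than distant ones. With a general rotation and translation the same three steps still apply, which is why per-pixel depth plus camera poses can describe large view changes that a 2D search struggles with.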