3D Gaussian Splatting (3DGS) is increasingly popular for 3D reconstruction due to its superior visual quality and rendering speed. However, 3DGS training currently runs on a single GPU, and the memory of that one device limits its ability to handle high-resolution and large-scale 3D reconstruction tasks. We introduce Grendel, a distributed system designed to partition 3DGS parameters and parallelize computation across multiple GPUs. As each Gaussian affects a small, dynamic subset of rendered pixels, Grendel employs sparse all-to-all communication to transfer the necessary Gaussians to pixel partitions and performs dynamic load balancing. Unlike existing 3DGS systems, which train on one camera view at a time, Grendel supports batched training with multiple views. We explore various optimization hyperparameter scaling strategies and find that a simple sqrt(batch size) scaling rule is highly effective. Evaluations on large-scale, high-resolution scenes show that Grendel improves rendering quality by scaling up the number of 3DGS parameters across multiple GPUs. On the Rubble dataset, we achieve a test PSNR of 27.28 by distributing 40.4 million Gaussians across 16 GPUs, compared to a PSNR of 26.28 using 11.2 million Gaussians on a single GPU.
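As a rough illustration of the sqrt(batch size) scaling rule mentioned in the abstract, the sketch below scales a hyperparameter tuned for batch size 1 by the square root of the batch size. The abstract does not say which hyperparameters the rule applies to, so applying it to the learning rate is an assumption here, and the function name `scale_learning_rate` and the base value are hypothetical, not taken from Grendel.

```python
import math


def scale_learning_rate(base_lr: float, batch_size: int) -> float:
    """Scale a learning rate tuned for batch size 1 by sqrt(batch_size).

    Assumption: the sqrt(batch size) rule is applied to the learning rate.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return base_lr * math.sqrt(batch_size)


if __name__ == "__main__":
    base_lr = 1.6e-4  # hypothetical single-view learning rate
    for batch_size in (1, 4, 16):
        print(batch_size, scale_learning_rate(base_lr, batch_size))
```

Under this rule, quadrupling the number of views per batch doubles the scaled value, so the growth is slower than with linear scaling.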