We use 3D Gaussian Splatting (3DGS) as the scene representation and propose GSLoc, a novel test-time camera pose refinement framework. GSLoc improves the localization accuracy of state-of-the-art absolute pose regression and scene coordinate regression methods. The 3DGS model renders high-quality synthetic images and depth maps, which are used to establish 2D-3D correspondences. GSLoc operates directly on RGB images and uses the 3D vision foundation model MASt3R for precise 2D matching, so no feature extractor or descriptor needs to be trained. To improve robustness in challenging outdoor environments, we incorporate an exposure-adaptive module into the 3DGS framework. As a result, GSLoc enables efficient pose refinement given a single RGB query image and a coarse initial pose estimate. Our approach outperforms leading NeRF-based optimization methods in both accuracy and runtime on indoor and outdoor visual localization benchmarks, and achieves state-of-the-art accuracy on two indoor datasets.
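As a rough illustration of how a rendered depth map can turn 2D matches into 2D-3D correspondences, the sketch below lifts matched pixels in the rendered view to 3D world points. This is not the authors' implementation: the function name `backproject_to_world` is hypothetical, and the code assumes a pinhole camera with intrinsics `K`, a depth map storing z-depth along the optical axis, and a 4x4 camera-to-world matrix for the coarse initial pose.

```python
import numpy as np


def backproject_to_world(pixels, depth, K, cam_to_world):
    """Lift (u, v) pixels of a rendered view to 3D world points.

    Assumptions (not from the paper): pinhole intrinsics K (3x3), depth as
    z-depth in the camera frame, cam_to_world as a 4x4 rigid transform.
    Returns the world points for pixels with positive depth, plus the
    boolean mask of which input pixels were kept.
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    u, v = pixels[:, 0], pixels[:, 1]

    # Nearest-neighbour depth lookup, clamped to the image bounds.
    cols = np.clip(np.rint(u).astype(int), 0, depth.shape[1] - 1)
    rows = np.clip(np.rint(v).astype(int), 0, depth.shape[0] - 1)
    z = depth[rows, cols]
    valid = z > 0  # discard pixels without rendered geometry

    # Invert the pinhole projection to get camera-frame coordinates.
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    pts_cam = np.stack([(u - cx) / fx * z, (v - cy) / fy * z, z], axis=1)

    # Move the points from the camera frame into the world frame.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    pts_world = pts_cam @ R.T + t
    return pts_world[valid], valid


# Toy example: a flat wall 2 units in front of a camera placed at x = 1.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
depth = np.full((48, 64), 2.0)
pose = np.eye(4)
pose[0, 3] = 1.0
points, mask = backproject_to_world([[32.0, 24.0], [42.0, 24.0]], depth, K, pose)
```

The resulting 3D points, paired with the matching pixel locations in the query image, are the kind of input a PnP solver with RANSAC (for example OpenCV's `cv2.solvePnPRansac`) expects when estimating the refined pose; whether GSLoc uses that particular solver is an assumption here.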