Mingrui Li*, Shuhong Liu*, Heng Zhou, Guohao Zhu, Na Cheng, Tianchen Deng, Hongyu Wang
(*equal contribution)
Overview
We present SGS-SLAM, the first semantic visual SLAM system based on 3D Gaussian Splatting. It incorporates appearance, geometry, and semantic features through multi-channel optimization, addressing the oversmoothing limitations of neural implicit SLAM systems in high-quality rendering, scene understanding, and object-level geometry. We introduce a unique semantic feature loss that effectively compensates for the shortcomings of traditional depth and color losses in object optimization. Through a semantic-guided keyframe selection strategy, we prevent erroneous reconstructions caused by cumulative errors. Extensive experiments demonstrate that SGS-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, precise semantic segmentation, and object-level geometric accuracy, while ensuring real-time rendering capabilities.
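As a rough illustration of the multi-channel optimization (a minimal PyTorch sketch with placeholder loss weights of our own choosing, not the paper's exact formulation):

import torch
import torch.nn.functional as F

def multi_channel_loss(rgb, depth, sem_logits, gt_rgb, gt_depth, gt_sem,
                       w_color=0.5, w_depth=1.0, w_sem=0.1):
    # Appearance and geometry terms: per-pixel L1 against ground truth.
    # Shapes (illustrative): rgb (H, W, 3), depth (H, W),
    # sem_logits (num_classes, H, W), gt_sem (H, W) int64 labels.
    color_loss = F.l1_loss(rgb, gt_rgb)
    depth_loss = F.l1_loss(depth, gt_depth)
    # Semantic feature loss: per-pixel classification of the rendered
    # semantic channels against the ground-truth label map.
    sem_loss = F.cross_entropy(sem_logits.unsqueeze(0), gt_sem.unsqueeze(0))
    return w_color * color_loss + w_depth * depth_loss + w_sem * sem_loss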
conda create -n sgs-slam python=3.9
conda activate sgs-slam
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 cudatoolkit=11.8 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
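Optionally, verify that PyTorch can see CUDA before running the system:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"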
DATAROOT is ./data by default. If your datasets are stored elsewhere, please change the input_folder path in the scene-specific config files.
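For example, the relevant entry in a scene config might look like this (a sketch; the surrounding dict structure is illustrative, only the input_folder key is prescribed):

config = dict(
    input_folder="/path/to/Replica/room0",  # defaults assume DATAROOT=./data
)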
You can download the Replica dataset with ground-truth semantic masks from this link. The original Replica scenes, created by the Meta research team, are accessible through their official repository. By accessing the dataset via the provided link, you consent to the terms of its license. The ground-truth semantic masks were generated by us following the preprocessing procedure of Semantic-NeRF. Additionally, the camera trajectories in the dataset were captured using iMAP.
Please follow the data downloading procedure on the ScanNet website, and extract color/depth/semantic frames from the .sens file using the following preprocessing step:
python preprocess/scannet/run.py --input_folder [input path] --output_folder [output path] --export_depth_images --export_color_images --export_poses --export_intrinsics --export_seg --label_map_file preprocess/scannet/scannetv2-labels.combined.tsv
Directory structure of ScanNet:
DATAROOT
└── scannet
└── scene0000_00
└── frames
├── color
│ ├── 0.jpg
│ ├── 1.jpg
│ ├── ...
│ └── ...
├── depth
│ ├── 0.png
│ ├── 1.png
│ ├── ...
│ └── ...
├── intrinsic
└── pose
├── 0.txt
├── 1.txt
├── ...
└── ...
Following previous studies, we use the following sequences:
scene0000_00
scene0059_00
scene0106_00
scene0181_00
scene0207_00
Please follow the data downloading and image undistortion procedure on the ScanNet++ website.
Following SplaTAM, we use the following sequences:
8b5caf3398
b20a261fdf
To extract the ground-truth semantic masks, please follow the guidelines in its official repo.
We use the Replica dataset as an example. Similar approaches apply to other datasets as well.
Run the SLAM system:
python scripts/slam.py configs/replica/slam.py
Run the post-optimization after the SLAM system:
python scripts/post_slam_opt.py configs/replica/post_slam_opt.py
Visualize the reconstruction in an online manner:
python viz_scripts/online_recon.py configs/replica/slam.py
Visualize the final reconstructed scenes and manipulate the scene:
python viz_scripts/tk_recon.py configs/replica/slam.py
By default, the system stores the reconstructed scenes in .npz format, which includes both appearance and semantic features. Additionally, we save the final RGB and semantic maps in .ply format for easier visualization. You can view the scenes with any 3DGS viewer, such as SuperSplat. For interactive rendering, as illustrated above, we adopt the Open3D viewer from SplaTAM.
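For a quick sanity check of a saved scene, you can inspect the stored arrays (the file path below is hypothetical; point it at your actual output directory):

import numpy as np

scene = np.load("output/replica/room0/params.npz")  # path is illustrative
print(scene.files)  # lists the stored appearance and semantic arrays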
We use Weights & Biases for logging. To enable it, set the wandb flag to True in the configuration file and specify the wandb_folder path. Make sure the entity setting matches your account. Each scene is associated with a config folder where you must define the input_folder and output paths. Set wandb=False to disable online logging.
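Putting it together, the logging-related entries in a config file look roughly like this (a sketch; values and the surrounding dict structure are assumptions, check the actual files under configs/):

config = dict(
    input_folder="./data/Replica/room0",
    output="./output/replica/room0",
    wandb=True,               # set to False to disable online logging
    wandb_folder="./wandb",   # where W&B stores local run files
    entity="your-username",   # must match your W&B account
)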
Our work is based on SplaTAM, and by using or modifying this work further, you agree to adhere to their terms of usage and include the license file. We extend our sincere gratitude for their outstanding contributions. We would also like to thank the authors of the following repositories for making their code available as open-source:
- Dynamic 3D Gaussians
- 3D Gaussian Splatting
- GradSLAM & ConceptFusion
- Semantic-NeRF
- Nice-SLAM
- vMap
- ESLAM
- Point-SLAM
If you find our work useful, please kindly cite us:
@article{li2024sgs,
  title={Sgs-slam: Semantic gaussian splatting for neural dense slam},
  author={Li, Mingrui and Liu, Shuhong and Zhou, Heng and Zhu, Guohao and Cheng, Na and Deng, Tianchen and Wang, Hongyu},
  journal={arXiv preprint arXiv:2402.03246},
  year={2024}
}