pySLAM is a Visual SLAM pipeline in Python for monocular, stereo and RGBD cameras. It supports many modern local and global features based on Deep Learning.

pySLAM v2.2

Author: Luigi Freda

pySLAM is a Python implementation of a Visual Odometry (VO) pipeline for monocular, stereo and RGBD cameras. It supports many classical and modern local features, and it offers a convenient interface for them. Moreover, it collects other common and useful VO and SLAM tools.

I released the first version of pySLAM (v1) for educational purposes, for a computer vision class I taught. I started developing it for fun as a Python programming exercise, during my free time, taking inspiration from some repos available on the web.

Main Scripts:

  • main_vo.py combines the simplest VO ingredients without performing any image point triangulation or windowed bundle adjustment. At each step $k$, main_vo.py estimates the current camera pose $C_k$ with respect to the previous one $C_{k-1}$. The inter-frame pose estimation returns $[R_{k-1,k},t_{k-1,k}]$ with $\Vert t_{k-1,k} \Vert=1$. With this very basic approach, you need a ground truth in order to recover a correct inter-frame scale $s$ and estimate a valid trajectory by composing $C_k = C_{k-1} [R_{k-1,k}, s t_{k-1,k}]$. This script is a good starting point for understanding the basics of inter-frame feature tracking and camera pose estimation.

  • main_slam.py adds feature tracking along multiple frames, point triangulation, keyframe management and bundle adjustment in order to estimate the camera trajectory up-to-scale and build a map. It's still a VO pipeline but it shows some basic blocks which are necessary to develop a real visual SLAM pipeline.

  • main_feature_matching.py shows how to use the basic feature tracker capabilities (feature detector + feature descriptor + feature matcher) and lets you test the different available local features. Further details here.

  • main_map_viewer.py lets you reload a saved map and visualize it.
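
The scale recovery and pose composition described for main_vo.py can be sketched in a few lines. This is only an illustration under stated assumptions, not pySLAM's actual code: the helper names to_homogeneous and compose_pose are hypothetical, poses are assumed to be 4x4 homogeneous matrices, and the scale $s$ is assumed to be the distance between two consecutive ground-truth positions.

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(t, dtype=float).ravel()
    return T

def compose_pose(C_prev, R_rel, t_unit, gt_prev, gt_curr):
    """Return C_k = C_{k-1} [R_{k-1,k}, s * t_{k-1,k}].

    The scale s is taken from the ground truth as the distance travelled
    between the two frames (an assumption of this sketch).
    """
    step = np.asarray(gt_curr, dtype=float) - np.asarray(gt_prev, dtype=float)
    s = np.linalg.norm(step)
    return C_prev @ to_homogeneous(R_rel, s * np.asarray(t_unit, dtype=float))

# Example: identity start pose, no rotation, unit translation along z,
# and a ground truth that moved 0.5 m between the two frames.
C0 = np.eye(4)
C1 = compose_pose(C0, np.eye(3), [0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.5])
print(C1[:3, 3])
```

Without the ground-truth step length, the unit-norm translation would make every inter-frame motion look equally long, which is why the basic VO script cannot produce a valid trajectory on its own.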

You can use this framework as a baseline to play with local features and VO techniques, and to create your own (proof of concept) VO/SLAM pipeline in Python. When you test it, keep in mind that it is a work in progress: a development framework written in Python, with no pretence of state-of-the-art localization accuracy or real-time performance.

Enjoy it!

Visual Odometry

Feature Matching

SLAM

Feature matching and Visual Odometry


Install

First, clone this repo and its modules by running

$ git clone --recursive https://github.com/luigifreda/pyslam.git
$ cd pyslam 

Then, use the available specific install procedure according to your OS. The provided scripts will create a single python environment that is able to host all the supported local features!

  • Ubuntu =>
  • MacOs =>
  • Windows =>
  • Docker =>

Requirements

  • Python 3.8.10
  • OpenCV >=4.8.1 (see below)
  • PyTorch 2.3.1
  • Tensorflow 2.13.1
  • Kornia 0.7.3
  • Rerun

If you run into trouble or performance issues, check this TROUBLESHOOTING file.


Ubuntu

Follow the instructions reported here for creating a new virtual environment pyslam with venv. The procedure has been tested on Ubuntu 18.04, 20.04, 22.04 and 24.04.

If you prefer conda, run the scripts described in this other file.


MacOS

Follow the instructions in this file. The reported procedure was tested under Sonoma 14.5 and Xcode 15.4.


Docker

If you prefer docker or you have an OS that is not supported yet, you can use rosdocker in one of two ways:

  • use its custom pyslam / pyslam_cuda docker files and follow the instructions here.
  • use one of the suggested docker images (ubuntu*_cuda or ubuntu*), in which you can build and run pyslam.

How to install non-free OpenCV modules

The provided install scripts take care of installing a recent opencv version (>=4.8) with its non-free modules enabled (see for instance install_pip3_packages.sh, which is used with venv under Ubuntu, or the install_opencv_python.sh under mac).

How to check your installed OpenCV version:
$ python3 -c "import cv2; print(cv2.__version__)"

How to check if you have non-free OpenCV module support (no errors imply success):
$ python3 -c "import cv2; detector = cv2.xfeatures2d.SURF_create()"
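
The same check can be wrapped in a small script that reports the result instead of raising an error. This is a minimal sketch: the function name has_nonfree_opencv is hypothetical, and it only probes SURF, as the one-liner above does.

```python
def has_nonfree_opencv():
    """Return True if OpenCV is importable and SURF (a non-free module) can be created."""
    try:
        import cv2
        cv2.xfeatures2d.SURF_create()
        return True
    except Exception:
        # Covers a missing cv2, a build without the contrib modules
        # (no xfeatures2d), and a build without the non-free algorithms.
        return False

print("non-free OpenCV modules available:", has_nonfree_opencv())
```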

Troubleshooting

If you run into issues or errors during the installation process or at run-time, please, check the TROUBLESHOOTING.md file.


Usage

Once you have run the script install_all_venv.sh (follow the instructions above according to your OS), you can open a new terminal and run:

$ . pyenv-activate.sh   #  Activate pyslam python virtual environment. This is just needed once in a new terminal.
$ ./main_vo.py

This will process a default KITTI video (available in the folder videos) by using its corresponding camera calibration file (available in the folder settings), and its groundtruth (available in the same videos folder). You can stop main_vo.py by focusing on the Trajectory window and pressing the key 'Q'. Note: As explained above, the basic script main_vo.py strictly requires a ground truth.

In order to process a different dataset, you need to set the file config.yaml:

  • Select your dataset type in the section DATASET (further details in the section Datasets below). This identifies a corresponding dataset section (e.g. KITTI_DATASET, TUM_DATASET, etc).
  • Select the sensor_type (mono, stereo, rgbd) in the chosen dataset section.
  • Select the camera settings file in the dataset section (further details in the section Camera Settings below).
  • Set the groudtruth_file accordingly (further details in the section Datasets below; see also the files ground_truth.py and convert_groundtruth.py).
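
As a rough illustration of how these settings relate to each other, the sketch below checks an already-parsed configuration dictionary. It is not part of pySLAM: the function check_dataset_config is hypothetical, the key layout (DATASET.type pointing to a dataset section that holds sensor_type and cam_settings) is inferred from the description above, and the settings path is a placeholder.

```python
def check_dataset_config(cfg):
    """Return a list of human-readable problems found in a parsed config dict."""
    dataset_type = cfg.get("DATASET", {}).get("type")
    if dataset_type is None:
        return ["DATASET.type is not set"]
    section = cfg.get(dataset_type)
    if section is None:
        return [f"missing section {dataset_type}"]
    problems = []
    if section.get("sensor_type") not in ("mono", "stereo", "rgbd"):
        problems.append(f"{dataset_type}.sensor_type must be mono, stereo or rgbd")
    if not section.get("cam_settings"):
        problems.append(f"{dataset_type}.cam_settings is not set")
    return problems

# Placeholder values, for illustration only.
good_cfg = {
    "DATASET": {"type": "KITTI_DATASET"},
    "KITTI_DATASET": {"sensor_type": "mono", "cam_settings": "settings/my_camera.yaml"},
}
bad_cfg = {
    "DATASET": {"type": "TUM_DATASET"},
    "TUM_DATASET": {"sensor_type": "lidar"},
}
print(check_dataset_config(good_cfg))
print(check_dataset_config(bad_cfg))
```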

Similarly, you can test main_slam.py by running:

$ . pyenv-activate.sh   #  Activate pyslam python virtual environment. This is just needed once in a new terminal.
$ ./main_slam.py

This will process a default KITTI video (available in the folder videos) by using its corresponding camera calibration file (available in the folder settings). You can stop it by focusing on the opened Figure 1 window and pressing the key 'Q'. Note: Due to information loss in video compression, main_slam.py tracking may perform worse with the available KITTI videos than with the original KITTI image sequences. The available videos are intended to be used for a first quick test. Please download and use the original KITTI image sequences as explained below.

If you just want to test the basic feature tracker capabilities (feature detector + feature descriptor + feature matcher) and get a taste of the different available local features, run

$ . pyenv-activate.sh   #  Activate pyslam python virtual environment. This is just needed once in a new terminal.
$ ./main_feature_matching.py

In any of the above scripts, you can choose any detector/descriptor among ORB, SIFT, SURF, BRISK, AKAZE, SuperPoint, etc. (see the section Supported Local Features below for further information).

Some basic test/example files are available in the subfolder test. In particular, as for feature detection/description, you may want to take a look at test/cv/test_feature_manager.py too.
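
To give an idea of what the matching stage of a feature tracker does, here is a generic brute-force matcher with Lowe's ratio test written with NumPy only. It is a sketch of the general technique, not pySLAM's matcher: the function name match_ratio_test, the Euclidean distance and the 0.7 ratio are assumptions chosen for illustration.

```python
import numpy as np

def match_ratio_test(des1, des2, ratio=0.7):
    """Brute-force match rows of des1 to rows of des2 with Lowe's ratio test.

    des2 must contain at least two descriptors.
    Returns a list of (index_in_des1, index_in_des2) pairs.
    """
    des1 = np.asarray(des1, dtype=float)
    des2 = np.asarray(des2, dtype=float)
    # Pairwise Euclidean distances, shape (len(des1), len(des2)).
    dists = np.linalg.norm(des1[:, None, :] - des2[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        best, second = order[0], order[1]
        # Keep the match only if it is clearly better than the runner-up.
        if row[best] < ratio * row[second]:
            matches.append((i, int(best)))
    return matches

des_a = [[0.0, 0.0], [10.0, 10.0]]
des_b = [[0.0, 0.1], [10.0, 10.1], [5.0, 5.0]]
matches = match_ratio_test(des_a, des_b)
print(matches)
```

Binary descriptors such as ORB are normally compared with the Hamming distance instead; the ratio test itself stays the same.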

Save and reload a map

When you run the script main_slam.py:

  • The current map can be saved into the file map.json by pressing the button Save on the GUI.
  • The saved map can be reloaded and visualized into the GUI by running:
$ . pyenv-activate.sh   #  Activate pyslam python virtual environment. This is just needed once in a new terminal.
$ ./main_map_viewer.py

Relocalization in a loaded map is a WIP.

Trajectory saving

Estimated trajectories can be saved in three different formats: TUM (TUM RGB-D benchmark format), KITTI (KITTI Odometry format), and EuRoC (EuRoC MAV format). To enable trajectory saving, open config.yaml, find the SAVE_TRAJECTORY section, set save_trajectory: True, and select your format_type (tum, kitti, euroc) and the output filename. For instance, for a TUM format output:

SAVE_TRAJECTORY:
  save_trajectory: True
  format_type: tum
  filename: kitti00_trajectory.txt
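
For reference, the sketch below shows what one line looks like in two of these formats. A TUM trajectory line is "timestamp tx ty tz qx qy qz qw", and a KITTI odometry line holds the 12 entries of the top 3x4 block of the pose matrix in row-major order. The helper names tum_line and kitti_line and the number formatting are assumptions of this sketch, not pySLAM's writer.

```python
def tum_line(timestamp, t, q):
    """TUM format: 'timestamp tx ty tz qx qy qz qw' (quaternion with scalar last)."""
    values = [timestamp, *t, *q]
    return " ".join(f"{v:.6f}" for v in values)

def kitti_line(T):
    """KITTI odometry format: the 12 entries of the 3x4 pose [R|t], row-major."""
    return " ".join(f"{T[r][c]:.6e}" for r in range(3) for c in range(4))

# Identity pose at time 0, written in both formats.
identity = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0]]
tum = tum_line(0.0, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0))
kitti = kitti_line(identity)
print(tum)
print(kitti)
```

Note that the KITTI format carries no timestamps, so the line index is what ties a pose to its frame.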

GUI

Some quick information about the non-trivial GUI buttons of main_slam.py:

  • Step: Enter step-by-step mode. Press the button Step a first time to pause. Then, press it again to make the pipeline process a single new frame.
  • Save: Save the map into the file map.json. You can visualize it again by using the script main_map_viewer.py (as explained above).
  • Draw GT: If a groundtruth is loaded (e.g. with KITTI, TUM, EUROC datasets), you can visualize it by pressing this button. The groundtruth trajectory will be visualized and progressively aligned to the estimated trajectory.
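
Aligning a groundtruth trajectory to an estimated one is commonly done with a least-squares similarity transform (Umeyama's method), which also absorbs the unknown scale of a monocular estimate. The sketch below shows that generic technique; whether pySLAM uses exactly this method is not stated here, and the function name align_similarity is hypothetical.

```python
import numpy as np

def align_similarity(src, dst):
    """Least-squares (s, R, t) such that dst ~ s * R @ src + t.

    src, dst: arrays of shape (N, 3) with corresponding positions.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # avoid returning a reflection
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Synthetic check: the "estimate" is the groundtruth scaled by 2,
# rotated by 90 degrees about z, and shifted.
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
est = 2.0 * gt @ R_true.T + np.array([1.0, 2.0, 3.0])
s, R, t = align_similarity(gt, est)
aligned_gt = s * gt @ R.T + t
print(s)
```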

Supported Local Features

At present time, the following feature detectors are supported:

The following feature descriptors are supported:

You can find further information in the file feature_types.py. Some of the local features consist of a joint detector-descriptor. You can start playing with the supported local features by taking a look at test/cv/test_feature_manager.py and main_feature_matching.py.

In both the scripts main_vo.py and main_slam.py, you can create your favourite detector-descriptor configuration and feed it to the function feature_tracker_factory(). Some ready-to-use configurations are already available in the file feature_tracker.configs.py.

The function feature_tracker_factory() can be found in the file feature_tracker.py. Take a look at the file feature_manager.py for further details.

N.B.: you just need a single python environment to be able to work with all the supported local features!


Supported Matchers


Datasets

You can use 5 different types of datasets:

Dataset                                    | type in config.yaml
-------------------------------------------|----------------------
KITTI odometry data set (grayscale, 22 GB) | type: KITTI_DATASET
TUM dataset                                | type: TUM_DATASET
EUROC dataset                              | type: EUROC_DATASET
Video file                                 | type: VIDEO_DATASET
Folder of images                           | type: FOLDER_DATASET

KITTI Datasets

pySLAM code expects the following structure in the KITTI path folder (specified in the section KITTI_DATASET of the file config.yaml):

├── sequences
    ├── 00
    ...
    ├── 21
├── poses
    ├── 00.txt
        ...
    ├── 10.txt

  1. Download the dataset (grayscale images) from http://www.cvlibs.net/datasets/kitti/eval_odometry.php and prepare the KITTI folder as specified above

  2. Select the corresponding calibration settings file (parameter [KITTI_DATASET][cam_settings] in the file config.yaml)
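
Before launching a run, the expected folder structure can be verified with a few lines of standard-library Python. This is a sketch only: check_kitti_layout is a hypothetical helper, and the demo builds a throwaway folder rather than touching a real dataset.

```python
import tempfile
from pathlib import Path

def check_kitti_layout(root, sequence="00"):
    """Report which parts of the expected KITTI layout exist under root."""
    root = Path(root)
    return {
        "sequence_dir": (root / "sequences" / sequence).is_dir(),
        "poses_file": (root / "poses" / f"{sequence}.txt").is_file(),
    }

# Demo on a temporary folder that mimics the layout for sequence 00 only.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sequences" / "00").mkdir(parents=True)
    (Path(tmp) / "poses").mkdir()
    (Path(tmp) / "poses" / "00.txt").touch()
    report = check_kitti_layout(tmp, "00")
    missing = check_kitti_layout(tmp, "11")
print(report)
print(missing)
```

In the layout shown above, pose files are listed only for sequences 00 to 10, so a missing poses file for sequences 11 to 21 is expected rather than an error.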

TUM Datasets

pySLAM code expects a file associations.txt in each TUM dataset folder (specified in the section TUM_DATASET: of the file config.yaml).

  1. Download a sequence from http://vision.in.tum.de/data/datasets/rgbd-dataset/download and uncompress it.

  2. Associate RGB images and depth images using the python script associate.py. You can generate your associations.txt file by executing:

$ python associate.py PATH_TO_SEQUENCE/rgb.txt PATH_TO_SEQUENCE/depth.txt > associations.txt

  3. Select the corresponding calibration settings file (parameter TUM_DATASET: cam_settings: in the file config.yaml)
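
The association step pairs each RGB image with the depth image closest in time. A simplified sketch of that idea follows; it works on bare timestamps instead of the rgb.txt and depth.txt files, and the function body and the 0.02 s threshold are assumptions of this sketch rather than a copy of associate.py.

```python
def associate(rgb_stamps, depth_stamps, max_difference=0.02):
    """Greedily pair timestamps whose difference is below max_difference (seconds).

    The closest candidate pairs are accepted first; each timestamp is used at most once.
    """
    candidates = sorted(
        (abs(a - b), a, b)
        for a in rgb_stamps
        for b in depth_stamps
        if abs(a - b) < max_difference
    )
    used_rgb, used_depth, pairs = set(), set(), []
    for _, a, b in candidates:
        if a not in used_rgb and b not in used_depth:
            used_rgb.add(a)
            used_depth.add(b)
            pairs.append((a, b))
    return sorted(pairs)

rgb = [1.00, 1.04, 1.10]
depth = [1.01, 1.05, 1.20]
pairs = associate(rgb, depth)
print(pairs)
```

Frames without a close enough partner (here the RGB frame at 1.10 and the depth frame at 1.20) are simply dropped.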

EuRoC Dataset

  1. Download a sequence (ASL format) from http://projects.asl.ethz.ch/datasets/doku.php?id=kmavvisualinertialdatasets (check this direct link)

  2. Select the corresponding calibration settings file (parameter EUROC_DATASET: cam_settings: in the file config.yaml)


Camera Settings

The folder settings contains the camera settings files which can be used for testing the code. These are the same files used in the framework ORB-SLAM2. You can easily modify one of them to create a new calibration file for your own datasets.

In order to calibrate your camera, you can use the scripts in the folder calibration. In particular:

  1. use the script grab_chessboard_images.py to collect a sequence of images where the chessboard can be detected (set the chessboard size therein, you can use the calibration pattern calib_pattern.pdf in the same folder)
  2. use the script calibrate.py to process the collected images and compute the calibration parameters (set the chessboard size therein)

For further information about the calibration process, you may want to have a look here.

If you want to use your camera, you have to:

  • calibrate it and configure WEBCAM.yaml accordingly
  • record a video (for instance, by using save_video.py in the folder calibration)
  • configure the VIDEO_DATASET section of config.yaml in order to point to your recorded video.

Comparison pySLAM vs ORB-SLAM3

For a comparison of the trajectories estimated by pySLAM and by ORB-SLAM3, see this trajectory comparison notebook.

Note that pySLAM pose estimates are saved online: at each frame, the current pose estimate is saved. On the other hand, ORB-SLAM3 pose estimates are saved at the end of the dataset playback: that means each pose estimate $q$ is refined multiple times by LBA and BA over the multiple window optimizations that cover $q$.


Contributing to pySLAM

If you like pySLAM and would like to contribute to the code base, you can report bugs, leave comments and propose new features through issues and pull requests on GitHub. Feel free to get in touch at luigifreda(at)gmail[dot]com. Thank you!


References

Suggested books:

Suggested material:

Moreover, you may want to have a look at the OpenCV guide or tutorials.


Credits


TODOs

Many improvements and additional features are currently under development:

  • loop closure
  • relocalization
  • map saving/loading
  • modern DL matching algorithms
  • object detection and semantic segmentation
  • 3D dense reconstruction
  • unified install procedure (single branch) for all OSs
  • trajectory saving
