Coupled Iterative Refinement for 6D Multi-Object Pose Estimation

This repository contains the source code for our paper:

Coupled Iterative Refinement for 6D Multi-Object Pose Estimation
CVPR 2022
Lahav Lipson, Zachary Teed, Ankit Goyal and Jia Deng
Poster / Video

@inproceedings{lipson2022coupled,
  title={Coupled Iterative Refinement for 6D Multi-Object Pose Estimation},
  author={Lipson, Lahav and Teed, Zachary and Goyal, Ankit and Deng, Jia},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022},
}

Getting Started

Clone the repo using the --recursive flag

git clone --recursive https://github.com/princeton-vl/coupled-iterative-refinement.git
cd coupled-iterative-refinement

Create a new anaconda environment using the provided .yaml file (requires an NVidia GPU)

conda env create -f environment.yaml
conda activate cir

Demos

To download a sample of YCB-V, T-LESS and LM-O (513Mb), run

chmod ug+x download_sample.sh && ./download_sample.sh

Download pretrained model weights (2.4Gb) by running

chmod ug+x download_model_weights.sh &&  ./download_model_weights.sh

or downloaded from google drive

To demo a trained model on any the bop datasets, run

python demo.py --obj_models ycbv --scene_dir sample_ycbv --load_weights model_weights/refiner/ycbv_rgbd.pth

Replace ycbv with tless or lmo in the above arguments to use a different dataset. sample_ycbv can be downloaded using ./download_sample.sh or replaced with the path to a BOP scene directory. The refiner forward pass take ~1.12s using the default settings. The demo saves images of the pose refinement to qualitative_output/, which can be flipped through using any image viewer.

To demo our RGB-only method, use --rgb_only and replace _rgbd.pth with _rgb.pth

Full Datasets (for training and/or evaluation)

To evaluate/train Coupled Iterative Refinement, you will need to download at least one of the core BOP datasets:

LM (Linemod)
LM-O (Linemod-Occluded)
T-LESS
MVTec ITODD
HB (HomebrewedDB)
YCB-V (YCB-Video)
IC-BIN (Doumanoglou et al.)
TUD-L (TUD Light)

To download and unzip any of the core BOP datasets, we provide a script:

python additional_scripts/download_datasets.py --bop_dataset ycbv --models ycbv

To download the Pascal VOC dataset used for background data-augmentation, run

python additional_scripts/download_datasets.py --augmentation_textures # Downloads VOC2012

By default bop.py will search for the datasets in these locations.

├── local_data
    ├── VOCdevkit
        ├── VOC2012
    ├── bop_datasets
        ├── hb
        ├── icbin
        ├── itodd
        ├── lm
        ├── lmo
        ├── tless
        ├── tudl
        ├── ycbv

Training

Our method is trained separately for each of the BOP datasets. To train the YCB-V refinement model, run

python train.py --dataset ycbv --batch_size 12 --num_inner_loops 10 --num_solver_steps 3

To train a refinement model on other BOP datasets, replace ycbv with one of the following: tless, lmo, hb, tudl, icbin, itodd

The above command runs on two NVIDIA RTX 3090s (48Gb GPU Memory). To use less memory, decrease batch_size, num_inner_loops or num_solver_steps.

Evaluation

To run our end-to-end pipeline to reproduce our results, run

python test.py --save_dir my_evaluation --dataset ycbv --load_weights model_weights/refiner/ycbv_rgbd.pth

This will deposit a .tar file in my_evaluation/. The evaluation is slow, but can be run in parallel for different portions of the test set using --start_index and --num_images, with each run depositing a separate .tar file in my_evaluation/. If using a slurm array job to distribute testing, these arguments will be automatically set.

To evaluate the predictions, run

python additional_scripts/run_bop20_eval.py --dataset ycbv --tar_dir my_evaluation

You can also evaluate a refiner model using random perturbations from the ground-truth pose (i.e. the validation setting) by using --evaluate

python train.py --dataset ycbv --evaluate --batch_size 2 --load_weights model_weights/refiner/ycbv_rgbd.pth

RGB-Only Variant

To use the RGB-Only models, use --rgb_only when running python train.py, python test.py or python demo.py

We provide trained RGB-only refiners for YCB-V, T-LESS and Linemod (Occluded) which can be loaded using --load_weights

Trained Refiner Models

BOP Submission	Input type	Training split	model
YCB-V (YCB-Video)	RGB-D	train_real x 1 + train_synt x 3	ycbv_rgbd.pth
T-LESS	RGB-D	train_pbr x 4 + train_primesense x 1	tless_rgbd.pth
LM-O (Linemod-Occluded)	RGB-D	train_pbr (Linemod)	lmo_rgbd.pth
HB (HomebrewedDB)	RGB-D	train_pbr	hb_rgbd.pth
IC-BIN (Doumanoglou et al.)	RGB-D	train_pbr	icbin_rgbd.pth
ITODD (MVTec ITODD)	RGB-D	train_pbr	itodd_rgbd.pth
TUD-L (TUD Light)	RGB-D	train_pbr x 3 + train_real x 4	tudl_rgbd.pth

YCB-V (YCB-Video)	RGB	train_real x 1 + train_synt x 3	ycbv_rgb.pth
T-LESS	RGB	train_pbr x 4 + train_primesense x 1	tless_rgb.pth
LM-O (Linemod-Occluded)	RGB	train_pbr (Linemod)	lmo_rgb.pth

The code for training the detectors and coarse models is available in the Cosypose Github. The training procedure is outlined at the bottom of Appendix A in their paper.

(Optional) Faster Implementation

We provide a faster CUDA implementation of the correlation sampler. The code will default to this implementation if it is compiled (requires an NVidia GPU to compile, takes ~3 min).

cd pose_models/modules && pip install . && cd - # Compiles corr_sampler

Acknowledgement

This repository makes extensive use of code from the Cosypose Github repository. We thank the authors for open sourcing their implementation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Coupled Iterative Refinement for 6D Multi-Object Pose Estimation

Getting Started

Demos

Full Datasets (for training and/or evaluation)

Training

Evaluation

RGB-Only Variant

Trained Refiner Models

(Optional) Faster Implementation

Acknowledgement

Files

README.md

Latest commit

History

README.md

File metadata and controls

Coupled Iterative Refinement for 6D Multi-Object Pose Estimation

Getting Started

Demos

Full Datasets (for training and/or evaluation)

Training

Evaluation

RGB-Only Variant

Trained Refiner Models

(Optional) Faster Implementation

Acknowledgement