PTQ4SAM: Post-Training Quantization for Segment Anything (CVPR 2024)

Chengtao Lv*, Hong Chen*, Jinyang Guo📧, Yifu Ding, Xianglong Liu

(* denotes equal contribution, 📧 denotes corresponding author.)

Overview

Segment Anything Model (SAM) has achieved impressive performance in many computer vision tasks. However, as a large-scale model, its immense memory and computation costs hinder practical deployment. In this paper, we propose a post-training quantization (PTQ) framework for SAM, namely PTQ4SAM. First, we investigate the inherent bottleneck of SAM quantization, which we attribute to the bimodal distribution of post-Key-Linear activations. We analyze its characteristics from both per-tensor and per-channel perspectives and propose a Bimodal Integration strategy, which uses a mathematically equivalent sign operation to transform the bimodal distribution offline into a normal distribution that is easier to quantize. Second, SAM encompasses diverse attention mechanisms (i.e., self-attention and two-way cross-attention), resulting in substantial variations in the post-Softmax distributions. We therefore introduce Adaptive Granularity Quantization for Softmax, which searches for the optimal power-of-two base and is hardware-friendly.
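
The two ideas in the overview can be illustrated with a small, self-contained Python sketch. It is not the repository's implementation: the function names, the rule of choosing each sign from the channel mean, the log-grid quantizer with base 2**(1/tau), and the squared-error search objective are all assumptions made for illustration. The first part shows why a per-channel sign flip applied to both queries and keys leaves the attention logits unchanged; the second part searches a few power-of-two values of tau for a toy post-Softmax vector.

```python
import math

def logits(q, k):
    """Attention logits q @ k^T for lists of token vectors."""
    return [[sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]

def channel_signs(k):
    """Per-channel sign taken from the channel mean (an assumed selection rule)."""
    means = [sum(col) / len(col) for col in zip(*k)]
    return [-1.0 if m < 0 else 1.0 for m in means]

def flip(x, signs):
    """Multiply each channel by its sign; offline this could be folded into the weights."""
    return [[v * s for v, s in zip(row, signs)] for row in x]

def log_quantize(p, tau, bits):
    """Quantize-dequantize p in (0, 1] on a log grid with base 2**(1/tau) (assumed form)."""
    top = 2 ** bits - 1
    code = top if p <= 0 else min(top, max(0, round(-tau * math.log2(p))))
    return 2.0 ** (-code / tau)

def search_tau(probs, bits, candidates=(1, 2, 4)):
    """Pick the power-of-two tau with the lowest squared error (assumed objective)."""
    def err(tau):
        return sum((p - log_quantize(p, tau, bits)) ** 2 for p in probs)
    return min(candidates, key=err)

# Toy keys: channel 1 sits in the negative mode, channels 0 and 2 in the positive one.
q = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
k = [[4.0, -4.5, 3.5], [3.5, -3.0, 4.5], [4.5, -4.0, 4.0]]
signs = channel_signs(k)
before = logits(q, k)
after = logits(flip(q, signs), flip(k, signs))
# Flipping the same channels in both operands leaves every logit unchanged.
same = all(math.isclose(a, b) for ra, rb in zip(before, after) for a, b in zip(ra, rb))

# Toy post-Softmax row; a finer grid (larger tau) trades range for resolution.
probs = [0.7, 0.2, 0.1]
best_tau = search_tau(probs, bits=4)
```

Because each sign squares to one, the flipped keys all fall into a single mode while the product with the flipped queries is unchanged, which is what makes the transformation safe to apply before quantization.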

Create Environment

🍺🍺🍺 You can refer to environment.sh in the root directory or install step by step.

Install PyTorch

conda create -n ptq4sam python=3.7 -y
conda activate ptq4sam
pip install torch torchvision

Install MMCV

pip install -U openmim
mim install "mmcv-full<2.0.0"

Install other requirements

pip install -r requirements.txt

Compile CUDA operators

cd projects/instance_segment_anything/ops
python setup.py build install
cd ../../..

Install mmdet

cd mmdetection/
python3 setup.py build develop
cd ..

Prepare Dataset and Models

Download the official COCO dataset, place the files in the corresponding folders under datasets/, and organize them as follows:

├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
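
A quick way to confirm the layout before running experiments is a small check like the one below. This is a hypothetical helper, not part of the repository; the function name and the default root folder are assumptions, and the sub-folder names are taken from the tree above.

```python
from pathlib import Path

# Sub-folders taken from the directory tree above.
EXPECTED = ("annotations", "train2017", "val2017", "test2017")

def missing_coco_dirs(root):
    """Return the expected COCO sub-folders that are absent under root/coco."""
    coco = Path(root) / "coco"
    return [name for name in EXPECTED if not (coco / name).is_dir()]

if __name__ == "__main__":
    # "data" is the root shown in the tree; adjust it to match your checkout.
    missing = missing_coco_dirs("data")
    print("missing:", missing or "none")
```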

Download the pretrained weights (SAM and detectors) and put them into the corresponding folders under ckpt/:

sam_b: ViT-B SAM
sam_l: ViT-L SAM
sam_h: ViT-H SAM
faster rcnn: R-50-FPN Faster R-CNN
yolox: YOLOX-l
detr: H-Deformable-DETR
dino: DINO

Usage

To quantize a model, specify the model configuration and the quantization configuration. For example, to perform W6A6 quantization for SAM-L with a YOLOX detector, use the following command:

python ptq4sam/solver/test_quant.py \
--config ./projects/configs/yolox/yolo_l-sam-vit-l.py \
--q_config exp/config66.yaml --quant-encoder

yolo_l-sam-vit-l.py: configuration file for the SAM-L (ViT-L) model with the YOLOX detector.
config66.yaml: configuration file for W6A6 quantization.
--quant-encoder: quantize the encoder of SAM.
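
W6A6 denotes 6-bit weights and 6-bit activations. As a rough illustration of what a 6-bit grid does to a tensor, here is a minimal quantize-dequantize sketch in plain Python. It assumes an asymmetric uniform quantizer over the min/max range; the function name is hypothetical, and the quantizers actually configured by config66.yaml are calibrated and may differ.

```python
def fake_quantize(values, bits=6):
    """Asymmetric uniform quantize-dequantize over the min/max range of values."""
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1                 # 63 steps for 6 bits
    scale = (hi - lo) / levels or 1.0      # fall back to 1.0 for constant input
    # Round each value to the nearest grid point, then map it back to floats.
    return [round((v - lo) / scale) * scale + lo for v in values]

weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
dequantized = fake_quantize(weights, bits=6)
# The rounding error of each value is at most half a step, i.e. (hi - lo) / 63 / 2.
max_error = max(abs(a - b) for a, b in zip(weights, dequantized))
```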

We recommend using a GPU with more than 40 GB of memory for experiments. To visualize the prediction results, specify --show-dir. Bimodal distributions mainly occur in the mask decoder of SAM-B and SAM-L.

Reference

If you find this repo useful for your research, please consider citing the paper.

@inproceedings{lv2024ptq4sam,
  title={PTQ4SAM: Post-Training Quantization for Segment Anything},
  author={Lv, Chengtao and Chen, Hong and Guo, Jinyang and Ding, Yifu and Liu, Xianglong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={15941--15951},
  year={2024}
}

Acknowledgments

The code of PTQ4SAM is based on Prompt-Segment-Anything and QDrop. We thank the authors for their open-source code.
