TSP-Transformer

Abstract

Holistic scene understanding includes semantic segmentation, surface normal estimation, object boundary detection, depth estimation, etc. The key aspect of this problem is to learn representation effectively, as each subtask builds upon not only correlated but also distinct attributes. Inspired by visual-prompt tuning, we propose a Task-Specific Prompts Transformer, dubbed TSP-Transformer, for holistic scene understanding. It features a vanilla transformer in the early stage and tasks-specific prompts transformer encoder in the lateral stage, where tasks-specific prompts are augmented. By doing so, the transformer layer learns the generic information from the shared parts and is endowed with task-specific capacity. First, the tasks-specific prompts serve as induced priors for each task effectively. Moreover, the task-specific prompts can be seen as switches to favor task-specific representation learning for different tasks. Extensive experiments on NYUD-v2 and PASCAL-Context show that our method achieves state-of-the-art performance, validating the effectiveness of our method for holistic scene understanding.

Paper

Setup

Tested with PyTorch 1.11 and CUDA 11.3:

git clone https://github.com/tb2-sy/TSP-Transformer.git
conda create -n tsp python=3.7
conda activate tsp

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

pip install tqdm Pillow easydict pyyaml imageio scikit-image tensorboard
pip install opencv-python==4.5.4.60 setuptools==59.5.0

pip install timm==0.5.4 einops==0.4.1

Dataset

We use the same dataset (PASCAL-Context and NYUD-v2) as InvPT. You can download the data from here. And then extract the datasets by:

tar xfvz NYUDv2.tar.gz
tar xfvz PASCALContext.tar.gz

You need to specify the dataset directory as db_root variable in configs/mypath.py.

Training

Set the config files in ./conifigs, with PASCAL-Context and NYUD-v2 dataset.

# Train the NYUD-v2 dataset
./run_nyud.sh

# Train the PASCAL-Context dataset
./run_pascal.sh

Evaluation

# Evaluation the NYUD-v2 dataset
./infer_nyud.sh

# Evaluation the PASCAL-Context dataset
./infer_pascal.sh

Pre-trained models

Please download the weights of our SOTA results.

Version	Dataset	Download	Segmentation	Human parsing	Saliency	Normals	Boundary
TSP-Transformer (our paper)	PASCAL-Context	google drive	81.48	70.64	84.86	13.69	74.80
TaskPrompter (ICLR 2023)	PASCAL-Context	-	80.89	68.89	84.83	13.72	73.50
InvPT (ECCV 2022)	PASCAL-Context	-	79.03	67.61	84.81	14.15	73.00

Version	Dataset	Download	Segmentation	Depth	Normals	Boundary
TSP-Transformer (our paper)	NYUD-v2	google drive	55.39	0.4961	18.44	77.50
TaskPrompter (ICLR 2023)	NYUD-v2	-	55.30	0.5152	18.47	78.20
InvPT (ECCV 2022)	NYUD-v2	-	53.56	0.5183	19.04	78.10

TODO/Future work

Contact

For any questions related to our paper and implementation, please email [email protected].

Citation

If you find our code or paper helps, please consider citing:

@article{wang2023tsp,
  title={TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding},
  author={Wang, Shuo and Li, Jing and Zhao, Zibo and Lian, Dongze and Huang, Binbin and Wang, Xiaomei and Li, Zhengxin and Gao, Shenghua},
  journal={arXiv preprint arXiv:2311.03427},
  year={2023}
}

Acknowledgements

The code is available under the MIT license and draws from InvPT, ATRC, and MIT-Net, which are also licensed under the MIT license.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
configs		configs
data		data
evaluation		evaluation
losses		losses
models		models
utils		utils
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
infer_nyud.sh		infer_nyud.sh
infer_pascal.sh		infer_pascal.sh
main.py		main.py
run_nyud.sh		run_nyud.sh
run_pascal.sh		run_pascal.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TSP-Transformer

Abstract

Paper

Setup

Dataset

Training

Evaluation

Pre-trained models

TODO/Future work

Contact

Citation

Acknowledgements

About

Releases

Packages

Languages

License

tb2-sy/TSP-Transformer

Folders and files

Latest commit

History

Repository files navigation

TSP-Transformer

Abstract

Paper

Setup

Dataset

Training

Evaluation

Pre-trained models

TODO/Future work

Contact

Citation

Acknowledgements

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages