This is the official repository for our paper C²DA: Contrastive and Context-Aware Learning for Domain Adaptive Semantic Segmentation.
Unsupervised domain adaptive semantic segmentation (UDA-SS) aims to train a model on source domain data (e.g., synthetic images) and adapt it to predict on target domain data (e.g., real-world images) without access to target annotations. Most existing UDA-SS methods focus only on inter-domain knowledge to mitigate the data-shift problem. However, they ignore the inherent structure of the images and the intrinsic pixel distribution of both domains, which prevents them from reaching the performance of supervised learning. Moreover, contextual knowledge is also often overlooked. To address these issues, we propose a UDA-SS framework that learns both intra-domain and context-aware knowledge. To learn intra-domain knowledge, we apply a contrastive loss in both domains that pulls pixels of similar classes together and pushes the rest apart, encouraging intra-image pixel-wise correlations. To learn context-aware knowledge, we modify the mixing technique to leverage the contextual dependencies among classes. Furthermore, we adapt the Masked Image Modeling (MIM) technique so that the model exploits context clues for robust visual recognition from limited information about masked images.
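As a rough illustration of the two ideas above, the sketch below shows (i) a pixel-wise contrastive loss that pulls same-class pixel embeddings together and pushes different-class ones apart, and (ii) random patch masking in the spirit of Masked Image Modeling. This is a minimal, hypothetical sketch for intuition only; the function names, arguments, and defaults are assumptions and do not mirror the repository's actual implementation.

```python
# Illustrative sketch only (not the C²DA implementation): a pixel-wise
# supervised contrastive loss and an MIM-style random patch mask.
import torch
import torch.nn.functional as F


def pixel_contrastive_loss(features, labels, temperature=0.1,
                           max_pixels=1024, ignore_index=255):
    """features: (N, C, H, W) pixel embeddings; labels: (N, H, W) class ids."""
    n, c, h, w = features.shape
    feats = features.permute(0, 2, 3, 1).reshape(-1, c)        # (N*H*W, C)
    labs = labels.reshape(-1)                                  # (N*H*W,)
    valid = labs != ignore_index
    feats, labs = feats[valid], labs[valid]
    # Sub-sample pixels so the pairwise similarity matrix stays small.
    if feats.shape[0] > max_pixels:
        idx = torch.randperm(feats.shape[0], device=feats.device)[:max_pixels]
        feats, labs = feats[idx], labs[idx]
    feats = F.normalize(feats, dim=1)
    logits = feats @ feats.t() / temperature                   # (P, P) similarities
    # Positive pairs: same class, excluding self-pairs.
    pos_mask = (labs[:, None] == labs[None, :]).float()
    self_mask = torch.eye(len(labs), device=feats.device)
    pos_mask = pos_mask * (1 - self_mask)
    # Log-softmax over all other pixels, averaged over the positives per anchor.
    logits = logits - 1e9 * self_mask                          # exclude self from denominator
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_per_anchor = pos_mask.sum(1).clamp(min=1)
    loss = -(pos_mask * log_prob).sum(1) / pos_per_anchor
    return loss[pos_mask.sum(1) > 0].mean()


def random_patch_mask(images, patch=32, mask_ratio=0.5):
    """Zero out random square patches of the input, as in MIM-style training."""
    n, _, h, w = images.shape
    keep = (torch.rand(n, 1, h // patch, w // patch,
                       device=images.device) > mask_ratio).float()
    keep = F.interpolate(keep, size=(h, w), mode='nearest')
    return images * keep
```

In practice, such losses are usually computed on down-sampled feature maps with only a sampled subset of pixels per image, since the full pairwise similarity matrix would otherwise be prohibitively large.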
We recommend setting up a new virtual environment. In that environment, the requirements can be installed with:
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7 # requires the other packages to be installed first
Further, please download the MiT weights from SegFormer using the following script. If problems occur with the automatic download, please follow the instructions for a manual download within the script.
sh tools/download_checkpoints.sh
Cityscapes: Please, download leftImg8bit_trainvaltest.zip and gt_trainvaltest.zip from here and extract them to
data/cityscapes
GTA: Please, download all image and label packages from here and extract them to
data/gta
RUGD: Please, download all image and label packages from here and extract them to
data/rugd
MESH: Please, download all image and label packages from here and extract them to
data/MESH
The final folder structure should look like this:
DEDA
├── ...
├── data
│   ├── cityscapes
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── gta
│   │   ├── images
│   │   ├── labels
│   ├── rugd
│   │   ├── images
│   │   ├── labels
│   ├── MESH
│   │   ├── images
│   │   ├── labels
├── ...
Data Preprocessing: Finally, please run the following scripts to convert the label IDs to the train IDs and to generate the class index for RCS:
python tools/convert_datasets/gta.py data/gta --nproc 8
python tools/convert_datasets/cityscapes.py data/cityscapes --nproc 8
A training job for gta2cs can be launched using:
python run_experiments.py --config configs/C²DA/gtaHR2csHR_hrda.py
and a training job for rugd2mesh can be launched using:
python run_experiments.py --config configs/C²DA/rugd2mesh_hrda.py
The logs and checkpoints are stored in
work_dirs/
The provided C²DA checkpoint trained on GTA→Cityscapes can be tested on the Cityscapes validation set using:
sh test.sh work_dirs/gtaHR2csHR_hrda_246ef
And the provided C²DA checkpoint trained on RUGD→MESH can be tested on the MESH validation set using:
sh test.sh work_dirs/rugdHR2meshHR_hrda_246ef
The trained segmentation model can be used for visual navigation with:
sh in_ros.sh work_dirs/rugdHR2meshHR_hrda_246ef
This project is based on mmsegmentation version 0.16.0. For more information about the framework structure and the config system, please refer to the mmsegmentation documentation and the mmcv documentation.
The most relevant files for C²DA are:
configs/C²DA/gtaHR2csHR_hrda.py: Annotated config file for the final C²DA on GTA→Cityscapes.
configs/C²DA/rugd2mesh_hrda.py: Annotated config file for the final C²DA on RUGD→MESH.
mmseg/models/segmentors/hrda_encoder_decoder.py: Implementation of the HRDA multi-resolution encoding with context and detail crop.
mmseg/models/decode_heads/hrda_head.py: Implementation of the HRDA decoding with multi-resolution fusion and scale attention.
mmseg/models/uda/dacs.py: Implementation of the DAFormer self-training.
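As background on the config system mentioned above, mmsegmentation-style configs are Python files that compose `_base_` fragments and override individual fields. The snippet below is only a generic illustration of that pattern; the base paths and values are placeholders and do not reflect the actual contents of configs/C²DA/gtaHR2csHR_hrda.py.

```python
# Generic illustration of an mmsegmentation-style config file.
# The paths and values below are placeholders, not the actual C²DA settings.
_base_ = [
    '../_base_/default_runtime.py',                 # logging and checkpointing defaults
    '../_base_/models/hrda.py',                     # model architecture (placeholder path)
    '../_base_/datasets/uda_gta_to_cityscapes.py',  # source/target data pipeline (placeholder path)
    '../_base_/schedules/schedule_40k.py',          # optimizer and LR schedule (placeholder path)
]
# Fields set here are merged into and override the values inherited from _base_.
optimizer = dict(lr=6e-5)
runner = dict(type='IterBasedRunner', max_iters=40000)
```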
C²DA is based on the following open-source projects. We thank their authors for making the source code publicly available.