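This repository provides the official implementation of our recent papers:
Outlier detection by ensembling uncertainty with negative objectness
Anja Delić, Matej Grcić, Siniša Šegvić
Published in BMVC 2024
[arXiv]
On Advantages of Mask-level Recognition for Outlier-aware Segmentation
Matej Grcić, Josip Šarić, Siniša Šegvić
Published in CVPR workshop (VAND) 2023
[arXiv]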
Installation is the same as for the Mask2Former repository; see its installation instructions.
Finetuning a model with the additional negative objectness class with ADE20K negatives:
python finetune_UNO.py --config-file configs/cityscapes/semantic-segmentation/swin/maskformer2_swin_large_IN21k_384_bs18_2k_city+vistas_uno.yaml --num-gpus 3
Finetuning a model with the additional negative objectness class with synthetic negatives:
python finetune_UNO_synthetic.py --config-file configs/cityscapes/semantic-segmentation/swin/maskformer2_swin_large_IN21k_384_bs18_2k_city+vistas_uno_synthetic.yaml --num-gpus 3
Rejecting predictions in negative instances:
python train_net.py --config-file configs/cityscapes/semantic-segmentation/swin/maskformer2_swin_large_IN21k_384_bs12_2k_city+vistas_oe.yaml
Evaluating a trained model:
python train_net.py --config-file configs/cityscapes/semantic-segmentation/swin/maskformer2_swin_large_IN21k_384_bs18_115k_city+_vistas_uno.yaml --eval-only MODEL.WEIGHTS path_to_model DATASETS.TEST eval_dataset_name
python train_net.py --config-file configs/cityscapes/semantic-segmentation/swin/maskformer2_swin_large_IN21k_384_bs18_115k_city+vistas.yaml --eval-only MODEL.WEIGHTS path_to_model DATASETS.TEST eval_dataset_name
eval_dataset_name can be one of the following: "fs_static_val", "fs_laf_val", "road_anomaly".
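To evaluate one checkpoint on all three datasets, the evaluation command can be generated once per dataset name. The following is a minimal sketch and not part of the repository: `build_eval_command` is a hypothetical helper, `path_to_model` is the placeholder used in the commands above, and the dataset name is passed to `DATASETS.TEST` exactly as written in this README. Whether the underlying config system accepts a bare name or needs a quoted tuple is an assumption to check before running.

```python
import shlex

# Dataset names listed in this README.
EVAL_DATASETS = ("fs_static_val", "fs_laf_val", "road_anomaly")

# Config file copied from the evaluation command in this README.
UNO_CONFIG = (
    "configs/cityscapes/semantic-segmentation/swin/"
    "maskformer2_swin_large_IN21k_384_bs18_115k_city+_vistas_uno.yaml"
)


def build_eval_command(config_file, weights, dataset):
    """Return the argument list for one evaluation run (hypothetical helper)."""
    if dataset not in EVAL_DATASETS:
        raise ValueError(f"unknown evaluation dataset: {dataset!r}")
    return [
        "python", "train_net.py",
        "--config-file", config_file,
        "--eval-only",
        "MODEL.WEIGHTS", weights,
        "DATASETS.TEST", dataset,
    ]


if __name__ == "__main__":
    # Print one shell-ready command per dataset; nothing is executed here.
    for name in EVAL_DATASETS:
        print(shlex.join(build_eval_command(UNO_CONFIG, "path_to_model", name)))
```

The script only prints the commands, so they can be inspected and adjusted (for example, the weights path) before being run by hand.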
Mask2Former with SWIN-L backbone trained on Cityscapes (CS): weights
Mask2Former with SWIN-L backbone trained on Cityscapes and Vistas (CS&MV): weights
Mask2Former with SWIN-L backbone (CS&MV) fine-tuned with ADE20K negatives: weights
Mask2Former with SWIN-L backbone (CS&MV) with K+2 classes fine-tuned with ADE20K negatives: weights
Mask2Former with SWIN-L backbone (CS&MV) with K+2 classes fine-tuned with synthetic negatives: weights
DenseFlow pretrained on CS&MV: weights
The majority of Mask2Former is licensed under the MIT License.
However, portions of the project are available under separate license terms: Swin-Transformer-Semantic-Segmentation is licensed under the MIT License, and Deformable-DETR is licensed under the Apache-2.0 License.
@inproceedings{delic24bmvc,
title={Outlier detection by ensembling uncertainty with negative objectness},
author={Anja Delić and Matej Grcić and Siniša Šegvić},
booktitle={BMVC 2024 British Machine Vision Conference},
year={2024}
}
@inproceedings{grcic23cvprw,
title={On Advantages of Mask-level Recognition for Outlier-aware Segmentation},
author={Matej Grcić and Josip Šarić and Siniša Šegvić},
booktitle={CVPR 2023 Workshop Visual Anomaly and Novelty Detection (VAND)},
year={2023}
}
@inproceedings{cheng2021mask2former,
title={Masked-attention Mask Transformer for Universal Image Segmentation},
author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
booktitle={CVPR},
year={2022}
}
This code is an extension of Mask2Former (https://github.com/facebookresearch/Mask2Former).