Domain Generalization for Image Classification


To reproduce the benchmark results, we suggest using pytorch==1.7.1 and torchvision==0.8.2.

The example scripts support all models in PyTorch-Image-Models (timm). To use these models, you also need to install timm:

pip install timm


The following datasets can be downloaded automatically:

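Supported Methods

  • IBN (Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net)
  • MixStyle (Domain Generalization with MixStyle)
  • MLDG (Learning to Generalize: Meta-Learning for Domain Generalization)
  • IRM (Invariant Risk Minimization)
  • VREx (Out-of-Distribution Generalization via Risk Extrapolation (REx))
  • GroupDRO (Distributionally Robust Neural Networks for Group Shifts)
  • CORAL (Deep CORAL: Correlation Alignment for Deep Domain Adaptation)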


The shell files provide the scripts to reproduce the benchmark with the specified hyper-parameters. For example, to train IRM on Office-Home, use the following script:

# Train with IRM on the Office-Home Ar Cl Rw -> Pr task using ResNet-50.
# This assumes the datasets are already under the path `data/office-home`,
# or that you are happy for them to be downloaded automatically to this path.
# irm.py is the assumed name of the IRM training script in this directory.
CUDA_VISIBLE_DEVICES=0 python irm.py data/office-home -d OfficeHome -s Ar Cl Rw -t Pr -a resnet50 --seed 0 --log logs/irm/OfficeHome_Pr

Here, -s specifies the source domains (one or more), -t specifies the target domain, and --log specifies the directory where results are stored.
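To make the role of each flag concrete, the sketch below parses the same argument string with argparse. It is only an illustration, not the parser used by the example scripts: the long option names (--sources, --targets, --data, --arch), the defaults, and the use of nargs="+" are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors only the flags shown in the command above.
    parser = argparse.ArgumentParser(description="Domain generalization (sketch)")
    parser.add_argument("root", help="dataset root directory")
    parser.add_argument("-d", "--data", default="OfficeHome", help="dataset name")
    # nargs="+" lets a single flag collect several domain names.
    parser.add_argument("-s", "--sources", nargs="+", help="source domain(s)")
    parser.add_argument("-t", "--targets", nargs="+", help="target domain(s)")
    parser.add_argument("-a", "--arch", default="resnet50", help="backbone name")
    parser.add_argument("--seed", type=int, default=None, help="random seed")
    parser.add_argument("--log", default="logs", help="directory for results")
    return parser


command = ("data/office-home -d OfficeHome -s Ar Cl Rw -t Pr "
           "-a resnet50 --seed 0 --log logs/irm/OfficeHome_Pr")
args = build_parser().parse_args(command.split())

print(args.sources)  # ['Ar', 'Cl', 'Rw']
print(args.targets)  # ['Pr']
print(args.log)      # logs/irm/OfficeHome_Pr
```

With this layout, the three source domains end up in one list, so the same script can serve any leave-one-domain-out split by changing only -s and -t.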

Experiment and Results

Following DomainBed, we select hyper-parameters based on the model's performance on the training-domain validation set (the first selection rule in DomainBed). Concretely, we save the model with the highest accuracy on the training-domain validation set and then load this checkpoint to test on the target domain.
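The selection rule can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in the example scripts: the EpochResult record and select_checkpoint helper are hypothetical names, and the accuracy values are invented placeholders rather than benchmark results.

```python
from dataclasses import dataclass


@dataclass
class EpochResult:
    # Hypothetical per-epoch record, used only for this illustration.
    epoch: int
    val_acc: float   # accuracy on the held-out split of the source (training) domains
    test_acc: float  # accuracy on the target domain; never used for selection


def select_checkpoint(history):
    """Return the epoch with the highest training-domain validation accuracy."""
    return max(history, key=lambda result: result.val_acc)


# Placeholder numbers, not measured results.
history = [
    EpochResult(epoch=0, val_acc=80.1, test_acc=70.2),
    EpochResult(epoch=1, val_acc=84.7, test_acc=73.5),
    EpochResult(epoch=2, val_acc=83.9, test_acc=74.8),
]

best = select_checkpoint(history)
print(best.epoch, best.test_acc)  # 1 73.5
```

Note that epoch 2 has the higher target accuracy in this toy history, but it is not chosen: the target domain is treated as unseen, so only the validation accuracy on the training domains drives the choice.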

Our implementation differs from DomainBed in a few ways. For the model, we do not freeze the BatchNorm2d layers and do not insert an additional Dropout layer, except for the PACS dataset. For the optimizer, we use SGD with momentum by default, as we find that it usually achieves better performance than Adam.


  • ERM refers to the model trained only on data from the source domains.
  • Avg is the accuracy reported by TLlib.

PACS accuracy on ResNet-50

Methods     avg     Art     Cartoon   Photo   Sketch
ERM         86.4    88.5    78.4      97.2    81.4
IBN         87.8    88.2    84.5      97.1    81.4
MixStyle    87.4    87.8    82.3      95.0    84.5
MLDG        87.2    88.2    81.4      96.6    82.5
IRM         86.9    88.0    82.5      98.0    79.0
VREx        87.0    87.2    82.3      97.4    81.0
GroupDRO    87.3    88.9    81.7      97.8    80.8
CORAL       86.4    89.1    80.0      97.4    79.1

Office-Home accuracy on ResNet-50

Methods     avg     Art     Clipart   Product   Real-World
ERM         70.8    68.3    55.9      78.9      80.0
IBN         69.9    67.4    55.2      77.3      79.6
MixStyle    71.7    66.8    58.1      78.0      79.9
MLDG        70.3    65.9    57.6      78.2      79.6
IRM         70.3    66.7    54.8      78.6      80.9
VREx        70.2    66.9    54.9      78.2      80.9
GroupDRO    70.0    66.7    55.2      78.8      79.9
CORAL       70.9    68.3    55.4      78.8      81.0
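
As a reading aid for the avg column, the sketch below recomputes it for the two ERM rows, assuming that avg is the unweighted mean of the four per-domain accuracies rounded to one decimal place. That reading matches the ERM rows, but it is an assumption about how the column was produced, and the average_accuracy helper is a hypothetical name.

```python
from statistics import mean

# Per-domain accuracies copied from the ERM rows of the two tables above.
erm_pacs = {"A": 88.5, "C": 78.4, "P": 97.2, "S": 81.4}
erm_office_home = {"A": 68.3, "C": 55.9, "P": 78.9, "R": 80.0}


def average_accuracy(per_domain):
    """Unweighted mean over target domains, rounded to one decimal place."""
    return round(mean(per_domain.values()), 1)


print(average_accuracy(erm_pacs))         # 86.4
print(average_accuracy(erm_office_home))  # 70.8
```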


If you use these methods in your research, please consider citing the corresponding papers:

    title = {Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net},
    author = {Xingang Pan and Ping Luo and Jianping Shi and Xiaoou Tang},
    booktitle = {ECCV},
    year = {2018}

    title = {Domain Generalization with MixStyle},
    author = {Kaiyang Zhou and Yongxin Yang and Yu Qiao and Tao Xiang},

    title = {Learning to Generalize: Meta-Learning for Domain Generalization},
    author = {Da Li and Yongxin Yang and Yi-Zhe Song and Timothy Hospedales},

    title = {Invariant Risk Minimization},
    author = {Martin Arjovsky and Léon Bottou and Ishaan Gulrajani and David Lopez-Paz},

    title = {Out-of-Distribution Generalization via Risk Extrapolation (REx)},
    author = {David Krueger and Ethan Caballero and Joern-Henrik Jacobsen and Amy Zhang and Jonathan Binas and Dinghuai Zhang and Remi Le Priol and Aaron Courville},

    title = {Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization},
    author = {Shiori Sagawa and Pang Wei Koh and Tatsunori B. Hashimoto and Percy Liang},

    title = {Deep CORAL: Correlation Alignment for Deep Domain Adaptation},
    author = {Baochen Sun and Kate Saenko},