Hi, I have experimented to see how top-1 validation accuracy changes with different preprocessing steps.

Question

My question is whether there is a plan to add support for these alternative preprocessing options (resize and normalization). I also found that ...

Experiment setup

I just added the following code to the original validate.py (full code). This code is made for the experiment only.
from torchvision import transforms
from torchvision.transforms.functional import InterpolationMode
import numpy as np
import torch


class MyToTensor:
    """Convert a PIL image to a CHW tensor and normalize it with one of three variants."""

    def __init__(self, option):
        self.mean = torch.tensor(127.5).half()
        self.std = torch.tensor(127.5).half()
        self.option = option

    def __call__(self, pic):
        img = torch.from_numpy(np.array(pic, np.uint8, copy=True))
        img = img.permute((2, 0, 1)).contiguous()  # HWC -> CHW
        if self.option == 'timm':
            # (x - 127.5) / 127.5, computed in fp16
            return img.half().sub_(self.mean).div_(self.std)
        elif self.option == 'original':
            # (x - 128) / 128, computed in fp32
            return (img.to(torch.float) - 128.0) / 128.0
        elif self.option == 'torchvision':
            # x / 255 followed by (x - 0.5) / 0.5, i.e. (x - 127.5) / 127.5 in fp32
            return (img.to(torch.float).div(255) - 0.5) / 0.5


interpolation_dict = {
    'bilinear': InterpolationMode.BILINEAR,
    'bicubic': InterpolationMode.BICUBIC
}
# data_config, args, and loader come from the surrounding validate.py scope
interpolation = interpolation_dict[data_config['interpolation']]

if args.resize == 'resize_shorter':
    # resize the shorter side to the target size, then center crop (keeps aspect ratio)
    loader.dataset.transform = transforms.Compose([
        transforms.Resize(data_config['input_size'][-2], interpolation=interpolation),
        transforms.CenterCrop(data_config['input_size'][-2:]),
        MyToTensor(args.divide)
    ])
elif args.resize == 'resize':
    # resize directly to the target (H, W), ignoring aspect ratio
    loader.dataset.transform = transforms.Compose([
        transforms.Resize(data_config['input_size'][-2:], interpolation=interpolation),
        MyToTensor(args.divide)
    ])
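The snippet reads two options, args.resize and args.divide, that the stock validate.py does not define. A minimal sketch of the argparse flags this experiment assumes (the flag names are hypothetical, chosen only to match the attributes used above; `parser` is the argparse.ArgumentParser already defined in validate.py):

```python
# Hypothetical flags for the experiment; names match the args.resize / args.divide
# attributes read by the snippet above.
parser.add_argument('--resize', default='resize_shorter',
                    choices=['resize_shorter', 'resize'],
                    help='resize_shorter: resize shorter side then center crop (keeps aspect '
                         'ratio); resize: resize directly to the target HxW (ignores aspect ratio)')
parser.add_argument('--divide', default='timm',
                    choices=['timm', 'original', 'torchvision'],
                    help='which uint8 -> float normalization variant MyToTensor applies')
```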
Experiment Results

Thank you.
@hankyul2 I'm aware of this but chose not to add support since it is not trivial to support properly (via pretrained default_cfg entries, arg pass-through to the transform factory).
I also don't agree with deviating from what has been a fairly consistent standard re: ImageNet eval, especially when it doesn't preserve aspect ratios... the EfficientNetV2 models are not the only ones, the DeepMind NFNet weights and ConvNeXt weights (384x384 only) also rely on different preprocessing.
Something I may do in the future, but it hasn't been a priority.
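For context on the default_cfg / transform-factory point: timm normally builds its eval transforms from the pretrained model's data config rather than hand-assembling them, so a non-default resize or normalization would have to be recorded per weight and threaded through that path. A minimal sketch of the existing flow, with the model name chosen only as an example:

```python
import timm
from timm.data import resolve_data_config, create_transform

# Any timm model name works the same way; this one is just an example.
model = timm.create_model('tf_efficientnetv2_s', pretrained=True)

# Pull input_size / interpolation / mean / std / crop_pct from the model's
# pretrained default_cfg (explicit args passed in the first dict would override them).
config = resolve_data_config({}, model=model)

# The transform factory turns that config into the standard eval pipeline:
# resize shorter side to input_size / crop_pct, center crop, ToTensor, Normalize.
eval_transform = create_transform(**config, is_training=False)
print(config)
```

Supporting a non-aspect-preserving resize (or a different uint8 normalization) per weight would mean adding an entry for it to each affected default_cfg and passing it through resolve_data_config and create_transform, which is the non-trivial part described above.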