TencentYoutuResearch/BaseArchitecture-EAT

Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model

Python 3.6.8 PyTorch 1.6.0

Official PyTorch implementation of the NeurIPS 2021 paper "Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model".

EAT

Model Zoo

We provide EAT variants pre-trained on ImageNet-1k.

| Model | Top-1 (%) | Params. | Throughput (GPU) | Throughput (CPU) | Inference Memory Occupancy | Image Size |
| --- | --- | --- | --- | --- | --- | --- |
| EAT-Ti | 72.7 | 5.7M | 2442 | 95.4 | 1448 | 224 x 224 |
| EAT-S | 80.4 | 22.1M | 1001 | 34.4 | 1708 | 224 x 224 |
| EAT-M | 82.1 | 49.0M | 519 | 18.4 | 2114 | 224 x 224 |
| EAT-B | 82.0 | 86.6M | 329 | 11.7 | 2508 | 224 x 224 |
| EAT-Ti (384) | 75.8 | 5.8M | 721 | 20.5 | 1930 | 384 x 384 |
| EAT-S (384) | 82.4 | 22.2M | 312 | 8.7 | 2466 | 384 x 384 |
| EAT-Ti (Dist) | 74.8 | 5.7M | 2442 | 95.4 | 1448 | 224 x 224 |
| EAT-S (Dist) | 81.2 | 22.1M | 1001 | 34.4 | 1708 | 224 x 224 |
| EAT-Ti (1000 epochs) | 75.0 | 5.7M | 2442 | 95.4 | 1448 | 224 x 224 |
| EAT-Ti (Dist + 1000 epochs) | 77.0 | 5.7M | 2442 | 95.4 | 1448 | 224 x 224 |

Using the Code

Requirements

  • This code has been developed with Python 3.6.8, PyTorch 1.6.0+cu101, and torchvision 0.7.0.

```shell
# Install Python 3 packages
pip3 install -r requirements.txt
```

  • Refer to DCNv2_latest if you want to use DCN as the local operator.

  • We include a copy of timm to guard against breaking changes in its future releases.

Data preparation

Download and extract ImageNet-1k dataset in the following directory structure:

├── imagenet
    ├── train
        ├── n01440764
            ├── n01440764_10026.JPEG
            ├── ...
        ├── ...
    ├── train.txt (optional)
    ├── val
        ├── n01440764
            ├── ILSVRC2012_val_00000293.JPEG
            ├── ...
        ├── ...
    └── val.txt (optional)
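Before launching training or evaluation, it can save time to confirm the extracted dataset matches this layout. The following stdlib-only sketch is our own convenience helper, not part of this codebase; the mock file names are copied from the tree above purely for illustration:

```python
# Sanity-check that an ImageNet root follows the train/val class-folder layout.
import tempfile
from pathlib import Path

def check_imagenet_layout(root):
    """Return a list of problems found in an ImageNet-style directory."""
    root = Path(root)
    problems = []
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        if not class_dirs:
            problems.append(f"no class folders (e.g. n01440764) under {split_dir}")
        for d in class_dirs:
            if not any(d.glob("*.JPEG")):
                problems.append(f"no .JPEG images in {d}")
    return problems

# Build a minimal mock layout and verify it passes.
with tempfile.TemporaryDirectory() as tmp:
    for split, name in (("train", "n01440764_10026.JPEG"),
                        ("val", "ILSVRC2012_val_00000293.JPEG")):
        d = Path(tmp, "imagenet", split, name.split("_")[0] if False else "n01440764")
        d.mkdir(parents=True, exist_ok=True)
        (d / name).touch()
    print(check_imagenet_layout(Path(tmp, "imagenet")))  # []
```

An empty list means the layout matches; anything else points at the offending directory.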

Evaluation

  • To evaluate the pre-trained EAT-Ti on ImageNet-1k with a single GPU:
python3 main.py --model eat_progress3_patch16_224 --resume weights/EAT-Ti.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

or with multiple GPUs:

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model eat_progress3_patch16_224 --resume weights/EAT-Ti.pth --data-path /path/to/imagenet --dist_url tcp://127.0.0.1:23456 --output_dir ./tmp --eval 

This should give Top-1 72.876

  • To evaluate the pre-trained EAT-Ti (with distillation) on ImageNet-1k with a single GPU:
python3 main.py --model eat_progress3_patch16_224_dist --resume weights/EAT-Ti-dist-1000e.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

or with multiple GPUs:

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model eat_progress3_patch16_224_dist --resume weights/EAT-Ti-dist-1000e.pth --data-path /path/to/imagenet --dist_url tcp://127.0.0.1:23456 --output_dir ./tmp --eval 

This should give Top-1 76.856

  • More evaluation commands:
EAT-Ti
python3 main.py --model eat_progress3_patch16_224 --resume weights/EAT-Ti.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 72.876

EAT-S
python3 main.py --model eat_progress6_patch16_224 --resume weights/EAT-S.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 80.422

EAT-M
python3 main.py --model eat_progress9_patch16_224 --resume weights/EAT-M.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 82.052

EAT-B
python3 main.py --model eat_progress12_patch16_224 --resume weights/EAT-B.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 82.026

EAT-Ti-384
python3 main.py --model eat_progress3_patch16_384 --input-size 384 --resume weights/EAT-Ti-384.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 75.806

EAT-S-384
python3 main.py --model eat_progress6_patch16_384 --input-size 384 --resume weights/EAT-S-384.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 82.398

EAT-Ti-dist
python3 main.py --model eat_progress3_patch16_224_dist --resume weights/EAT-Ti-dist.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 74.834

EAT-S-dist
python3 main.py --model eat_progress6_patch16_224_dist --resume weights/EAT-S-dist.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 81.206

EAT-Ti-1000e
python3 main.py --model eat_progress3_patch16_224 --resume weights/EAT-Ti-1000e.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 74.990

EAT-Ti-dist-1000e
python3 main.py --model eat_progress3_patch16_224_dist --resume weights/EAT-Ti-dist-1000e1.pth --data-path /path/to/imagenet --output_dir ./tmp --eval  

This should give Top-1 77.030
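To run several of these evaluations in one go, a small launcher can generate the single-GPU commands above. This is a convenience sketch of our own (model names and weight files are copied from the commands above; `--data-path` remains a placeholder):

```python
# Print the single-GPU evaluation command for each 224 x 224 checkpoint.
# Pipe the output to `sh` (or replace print with subprocess.run) to execute.
checkpoints = {
    "eat_progress3_patch16_224": "EAT-Ti.pth",
    "eat_progress6_patch16_224": "EAT-S.pth",
    "eat_progress9_patch16_224": "EAT-M.pth",
    "eat_progress12_patch16_224": "EAT-B.pth",
}
for model, weight in checkpoints.items():
    print(f"python3 main.py --model {model} --resume weights/{weight} "
          f"--data-path /path/to/imagenet --output_dir ./tmp --eval")
```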

Training

To train EAT-Ti and EAT-Ti-dist on ImageNet-1k on a single node with 8 GPUs:

EAT-Ti

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model eat_progress3_patch16_224 --batch-size 128 --input-size 224 --opt adamw --lr 5e-4 --loc_encoder sis --epochs 300 --cls_token False --cls_token_head True --depth_cls 2 --block_type base_local --local_type conv --local_ratio 0.5 --mlp_ratio 4 --data-path /path/to/imagenet --output_dir /path/to/save

EAT-Ti-dist

python3 -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model eat_progress3_patch16_224_dist --batch-size 128 --input-size 224 --opt adamw --lr 5e-4 --loc_encoder sis --epochs 300 --cls_token False --cls_token_head True --depth_cls 2 --block_type base_local --local_type conv --local_ratio 0.5 --mlp_ratio 4 --distillation_type hard --teacher_model regnety_160 --teacher_path /path/to/teacher/model/weights --data-path /path/to/imagenet --output_dir /path/to/save
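Both commands launch 8 GPUs with a per-GPU batch of 128. If this DeiT-derived codebase applies DeiT's linear learning-rate scaling rule (an assumption on our part; this README does not state it), the effective values work out as follows:

```python
# Effective batch size across GPUs, and the learning rate under DeiT's
# linear scaling convention: lr = base_lr * effective_batch / 512.
gpus, per_gpu_batch, base_lr = 8, 128, 5e-4
effective_batch = gpus * per_gpu_batch
scaled_lr = base_lr * effective_batch / 512
print(effective_batch, scaled_lr)  # 1024 0.001
```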

Citation

If our work is helpful for your research, please consider citing:

@misc{zhang2021analogous,
      title={Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model}, 
      author={Jiangning Zhang and Chao Xu and Jian Li and Wenzhou Chen and Yabiao Wang and Ying Tai and Shuo Chen and Chengjie Wang and Feiyue Huang and Yong Liu},
      year={2021},
      eprint={2105.15089},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgements

We thank the following works for their assistance with our research:
