
Add timm_wrapper support to AutoFeatureExtractor #35764

Open · wants to merge 3 commits into main
Conversation

@Factral commented on Jan 18, 2025

What does this PR do?

A few days ago, the PR that adds timm_wrapper (#34564, see the accompanying blog post) was merged, enabling the use of timm models directly with Hugging Face interfaces, especially the Auto* ones. However, the AutoFeatureExtractor interface currently doesn't work with these models. This PR addresses that gap.

This PR adds timm_wrapper compatibility to AutoFeatureExtractor.from_pretrained(), enabling it to work with fine-tuned/trained timm model checkpoints.

Currently, when using a checkpoint from a trained/fine-tuned timm model (e.g., using examples/pytorch/image-classification/run_image_classification.py), AutoFeatureExtractor.from_pretrained() fails because timm_wrapper is not included in the interface.

Such checkpoints currently emit a warning about a missing preprocessor_config.json, but users can add the file manually, following examples like https://huggingface.co/Factral/vit_large-model/blob/main/preprocessor_config.json. This PR ensures AutoFeatureExtractor works properly when that file is present.
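For reference, here is a minimal sketch of the usage this PR is meant to enable, assuming a fine-tuned timm checkpoint that ships a preprocessor_config.json (the checkpoint name below is a placeholder, not a real repo):

from PIL import Image
from transformers import AutoFeatureExtractor

# placeholder: a fine-tuned timm checkpoint that includes preprocessor_config.json
checkpoint = "your-username/your-finetuned-timm-model"

feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)

# load your image here
image = Image.new("RGB", (224, 224), (255, 0, 0))

inputs = feature_extractor(image, return_tensors="pt")

for k, v in inputs.items():
    print(k, v.shape)

# typically: pixel_values torch.Size([1, 3, 224, 224])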

Changes

  • Added timm_wrapper to the AutoFeatureExtractor interface
  • Enables compatibility with timm model checkpoints when preprocessor_config.json is present
  • Added an is_timm kwarg to the from_dict function

Before submitting

  • Read contributor guidelines
  • Updated documentation to reflect changes
  • Added necessary tests for timm_wrapper functionality

Who can review?

@amyeroberts @qubvel, as this relates to vision models and the timm integration

@qubvel (Member) left a comment

Hi @Factral, thanks for submitting the PR!

I would recommend using AutoModel + AutoProcessor to get features from any timm model. It works without adding a preprocessor_config.json. Otherwise, we need to come up with a scheme to load the FeatureExtractor the same way, without adding preprocessor_config.json to the repo on the Hub, because the preprocessing config is stored in config.json for timm models (please see how this was enabled for the other Auto* processors in the original PR #34564).

@qubvel commented on Jan 20, 2025

import torch
from PIL import Image
from transformers import AutoProcessor, AutoModel

checkpoint = "timm/resnet18.a1_in1k"

model = AutoModel.from_pretrained(checkpoint)
processor = AutoProcessor.from_pretrained(checkpoint)

# load your image here
image = Image.new("RGB", (224, 224), (255, 0, 0))

inputs = processor(image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

for k, v in outputs.items():
    print(k, v.shape)

# last_hidden_state torch.Size([1, 512, 7, 7])
# pooler_output torch.Size([1, 512])

@qubvel added the Vision label on Jan 21, 2025