Support transformers 4.43 #1971
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
OwlViT and OwlV2 need an upgrade to their ONNX config (opset 11 -> 12 for the einsum operator); I will push it once the onnxruntime tests finish (to see if there's anything left).
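A minimal sketch of what such a bump looks like in an optimum ONNX config (abbreviated: real configs also declare their inputs/outputs, and the base-class import path is an assumption about optimum's exporter layout, not this PR's exact diff):

```python
from optimum.exporters.onnx.config import TextAndVisionOnnxConfig

# torch.einsum only became exportable to ONNX at opset 12, so a default of 11
# makes the OwlViT/OwlV2 exports fail on their einsum nodes.
class OwlViTOnnxConfig(TextAndVisionOnnxConfig):
    DEFAULT_ONNX_OPSET = 12  # bumped from 11 for the einsum operator
```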
Thanks a lot for taking care of this @IlyasMoutawwakil
should be all fixed now
Looks great, thanks a lot @IlyasMoutawwakil
if use_torch is True:
    cache_position = cache_position.to(self.device)

return use_cache_branch_tensor, past_key_values, cache_position
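For context, a self-contained sketch of what this diff does (everything except `cache_position` and the device move is an illustrative assumption):

```python
import torch

# cache_position indexes the slots of the KV cache that the new tokens will
# occupy; unlike position_ids it has no batch dimension.
past_length, num_new_tokens = 8, 1
cache_position = torch.arange(past_length, past_length + num_new_tokens)

# When the caller passes torch tensors (use_torch is True), the tensor has to
# live on the model's device before it is handed to the ONNX Runtime session,
# otherwise a CPU tensor meets a CUDA-bound model at inference time.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cache_position = cache_position.to(device)
```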
this is a breaking change, so we should be careful; not sure this method is used by anyone though
the method is only used by the forward pass; I don't think any sub-packages use it
…gface/optimum into support-transformers-4.43
@IlyasMoutawwakil Is this PR ready to support transformers version 4.43.3?
@sreenivasulureddysura yep, it's ready
* fix bt bark test
* setup
* patch clip models for sd
* infer ort model dtype property from inputs dtypes
* patch all clip variants
* device setter
* bigger model for now
* fix device attribution
* onnx opset for owlvit and owlv2
* model dtype
* revert
* use model part dtype instead
* no need for dtype with diffusion pipelines
* revert
* fix clip text model with projection not outputting hidden states
* whisper generation
* fix whisper, support cache_position, and using transformers whisper generation loop
* style
* create cache position for merged decoder and fix test for non whisper speech to text
* typo
* conditioned cache position argument
* update whisper min transformers version
* compare whisper ort generation with transformers
* fix generation length for speech to text model type
* cache position in whisper only with dynamic axis decoder_sequence_length
* use minimal prepare_inputs_for_generation in ORTModelForSpeechSeq2Seq
* remove version restrictions on whisper
* comment
* fix
* simpler

---------

Co-authored-by: Ella Charlaix <[email protected]>
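One of the commits above infers the ORT model's dtype from its input dtypes, so that `pipeline()` can call `model.dtype` on an `ORTModel`. A minimal sketch of the idea (the mapping table and function name are illustrative assumptions, not optimum's actual code):

```python
import onnxruntime as ort
import torch

# ONNX Runtime reports graph input element types as strings such as
# "tensor(float)"; map the floating-point ones to torch dtypes.
_ORT_TO_TORCH_DTYPE = {
    "tensor(float)": torch.float32,
    "tensor(float16)": torch.float16,
    "tensor(bfloat16)": torch.bfloat16,
}

def infer_model_dtype(session: ort.InferenceSession) -> torch.dtype:
    """Return the dtype of the first floating-point graph input."""
    for graph_input in session.get_inputs():
        if graph_input.type in _ORT_TO_TORCH_DTYPE:
            return _ORT_TO_TORCH_DTYPE[graph_input.type]
    # Models whose inputs are all integers (e.g. input_ids only) give no
    # signal, so fall back to float32.
    return torch.float32
```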
Support the `cache_position` input that was added to Hugging Face Whisper models as part of a revision of how they handle KV-caching. This is like `position_ids`, but there is no batch dimension. See huggingface/optimum#1971 and huggingface/transformers#31166.
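To make the shape difference concrete, a small illustration (sizes are arbitrary):

```python
import torch

batch_size, past_length, num_new_tokens = 2, 5, 1

# position_ids carries a batch dimension: one row of positions per sequence.
position_ids = torch.arange(past_length, past_length + num_new_tokens)
position_ids = position_ids.unsqueeze(0).expand(batch_size, -1)
print(position_ids.shape)  # torch.Size([2, 1])

# cache_position is shared across the batch: a 1-D index into the KV cache.
cache_position = torch.arange(past_length, past_length + num_new_tokens)
print(cache_position.shape)  # torch.Size([1])
```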
What does this PR do?
This PR adds support for transformers 4.43, with which the following issues emerge:
- `clip` models using `sdpa` attention (see the sketch after this list).
- `pipeline` calling `model.dtype` to convert inputs to the correct floating point precision.
- `bark` models can't be saved due to shared tensors (tracked in "`BarkModel` can't be saved anymore" transformers#32224).
- whisper models requiring the new `cache_position` input.
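On the first item: per the PR description, CLIP models started using `sdpa` attention with transformers 4.43, which is what the export path had to be patched around. A hedged sketch of the usual workaround pattern (whether this matches the PR's internal patch is an assumption; the checkpoint is just an example):

```python
from transformers import CLIPTextModelWithProjection

# Request the eager attention implementation instead of the sdpa default,
# which keeps the traced graph friendly to the ONNX exporter.
model = CLIPTextModelWithProjection.from_pretrained(
    "openai/clip-vit-base-patch32",
    attn_implementation="eager",
)
```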
Before submitting
Who can review?