
BetterTransformer optimizations can't be applied to Falcon #1543

Closed

pcuenca opened this issue Nov 16, 2023 · 1 comment
Labels
bug Something isn't working

Comments

pcuenca (Member) commented Nov 16, 2023

System Info

Python 3.10, optimum @ main, transformers @ main

Who can help?

@fxmarty

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

Reproduction:

from transformers import AutoTokenizer, AutoModelForCausalLM
from optimum.bettertransformer import BetterTransformer
import torch

model_id = "tiiuae/falcon-rw-1b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model = BetterTransformer.transform(model)  # fails here with current transformers main

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)

Falcon attention was refactored in huggingface/transformers@05ea7b7#diff-81c616a9db6f569c579ccf03c30c2f69aa7b65fa40959ac7e882fb8d541891d7. The refactor removed the maybe_rotary attribute and adopted the Llama conventions for rotary embeddings.

We could modify the use of maybe_rotary here by using something like:

        submodules = ["query_key_value", "dense", "attention_dropout"]
        if not config.alibi:
            submodules.append("rotary_emb")

And then we'd need to adapt the code here, applying rotary embeddings when alibi is not in use.
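For illustration, a rough sketch of what that adaptation might look like (not the actual optimum code; it assumes the refactored FalconAttention exposes a Llama-style rotary_emb returning (cos, sin) and that apply_rotary_pos_emb is importable from transformers.models.falcon.modeling_falcon):

        # Hypothetical sketch inside the BetterTransformer Falcon attention forward.
        # Names and signatures follow the post-refactor, Llama-style convention and
        # may not match the final implementation.
        if not self.config.alibi:
            cos, sin = self.rotary_emb(value_layer, seq_len=kv_length)
            query_layer, key_layer = apply_rotary_pos_emb(
                query_layer, key_layer, cos, sin, position_ids
            )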

Expected behavior

Transformation would work.

pcuenca added the bug label Nov 16, 2023
fxmarty (Contributor) commented Dec 13, 2023

Hi, Falcon with SDPA is now supported by default in Transformers (huggingface/transformers#26572), and the usage of BetterTransformer is deprecated for this architecture.

See https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-and-memory-efficient-attention-through-pytorchs-scaleddotproductattention
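For reference, a minimal usage sketch (not part of the original comment), assuming a transformers release with native SDPA support for Falcon (4.36 or later), where attn_implementation="sdpa" can be passed directly:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tiiuae/falcon-rw-1b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# No BetterTransformer.transform needed: SDPA is used by default when available,
# and can be requested explicitly via attn_implementation.
model = AutoModelForCausalLM.from_pretrained(model_id, attn_implementation="sdpa")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))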

fxmarty closed this as completed Dec 13, 2023