
LlamaForCausalLM does not support Flash Attention 2.0 yet #28466

Closed
Patrick-Ni opened this issue Jan 12, 2024 · 1 comment

Comments

@Patrick-Ni

The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use attn_implementation="flash_attention_2" instead.
Traceback (most recent call last):
  File "/root/paddlejob/workspace/env_run/benchmark/generation/main.py", line 116, in
    main()
  File "/root/paddlejob/workspace/env_run/benchmark/generation/main.py", line 91, in main
    pipeline = load_model_and_tokenizer(model_home, args.model, args.use_pipeline)
  File "/root/paddlejob/workspace/env_run/benchmark/generation/load_models_and_datasets.py", line 26, in load_model_and_tokenizer
    model = AutoModelForCausalLM.from_pretrained(
  File "/root/paddlejob/workspace/env_run/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
    return model_class.from_pretrained(
  File "/root/paddlejob/workspace/env_run/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3456, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "/root/paddlejob/workspace/env_run/lib/python3.9/site-packages/transformers/modeling_utils.py", line 1302, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/root/paddlejob/workspace/env_run/lib/python3.9/site-packages/transformers/modeling_utils.py", line 1382, in _check_and_enable_flash_attn_2
    raise ValueError(
ValueError: LlamaForCausalLM does not support Flash Attention 2.0 yet. Please open an issue on GitHub to request support for this architecture: https://github.com/huggingface/transformers/issues/new
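For reference, a minimal sketch of the loading call using the non-deprecated argument that the warning above recommends. The checkpoint name is a placeholder, not taken from this traceback:

```python
# Minimal sketch, assuming a recent transformers release with Flash Attention 2
# support for Llama. The checkpoint name is a hypothetical example; substitute
# your own model path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                # Flash Attention 2 requires fp16 or bf16 weights
    attn_implementation="flash_attention_2",  # replaces the deprecated use_flash_attention_2=True
)
```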

@ArthurZucker
Collaborator

Make sure you are using the latest version of transformers and not custom modeling code from the Hub! 🤗 Closing, as this was added in #25598.
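For anyone hitting this, a quick sketch for confirming which transformers version your environment actually imports; the upgrade command is the standard pip invocation, nothing specific to this issue:

```python
# Print the transformers version that is actually being imported.
# If it predates the release containing #25598, upgrade with:
#   pip install -U transformers
import transformers
print(transformers.__version__)
```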
