
FA2 and rotary positional embeddings #16

Open
benjamin-kroeger opened this issue Dec 7, 2024 · 1 comment

@benjamin-kroeger commented Dec 7, 2024

Hey, thanks for building this framework; it is exactly what I need for my project. I was wondering whether there is a particular reason why Flash Attention 2 and rotary positional embeddings were dropped from the standard Llama implementation?

@SeanLee97 (Contributor) commented

Hi @benjamin-kroeger, it might be because this code is based on an earlier transformers version. I will upgrade it to be compatible with the latest transformers when I am free. PRs are also welcome!
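
For anyone who needs this in the meantime, recent transformers releases let you opt into FlashAttention-2 when loading a stock Llama checkpoint, and rotary positional embeddings are already built into the upstream LlamaModel layers. A minimal sketch of that upstream path (not this repo's code; the model id is a placeholder, and FA2 requires the flash-attn package plus a supported GPU):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute whatever model you are using.
model_id = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# attn_implementation="flash_attention_2" is available in recent
# transformers versions (>= 4.36) and needs the flash-attn package
# installed plus an Ampere-or-newer GPU. Rotary embeddings are part
# of the standard Llama layers, so RoPE needs no extra configuration.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)
```

If flash-attn is not installed or the hardware is unsupported, dropping the `attn_implementation` argument falls back to the default attention, with RoPE still applied.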
