forked from Dao-AILab/flash-attention
Pull requests: ROCm/flash-attention
Pull requests list
[Do not merge] vllm layout varlen
WIP
work in progress
#106 opened Dec 3, 2024 by rocking5566
Added Benchmark for Rotary Decode Kernel + Performance Speed Up for Rotary Kernel
#102 opened Nov 22, 2024 by alexkranias-amd
GPUAI-1250 - Flash Attention v2.04 two modules layer_norm cannot be used fixed
#52 opened Apr 3, 2024 by xiaoxiangAMD
GPUAI-1250 - Flash Attention v2.04 module rotary cannot be used code fixed
#47 opened Mar 1, 2024 by xiaoxiangAMD