Releases: feifeibear/long-context-attention
Version 0.4.2 was released on 19th Nov 2024
What's Changed
- ulysses in benchmark by @feifeibear in #104
- flash_attn3 not directly import by @feifeibear in #105
- version to 0.4.2 by @feifeibear in #106
Full Changelog: 0.4.1...0.4.2
0.4.1 was released on Nov 15th 2024
What's Changed
- feat: add use_sync switch to ulysses by @Eigensystem in #103
Full Changelog: 0.4.0...0.4.1
Version 0.4.0 was released on Nov 15th 2024
Version 0.4.0 adds support for FlashAttention V3 on Hopper GPUs. It has also been tested on low-memory (24GB) GPUs.
What's Changed
- update all_to_all by @Lay2000 in #87
- upgrade version to 0.3.6 by @feifeibear in #89
- prevent dispatch issues in SequenceParallel to improve performance by @uclalch in #92
- sync after all2all for device memory saving by @feifeibear in #93
- version 0.3.7 by @feifeibear in #94
- flash attention 3 by @feifeibear in #95
- FA3: update to the latest FA3 API. by @feifeibear in #96
- add dit benchmark script by @feifeibear in #97
- 1114v2 by @feifeibear in #98
- update readme for FA3 by @feifeibear in #99
- version to 0.4.0 by @feifeibear in #100
- dump version to 0.4.1 by @feifeibear in #101
- dump to 0.4.0 by @feifeibear in #102
Full Changelog: 0.3.5...0.4.0
0.3.5
What's Changed
- revert version 0.3.2 by @feifeibear in #83
- version 0.3.5 by @feifeibear in #84
Full Changelog: 0.3.3...0.3.5
0.3.2 released
What's Changed
- remove amd installation to an individual doc by @feifeibear in #76
- auto publish python package when release on github by @feifeibear in #77
- version 0.3.2 by @feifeibear in #78
- remove useless workflow by @feifeibear in #79
- version to 0.3.2 by @feifeibear in #80
- polish publish workflow by @feifeibear in #81
Full Changelog: v0.3.1...0.3.2
v0.3.1 released at 2024.09.14
stripe_extract_local, basic_extract_local, and zigzag_extract_local now work for tensors with dimension >= 2.
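As a rough illustration of what the three sharding patterns mean, the sketch below computes which sequence positions a given rank would keep under a contiguous ("basic"), strided ("stripe"), and mirrored-chunk ("zigzag") split. The function names `basic_indices`, `stripe_indices`, and `zigzag_indices` are hypothetical and are not part of this package; the layouts are the commonly used ones for ring attention and are assumed here, not taken from the library source. The same indices could be applied along the sequence dimension of a tensor of any rank >= 2.

```python
# Hypothetical helpers: they only compute index sets, they do not call the library.

def basic_indices(seq_len, rank, world_size):
    """Contiguous split: rank r keeps the r-th block of the sequence."""
    chunk = seq_len // world_size
    return list(range(rank * chunk, (rank + 1) * chunk))


def stripe_indices(seq_len, rank, world_size):
    """Strided split: rank r keeps positions r, r + world_size, ..."""
    return list(range(rank, seq_len, world_size))


def zigzag_indices(seq_len, rank, world_size):
    """Zigzag split: cut the sequence into 2 * world_size chunks and give
    rank r chunk r plus its mirror chunk (2 * world_size - 1 - r), which
    balances causal-attention work across ranks."""
    chunk = seq_len // (2 * world_size)
    mirror = 2 * world_size - 1 - rank
    front = range(rank * chunk, (rank + 1) * chunk)
    back = range(mirror * chunk, (mirror + 1) * chunk)
    return list(front) + list(back)


if __name__ == "__main__":
    for r in range(2):
        print(r, basic_indices(8, r, 2), stripe_indices(8, r, 2), zigzag_indices(8, r, 2))
```

The sketch assumes the sequence length is divisible by `world_size` (or by `2 * world_size` for the zigzag case).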
v0.3 released on 27th August 2024!
Upgrade flash_attn to >= 2.6.0.
v0.2 released on 24th June 2024!
- Ulysses supports T4 and V100.
- Updates some directory structures.
v0.1
Sequence-parallel attention using a hybrid of Ulysses and ring attention.
Supports GQA.
Supports packed QKV.
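To make the hybrid idea concrete, here is a minimal sketch of how a set of ranks might be partitioned into Ulysses groups (which exchange attention heads via all-to-all) and ring groups (which pass key/value blocks around a ring). The function `hybrid_groups` and its parameters are hypothetical, and the layout (contiguous ranks for Ulysses, strided ranks for ring) is an assumption chosen for illustration, not necessarily how this package arranges its process groups.

```python
# Hypothetical sketch: pure rank arithmetic, no torch.distributed calls.

def hybrid_groups(world_size, ulysses_degree, ring_degree):
    """Split ranks into Ulysses and ring groups.

    Assumes world_size == ulysses_degree * ring_degree. Ulysses groups
    take contiguous ranks; ring groups take ranks strided by
    ulysses_degree.
    """
    if ulysses_degree * ring_degree != world_size:
        raise ValueError("ulysses_degree * ring_degree must equal world_size")

    ulysses = [
        list(range(start, start + ulysses_degree))
        for start in range(0, world_size, ulysses_degree)
    ]
    ring = [
        list(range(offset, world_size, ulysses_degree))
        for offset in range(ulysses_degree)
    ]
    return ulysses, ring


if __name__ == "__main__":
    u, r = hybrid_groups(world_size=8, ulysses_degree=2, ring_degree=4)
    print("ulysses groups:", u)
    print("ring groups:", r)
```

Under this layout every rank belongs to exactly one Ulysses group and one ring group, so the two communication patterns can be composed over the same sequence shard.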