Disable triton FA #323

hliuca · 2024-12-12T19:25:46Z

The CK (Composable Kernel) version of FlashAttention has better performance and more features.

gshtras · 2024-12-12T19:31:52Z

Could you provide some numbers to back this claim? Namely, for models with and without MoE, and for vision models, on Instinct MI300, MI308, MI325, and Navi 3 and 4.
In our experience, Triton is generally better and more robust for the general case.

hliuca · 2024-12-12T19:37:45Z

It seems "export VLLM_USE_TRITON_FLASH_ATTN=0" is used everywhere, including in the docs here: https://github.com/powderluv/vllm-docs

If it is better to keep Triton as the default, we can close this PR. Thank you.

disable triton FA by default

1f947b5
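
For context, the setting discussed in this thread is the `VLLM_USE_TRITON_FLASH_ATTN` environment variable. The following is a minimal sketch of how such a boolean flag with a configurable default could be read. It is not the diff from this PR: the helper name `use_triton_flash_attn` and its parsing rules are assumptions made for illustration only.

```python
import os


def use_triton_flash_attn(default: bool = True) -> bool:
    """Hypothetical helper (not vLLM's actual code).

    Returns whether the Triton FlashAttention path is requested, based on
    the VLLM_USE_TRITON_FLASH_ATTN environment variable. When the variable
    is unset, the supplied default decides.
    """
    raw = os.environ.get("VLLM_USE_TRITON_FLASH_ATTN")
    if raw is None:
        # Variable not set: fall back to the project-wide default.
        return default
    # Assumed convention: "1" or "true" (any case) enables the Triton path.
    return raw.strip().lower() in ("1", "true")
```

Under these assumptions, running `export VLLM_USE_TRITON_FLASH_ATTN=0` before launch makes the helper return `False` regardless of the default, while changing the default only affects users who never set the variable.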

hliuca requested a review from gshtras December 12, 2024 19:27

hliuca closed this Dec 12, 2024

hliuca deleted the disable_triton_fa branch December 12, 2024 20:24
