
Fix flash attention GQA bug to use the dynamic size of the key/value tensors - used for eval/inference #3109

Triggered via pull request: November 21, 2023 23:12
Status: Success
Total duration: 8m 41s
Artifacts: 2
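For context on the fix named in the title: in grouped-query attention (GQA), key/value heads must be repeated to line up with the query heads before flash attention is applied, and at eval/inference time a KV cache makes the key/value sequence length differ from the query's. The sketch below is a minimal, hypothetical illustration of reading that size dynamically from the key/value tensor itself rather than from a static config value; the helper name, tensor layout, and shapes are assumptions, not the repository's actual code.

```python
import torch

def repeat_kv_for_gqa(kv: torch.Tensor, n_rep: int) -> torch.Tensor:
    # Hypothetical helper; kv has shape (batch, kv_seq_len, n_kv_heads, head_dim).
    # Read the sequence length from the tensor itself (dynamic), not from a
    # static configured value: with a KV cache at eval/inference time,
    # kv_seq_len grows past the query's sequence length.
    batch, kv_seq_len, n_kv_heads, head_dim = kv.shape
    if n_rep == 1:
        return kv
    # Expand each KV head n_rep times so KV heads line up with query heads.
    kv = kv[:, :, :, None, :].expand(batch, kv_seq_len, n_kv_heads, n_rep, head_dim)
    return kv.reshape(batch, kv_seq_len, n_kv_heads * n_rep, head_dim)
```

Deriving the shapes from `kv.shape` at call time keeps the repeat correct for any cache length, whereas baking in the training-time sequence length silently breaks once cached keys/values are appended during generation.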

pr-cpu.yaml

on: pull_request
Matrix: pytest-cpu
Coverage Results / coverage (9s)

Artifacts

Produced during runtime
Name                                                          Size    Status
coverage-cd54471d459d7f93f632e0693f391463850f0822-cpu-1.13.1  268 KB  Expired
coverage-cd54471d459d7f93f632e0693f391463850f0822-cpu-2.1.0   268 KB  Expired