Fix flash attention GQA bug to use the dynamic size of the key/value tensors - used for eval/inference #3109

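The PR title describes the fix: in grouped-query attention (GQA), the key/value tensors can have a different sequence length (and head count) than the queries during eval/inference, for example when attending against a KV cache, so those sizes must be read from the key/value tensors' dynamic shapes rather than assumed to match the query or a static config value. Below is a minimal sketch of that pattern, assuming a PyTorch-style (batch, seq, heads, head_dim) layout; the function name and layout are illustrative, not taken from the PR.

```python
import torch
import torch.nn.functional as F


def gqa_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Grouped-query attention that reads all sizes from the tensors themselves.

    q:    (batch, q_len,  num_heads,    head_dim)
    k, v: (batch, kv_len, num_kv_heads, head_dim)
    """
    # The point of the fix described in the PR title: during eval/inference
    # (e.g. decoding against a KV cache) kv_len and num_kv_heads must come
    # from the key/value tensors' dynamic shape, not from the query tensor
    # or a statically configured size.
    num_heads = q.shape[2]
    num_kv_heads = k.shape[2]
    assert num_heads % num_kv_heads == 0
    groups = num_heads // num_kv_heads

    # Expand KV heads so each group of query heads shares one KV head.
    k = k.repeat_interleave(groups, dim=2)
    v = v.repeat_interleave(groups, dim=2)

    # Move heads before sequence for scaled_dot_product_attention.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v)
    return out.transpose(1, 2)  # (batch, q_len, num_heads, head_dim)


# Usage: a single-token decode step against 128 cached positions works
# because kv_len and num_kv_heads are taken from k's actual shape.
q = torch.randn(2, 1, 8, 64)
k = torch.randn(2, 128, 2, 64)
v = torch.randn(2, 128, 2, 64)
print(gqa_attention(q, k, v).shape)  # torch.Size([2, 1, 8, 64])
```

The key detail is that `num_kv_heads` and the cached sequence length come from `k.shape` at call time, so a decode step with `q_len = 1` against a long cache needs no static-shape assumption.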