Fix flash attention GQA bug to use the dynamic size of the key/value tensors - used for eval/inference #3109

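The PR title describes the fix: in grouped-query attention (GQA), the key/value tensors can have a different sequence length (and head count) than the queries during eval/inference, for example when attending against a KV cache, so those sizes must be read from the key/value tensors' dynamic shapes rather than assumed to match the query or a static config value. Below is a minimal sketch of that pattern, assuming a PyTorch-style (batch, seq, heads, head_dim) layout; the function name and layout are illustrative, not taken from the PR.

```python
import torch
import torch.nn.functional as F


def gqa_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Grouped-query attention that reads all sizes from the tensors themselves.

    q:    (batch, q_len,  num_heads,    head_dim)
    k, v: (batch, kv_len, num_kv_heads, head_dim)
    """
    # The point of the fix described in the PR title: during eval/inference
    # (e.g. decoding against a KV cache) kv_len and num_kv_heads must come
    # from the key/value tensors' dynamic shape, not from the query tensor
    # or a statically configured size.
    num_heads = q.shape[2]
    num_kv_heads = k.shape[2]
    assert num_heads % num_kv_heads == 0
    groups = num_heads // num_kv_heads

    # Expand KV heads so each group of query heads shares one KV head.
    k = k.repeat_interleave(groups, dim=2)
    v = v.repeat_interleave(groups, dim=2)

    # Move heads before sequence for scaled_dot_product_attention.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v)
    return out.transpose(1, 2)  # (batch, q_len, num_heads, head_dim)


# Usage: a single-token decode step against 128 cached positions works
# because kv_len and num_kv_heads are taken from k's actual shape.
q = torch.randn(2, 1, 8, 64)
k = torch.randn(2, 128, 2, 64)
v = torch.randn(2, 128, 2, 64)
print(gqa_attention(q, k, v).shape)  # torch.Size([2, 1, 8, 64])
```

The key detail is that `num_kv_heads` and the cached sequence length come from `k.shape` at call time, so a decode step with `q_len = 1` against a long cache needs no static-shape assumption.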