Maybe the throughput between TGI and vLLM should be updated #478
Closed
zhaoyang-star
started this conversation in
General
Replies: 1 comment
-
Hi! We are testing the performance of TGI and will update the performance results afterward. You can track the progress at #381
-
I just noticed that TGI now supports Paged Attention by integrating vLLM into TGI. PR #516 has been merged and is available in TGI v0.9.2.