Maybe the throughput between TGI and vLLM should be updated #478
Closed
zhaoyang-star
started this conversation in
General
Replies: 1 comment
-
Hi! We are testing the performance of TGI and will update the performance results afterward. You can track the progress at #381
-
I just noticed that TGI now supports Paged Attention by integrating vLLM into TGI. PR #516 has been merged and is available in TGI v0.9.2.