Reproduce the given benchmark results #500

Answered by zhuohan123

yeliang2258 asked this question in Q&A

yeliang2258
Jul 17, 2023

Hello, which model and test tool were used to produce the performance data shown on the homepage?

Is the model available on https://huggingface.co/?


Replies: 5 comments

yeliang2258
Jul 17, 2023
Author

Is it lmsys/vicuna-13b-v1.3, young-geng/koala, openlm-research/open_llama_13b, or another model?


yeliang2258
Jul 18, 2023
Author

@zhuohan123 Could you please help me with this question? Thank you.


zhuohan123
Jul 18, 2023
Maintainer

Hi, we use the official LLaMA model (e.g., huggyllama/llama-13b). However, all of the LLaMA-based models you listed should perform exactly the same.


zhuohan123
Jul 18, 2023
Maintainer

We use this script to get the benchmark results: https://github.com/vllm-project/vllm/blob/main/benchmarks/benchmark_serving.py
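For readers unfamiliar with serving benchmarks, here is a minimal sketch of the general idea: requests are sent at random times that average out to a target request rate, and throughput and per-token latency are computed from the measured results. This is not the code from `benchmark_serving.py`. The function names, the Poisson arrival model, and the sample numbers are all assumptions made for illustration.

```python
import random


def poisson_arrival_times(num_requests, request_rate, seed=0):
    """Return send times (seconds) for requests arriving as a Poisson process.

    Hypothetical helper: request_rate is the average number of requests
    per second that the load generator should produce.
    """
    rng = random.Random(seed)
    times = []
    t = 0.0
    for _ in range(num_requests):
        times.append(t)
        # Gaps between arrivals in a Poisson process are exponentially distributed.
        t += rng.expovariate(request_rate)
    return times


def summarize(latencies, output_lens, total_time):
    """Return (requests per second, mean latency per generated token)."""
    throughput = len(latencies) / total_time
    per_token = sum(lat / n for lat, n in zip(latencies, output_lens)) / len(latencies)
    return throughput, per_token


if __name__ == "__main__":
    send_times = poisson_arrival_times(num_requests=5, request_rate=2.0)
    print("send times:", send_times)
    # Made-up measurements, for illustration only (not real benchmark data).
    latencies = [1.0, 2.0, 4.0]   # seconds per request
    output_lens = [10, 20, 40]    # generated tokens per request
    print(summarize(latencies, output_lens, total_time=6.0))
```

To reproduce the published numbers, run the linked script itself against a running server; the sketch only shows what kind of quantities such a benchmark reports.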


Answer selected by yeliang2258

yeliang2258
Jul 19, 2023
Author

https://github.com/vllm-project/vllm/blob/main/benchmarks/launch_tgi_server.sh
In addition, what value is TOKENS set to in this script?

