
Commit
remove max_num_batched_tokens
lynnleelhl committed Nov 9, 2023
1 parent 79809b6 commit 8d62843
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion addons/llm/templates/scripts.yaml
@@ -54,7 +54,7 @@ data:
         sleep 1
         continue
       fi
-      python -m vllm.entrypoints.api_server --host 0.0.0.0 --port 8000 --model ${MODEL_NAME} --gpu-memory-utilization 0.95 --max-num-seqs 512 --max-num-batched-tokens 8192 --tensor-parallel-size ${KB_VLLM_N} ${EXTRA_ARGS} 2>&1 > log
+      python -m vllm.entrypoints.api_server --host 0.0.0.0 --port 8000 --model ${MODEL_NAME} --gpu-memory-utilization 0.95 --max-num-seqs 512 --tensor-parallel-size ${KB_VLLM_N} ${EXTRA_ARGS} 2>&1 > log
       code=$?
       if [ $code -eq 0 ]; then
         break
