Max prompt tokens/sequence length limit in vllm core scheduler #446
-
I noticed that the following block (https://github.com/vllm-project/vllm/blob/main/vllm/core/scheduler.py#L193) was added to the vLLM core scheduler as a fix for issue #113:

```python
if num_prompt_tokens >= self.scheduler_config.max_seq_len:
    logger.warning(
        f"Input prompt ({num_prompt_tokens} tokens) is too long"
        " and exceeds limit of "
        f"{self.scheduler_config.max_seq_len}")
    for seq in seq_group.get_seqs():
        seq.status = SequenceStatus.FINISHED_IGNORED
    ignored_seq_groups.append(seq_group)
    self.waiting.pop(0)
    break
```

I wonder why we're not using ...

Thank you for your instruction!
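For context, the block above does not reject an over-long prompt with an error: its sequences are marked `FINISHED_IGNORED`, the group is added to `ignored_seq_groups`, and the request is popped from the waiting queue after the warning. Below is a minimal client-side sketch of checking prompt length before submitting a request; the tokenizer name and the limit value are illustrative assumptions, not values taken from vLLM's config.

```python
# Sketch: pre-check prompt length before handing the request to vLLM,
# mirroring the scheduler's num_prompt_tokens >= max_seq_len check above.
# Assumptions: a Hugging Face tokenizer for the served model and a
# hard-coded limit; in practice the limit should match the scheduler config.
from transformers import AutoTokenizer

MAX_SEQ_LEN = 4096  # illustrative; use the value vLLM was started with

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")  # example model

def prompt_fits(prompt: str) -> bool:
    num_prompt_tokens = len(tokenizer.encode(prompt))
    if num_prompt_tokens >= MAX_SEQ_LEN:
        print(f"Prompt has {num_prompt_tokens} tokens, limit is {MAX_SEQ_LEN}; "
              "the scheduler would ignore this request.")
        return False
    return True
```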
-
Hi, we just fixed this in the latest main. Please retry.
-
```
WARNING 07-28 03:23:18 scheduler.py:196] Input prompt (2716 tokens) is too long and exceeds limit of 4096
```
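If this warning is still hit, one workaround on the caller side is to truncate the prompt so it stays under the reported limit, which keeps the scheduler from marking the sequences as ignored. A rough sketch under the same illustrative assumptions as above (Hugging Face tokenizer, hard-coded limit):

```python
# Sketch: truncate an over-long prompt so it stays under the scheduler's
# limit and is not marked FINISHED_IGNORED. Tokenizer and limit are
# illustrative assumptions; in practice also leave room for generated tokens.
from transformers import AutoTokenizer

MAX_SEQ_LEN = 4096  # illustrative; match the limit vLLM reports in the warning

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

def truncate_prompt(prompt: str) -> str:
    ids = tokenizer.encode(prompt)
    if len(ids) >= MAX_SEQ_LEN:
        ids = ids[-(MAX_SEQ_LEN - 1):]  # keep the most recent tokens
    return tokenizer.decode(ids, skip_special_tokens=True)
```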