Max prompt tokens/sequence length limit in vllm core scheduler #446

yuanheng-zhao · 2023-07-12T14:32:42Z

yuanheng-zhao
Jul 12, 2023

I noticed that the following block (https://github.com/vllm-project/vllm/blob/main/vllm/core/scheduler.py#L193) was added to vllm core scheduler

if num_prompt_tokens >= self.scheduler_config.max_seq_len:
    logger.warning(
        f"Input prompt ({num_prompt_tokens} tokens) is too long"
        " and exceeds limit of "
        f"{self.scheduler_config.max_seq_len}")
    for seq in seq_group.get_seqs():
        seq.status = SequenceStatus.FINISHED_IGNORED
    ignored_seq_groups.append(seq_group)
    self.waiting.pop(0)
    break

as a fix for the issue #113

I wonder why we're not using num_prompt_tokens > self.scheduler_config.max_seq_len as a condition? It seems to filter out input sequences with exact length (e.g. --input-len 1024) for benchmarks.

Thank you for your instruction!

Answered by zhuohan123

Jul 18, 2023

Hi we just fixed this in the latest main. Please retry.

View full answer

zhuohan123 · 2023-07-18T06:21:22Z

zhuohan123
Jul 18, 2023
Maintainer

Hi we just fixed this in the latest main. Please retry.

0 replies

HarrisonBT · 2023-07-28T03:28:28Z

HarrisonBT
Jul 28, 2023

WARNING 07-28 03:23:18 scheduler.py:196] Input prompt (2716 tokens) is too long and exceeds limit of 4096

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Max prompt tokens/sequence length limit in vllm core scheduler #446

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments

{{title}}

{{title}}

Select a reply

Max prompt tokens/sequence length limit in vllm core scheduler #446

yuanheng-zhao Jul 12, 2023

Replies: 2 comments

zhuohan123 Jul 18, 2023 Maintainer

HarrisonBT Jul 28, 2023

yuanheng-zhao
Jul 12, 2023

zhuohan123
Jul 18, 2023
Maintainer

HarrisonBT
Jul 28, 2023