[Continuous batching] In the event of OOM, return tokens generated so far for the request #661

Merged
merged 1 commit into from
Jul 24, 2024

mzegla
Collaborator

@mzegla mzegla commented Jul 22, 2024

No description provided.

@mzegla mzegla requested review from Wovchena, iefode and popovaan July 23, 2024 10:32
@mzegla mzegla added this pull request to the merge queue Jul 24, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 24, 2024
@Wovchena Wovchena added this pull request to the merge queue Jul 24, 2024
Merged via the queue into openvinotoolkit:master with commit 42dd049 Jul 24, 2024
27 checks passed
mzegla added a commit to mzegla/openvino.genai that referenced this pull request Jul 24, 2024
@ilya-lavrenov ilya-lavrenov self-assigned this Jul 31, 2024
for (auto& sequence: m_sequences) {
    GenerationOutput output;
    output.generated_token_ids = sequence->get_generated_ids();
    output.score = sequence->get_beam_search_score(m_sampling_params);
Contributor

The beam search score is only meaningful for beam search, but this method, push_outputs, is also used for greedy and multinomial sampling. For the non-beam-search modes we need to use the cumulative logprob instead.

Collaborator Author

You're right, I'm already fixing that in my next PR
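
For readers following this thread, here is a minimal sketch of the distinction being discussed: choose the reported score by decoding mode. It is not the code from this PR or its follow-up. `DecodingMode`, `SamplingParams`, `Sequence`, and `make_output` are hypothetical stand-ins, and the length-normalised beam score is an assumed definition used only for illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; not the real openvino.genai types.
enum class DecodingMode { Greedy, Multinomial, BeamSearch };

struct SamplingParams {
    DecodingMode mode = DecodingMode::Greedy;
    float length_penalty = 1.0f;  // assumed to matter only for beam search
};

struct Sequence {
    std::vector<int64_t> generated_ids;
    float cumulative_log_prob = 0.0f;

    // Assumed definition: cumulative log probability normalised by
    // (generated length ^ length_penalty).
    float beam_search_score(const SamplingParams& params) const {
        if (generated_ids.empty()) {
            return cumulative_log_prob;
        }
        const float length = static_cast<float>(generated_ids.size());
        return cumulative_log_prob / std::pow(length, params.length_penalty);
    }
};

struct GenerationOutput {
    std::vector<int64_t> generated_token_ids;
    float score = 0.0f;
};

// Report the beam search score only in beam search mode; greedy and
// multinomial sampling fall back to the cumulative log probability.
GenerationOutput make_output(const Sequence& seq, const SamplingParams& params) {
    GenerationOutput output;
    output.generated_token_ids = seq.generated_ids;
    output.score = (params.mode == DecodingMode::BeamSearch)
        ? seq.beam_search_score(params)
        : seq.cumulative_log_prob;
    return output;
}
```

With this shape, a single output path can serve all three decoding modes without reporting a beam score for sequences that were never ranked as beams.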

for (auto& sequence: finished_sequences) {
    GenerationOutput output;
    output.generated_token_ids = sequence->get_generated_ids();
    output.score = sequence->get_cumulative_log_probs();
Contributor

Currently logprobs are not used, but they should be.

@mzegla mzegla deleted the oom_returns branch January 21, 2025 09:26
4 participants