defensive: verify permit channel configuration on start-up to avoid potential stuck #13475

Open
BugenZhao opened this issue Nov 16, 2023 · 1 comment
Labels
component/streaming Stream processing related issue. no-issue-activity type/enhancement Improvements to existing implementation.

Comments

@BugenZhao
Member

BugenZhao commented Nov 16, 2023

In #6170, we introduced permit-based back-pressure to resolve the throughput imbalance between local exchanges and remote exchanges. As an optimization, the messages that give permits back are batched before being sent to the upstream.

if let Some(add_back_permits) = match permits.unwrap().value {
    // For records, batch the permits we received to reduce the backward
    // `AddPermits` messages.
    Some(permits::Value::Record(p)) => {
        batched_permits_accumulated += p;
        if batched_permits_accumulated >= batched_permits_limit as u32 {
            let permits = std::mem::take(&mut batched_permits_accumulated);
            Some(permits::Value::Record(permits))
        } else {
            None
        }
    }
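
To make the batching behavior easier to reason about in isolation, here is a minimal sketch of the same accumulate-then-flush logic. `PermitBatcher` and its methods are hypothetical names introduced for illustration, not part of the codebase; only the accumulate, compare, and `std::mem::take` steps come from the excerpt above.

```rust
/// Hypothetical helper mirroring the record-permit batching in the excerpt.
#[derive(Debug)]
pub struct PermitBatcher {
    /// Permits received from the upstream but not yet given back.
    accumulated: u32,
    /// Threshold at which the accumulated permits are flushed.
    limit: u32,
}

impl PermitBatcher {
    pub fn new(limit: u32) -> Self {
        Self { accumulated: 0, limit }
    }

    /// Adds `p` permits to the batch. Returns the whole batch once the
    /// threshold is reached (resetting the counter), otherwise `None`.
    pub fn add(&mut self, p: u32) -> Option<u32> {
        self.accumulated += p;
        if self.accumulated >= self.limit {
            Some(std::mem::take(&mut self.accumulated))
        } else {
            None
        }
    }
}
```

Under this model, the downstream can hold up to `limit - 1` permits without returning them, which is the quantity the upstream has to account for.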

On the other side, if a chunk is too large, the permits it requires are clamped to total - batched. The batched amount is subtracted to prevent a deadlock caused by permit batching: the downstream may still be holding permits that it has not yet given back.

let max_chunk_permits: usize = initial_permits - batched_permits;
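
As a rough sketch of the clamping step, assuming a chunk's required permits are simply capped at `max_chunk_permits`: the function name `clamp_chunk_permits` and its `Option` return are illustrative assumptions, and the original line uses a plain subtraction rather than `checked_sub`.

```rust
/// Hypothetical helper: the number of permits a chunk has to acquire,
/// capped at `initial_permits - batched_permits`.
///
/// Returns `None` when `batched_permits > initial_permits`, a case where
/// the plain subtraction in the original line would underflow.
pub fn clamp_chunk_permits(
    required: usize,
    initial_permits: usize,
    batched_permits: usize,
) -> Option<usize> {
    let max_chunk_permits = initial_permits.checked_sub(batched_permits)?;
    Some(required.min(max_chunk_permits))
}
```

The cap only guarantees progress if both sides use the same `batched_permits`: it assumes the downstream never withholds more permits than the amount subtracted here.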

Here comes the problem: if the upstream and the downstream do not agree on these configuration values, the behavior may be unexpected and the stream could get stuck. While this is highly unlikely, it is still possible if different compute nodes use different configurations.
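
One possible shape for such a start-up check is sketched below. Everything here is hypothetical: the `PermitConfig` struct, the error type, and `verify_permit_config` are names invented for illustration, and how the two nodes would exchange their configurations is left out entirely.

```rust
/// Hypothetical view of the two permit-related settings discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PermitConfig {
    pub initial_permits: usize,
    pub batched_permits: usize,
}

/// Hypothetical error type for the start-up check.
#[derive(Debug, PartialEq, Eq)]
pub enum PermitConfigError {
    /// Batching would leave no room for any chunk.
    BatchTooLarge { initial_permits: usize, batched_permits: usize },
    /// The two ends of the channel disagree.
    Mismatch { upstream: PermitConfig, downstream: PermitConfig },
}

/// Checks that each side's settings are sane and that both sides agree.
pub fn verify_permit_config(
    upstream: PermitConfig,
    downstream: PermitConfig,
) -> Result<(), PermitConfigError> {
    for c in &[upstream, downstream] {
        if c.batched_permits >= c.initial_permits {
            return Err(PermitConfigError::BatchTooLarge {
                initial_permits: c.initial_permits,
                batched_permits: c.batched_permits,
            });
        }
    }
    if upstream != downstream {
        return Err(PermitConfigError::Mismatch { upstream, downstream });
    }
    Ok(())
}
```

As an illustration with made-up numbers: if the upstream assumes 2048 initial permits and a batch size of 256, it may ask for up to 1792 permits for one chunk. If the downstream instead batches up to 1024, it can hold 1023 permits without returning them, leaving only 1025 available, so that request could wait forever. A check like the one above would reject this pair at start-up.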

@BugenZhao BugenZhao added type/bug Something isn't working component/streaming Stream processing related issue. labels Nov 16, 2023
@github-actions github-actions bot added this to the release-1.5 milestone Nov 16, 2023
@BugenZhao BugenZhao added type/enhancement Improvements to existing implementation. and removed type/bug Something isn't working labels Nov 16, 2023
@BugenZhao BugenZhao removed this from the release-1.5 milestone Dec 6, 2023
Contributor

github-actions bot commented Feb 6, 2024

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.
