defensive: verify permit channel configuration on start-up to avoid potential stuck #13475
Labels
component/streaming
Stream processing related issue.
no-issue-activity
type/enhancement
Improvements to existing implementation.
In #6170, we introduced permit-based back-pressure to resolve the throughput imbalance between local exchanges and remote exchanges. As an optimization, the messages that give permits back are batched before being sent to the upstream.
risingwave/src/stream/src/executor/exchange/input.rs
Lines 171 to 182 in e25ebd3
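For readers without the referenced lines at hand, the batching idea can be sketched roughly as below. This is only an illustration and not the code in `input.rs`: `PermitBatcher`, its fields, and `add` are hypothetical names. The downstream accumulates the permits freed by consumed chunks and only reports them to the upstream once the accumulated amount reaches the batch threshold.

```rust
/// Hypothetical accumulator for permits that should be given back to the
/// upstream. The name and shape are assumptions made for illustration only.
struct PermitBatcher {
    /// Threshold at which the pending permits are flushed (the `batched` value).
    batched_permits: u32,
    /// Permits freed by consumed chunks but not yet reported upstream.
    pending: u32,
}

impl PermitBatcher {
    fn new(batched_permits: u32) -> Self {
        Self { batched_permits, pending: 0 }
    }

    /// Records the permits freed by one consumed chunk.
    /// Returns `Some(n)` when `n` permits should now be sent back upstream,
    /// or `None` while the batch is still filling up.
    fn add(&mut self, permits: u32) -> Option<u32> {
        self.pending += permits;
        if self.pending >= self.batched_permits {
            // Flush everything accumulated so far and reset the counter.
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }
}
```

Note that up to `batched_permits - 1` permits can sit in `pending` without the upstream knowing about them, which is what makes the clamping on the sender side necessary.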
On the other side, if a chunk is too large, it will clamp its required permits to `total - batched`. The `batched` is subtracted to prevent deadlock caused by permit batching.
risingwave/src/stream/src/executor/exchange/permit.rs
Line 47 in e25ebd3
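A minimal sketch of the clamping, assuming the permits a chunk needs are proportional to its cardinality. The function name and parameters are hypothetical and do not mirror the actual code in `permit.rs`.

```rust
/// Hypothetical helper: the number of permits a chunk has to acquire before
/// it can be sent. `total_permits` and `batched_permits` stand for the
/// `total` and `batched` values discussed above.
fn clamp_required_permits(
    chunk_cardinality: u32,
    total_permits: u32,
    batched_permits: u32,
) -> u32 {
    // The downstream may hold back some permits while its batch fills up,
    // so a chunk must never ask for more than `total - batched`.
    // `saturating_sub` avoids underflow if `batched` is misconfigured to be
    // larger than `total`.
    let max_permits = total_permits.saturating_sub(batched_permits);
    chunk_cardinality.min(max_permits)
}
```

For example, with `total = 2048` and `batched = 1024`, a chunk of 5000 rows would be clamped to 1024 permits, while a chunk of 100 rows would still require 100.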
Here comes the problem: if the upstream and the downstream do not agree on these configurations, the behavior might be unexpected and the stream might get stuck. While this is highly unlikely, it is still possible if different compute nodes use different configurations.
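One possible shape for such a start-up check is sketched below. Everything here is an assumption for discussion: `PermitConfig`, its field names, and `verify_permit_config` are hypothetical, and how the two sides would exchange their settings (for example during the exchange handshake or via the meta node) is left open. The sketch takes the conservative approach of requiring both sides to be identical, and it also rejects a configuration where `batched` is not smaller than `total`.

```rust
/// Hypothetical view of the exchange permit settings on one compute node.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PermitConfig {
    /// Total permits initially granted to the upstream (`total`).
    initial_permits: u32,
    /// Threshold for batching returned permits (`batched`).
    batched_permits: u32,
}

/// Verifies that the upstream and the downstream agree on the permit
/// settings. Returns a description of the first problem found.
fn verify_permit_config(
    upstream: PermitConfig,
    downstream: PermitConfig,
) -> Result<(), String> {
    for (side, cfg) in [("upstream", upstream), ("downstream", downstream)] {
        // With `batched >= total`, `total - batched` leaves no usable permits.
        if cfg.batched_permits >= cfg.initial_permits {
            return Err(format!(
                "{side}: batched permits ({}) must be smaller than initial permits ({})",
                cfg.batched_permits, cfg.initial_permits
            ));
        }
    }
    // Conservative rule: both sides must use exactly the same values.
    if upstream != downstream {
        return Err(format!(
            "permit config mismatch: upstream {upstream:?}, downstream {downstream:?}"
        ));
    }
    Ok(())
}
```

Failing fast with a descriptive error at start-up seems preferable to a silent stall at runtime, since a stuck exchange is much harder to diagnose after the fact.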