Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move rate limit from flow control to within source executors (snapshot read, source, dml) #15790

Closed
Tracked by #13802
kwannoel opened this issue Mar 19, 2024 · 6 comments · Fixed by #15948
Closed
Tracked by #13802
Assignees
Milestone

Comments

@kwannoel
Copy link
Contributor

No description provided.

@github-actions github-actions bot added this to the release-1.8 milestone Mar 19, 2024
@kwannoel
Copy link
Contributor Author

We can do it step by step, per executor.
Also, to make sure alter rate limit is still supported, since this rate limit occurs upstream of barrier, we need a another channel to capture alter rate limit changes.

@stdrc
Copy link
Member

stdrc commented Mar 20, 2024

Also, to make sure alter rate limit is still supported, since this rate limit occurs upstream of barrier, we need a another channel to capture alter rate limit changes.

If risingwavelabs/rfcs#81 this RFC is going to be implemented, seems this can be not a problem. Though we don't have a clear timeline for this now.

@tabVersion
Copy link
Contributor

We can do it step by step, per executor. Also, to make sure alter rate limit is still supported, since this rate limit occurs upstream of barrier, we need a another channel to capture alter rate limit changes.

I vote for the overall goal to move rate limit from flow control to within source executors, spreading data into more epochs. But can you elaborate more on "we need a another channel to capture alter rate limit changes".
I think the source exec can receive all mutations and apply them to the inner part. That's how split change and split migration work.

@StrikeW
Copy link
Contributor

StrikeW commented Apr 3, 2024

May I know the motivation of the issue? What's the limitation of wrapping source executors with the FlowControlExecutor?

@kwannoel
Copy link
Contributor Author

kwannoel commented Apr 4, 2024

May I know the motivation of the issue? What's the limitation of wrapping source executors with the FlowControlExecutor?

It stems from a specific edge case: rate limiting smaller than chunk size with flow control executor will just cause barrier latency to spike. This is because rate limiting a chunk would mean the barrier before it will be blocked, until the entire chunk is processed.

As such, we add some chunk splitting logic upstream, inside executors themselves, before the data stream gets merged with the barrier stream from meta.

Now we have the logic for rate limiting split across flow control, and each executor. Additionally, the way it works is unintuitive (and perhaps unreliable sometimes), it relies on upstream to bias to barrier side to let it bypass.

It is simpler and more intuitive to relocate the rate limit logic into executors. Now all the logic is in one place, and rate limit happens at the data source, so it is clear that it won't block barriers.

@tabVersion
Copy link
Contributor

under developing in #15948

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants