
Support throttling an already created source/table #12997

Closed
hzxa21 opened this issue Oct 23, 2023 · 18 comments
@hzxa21
Collaborator

hzxa21 commented Oct 23, 2023

Currently, users can use SET RW_STREAMING_RATE_LIMIT = <rate_limit_per_actor> to rate-limit a source/table created in the session, but we don't have a way to throttle a specific source/table if it was not created with SET RW_STREAMING_RATE_LIMIT.

This is useful when data has accumulated in the source and processing it all in one barrier without throttling would cause an endless OOM loop.

@github-actions github-actions bot added this to the release-1.4 milestone Oct 23, 2023
@hzxa21
Collaborator Author

hzxa21 commented Oct 23, 2023

@tabVersion and I were thinking about whether we can easily support throttling an already created source by introducing a config to limit the "number of messages allowed" per barrier in the source. It seems that this needs a bigger refactoring, so I think we can adopt the existing FlowControlExecutor approach instead.
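The "number of messages allowed per barrier" idea can be illustrated with a minimal sketch; all names here are hypothetical and not the actual RisingWave implementation:

```rust
/// Hypothetical sketch: split a backlog of `pending` source messages into
/// per-barrier batches of at most `max_per_barrier` messages, so that a
/// single barrier never has to flush the whole backlog at once.
fn batches_per_barrier(pending: usize, max_per_barrier: usize) -> Vec<usize> {
    assert!(max_per_barrier > 0);
    let mut batches = Vec::new();
    let mut remaining = pending;
    while remaining > 0 {
        let take = remaining.min(max_per_barrier);
        batches.push(take);
        remaining -= take;
    }
    batches
}
```

A FlowControlExecutor reaches a similar outcome differently: instead of hard-capping each barrier's batch, it paces the rows flowing between barriers.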

@fuyufjh
Member

fuyufjh commented Oct 23, 2023

Agree.

In my mind, the user interface is altering a property of the materialized view. Previously discussed in #11929.

@chenzl25
Contributor

Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

@tabVersion
Contributor

tabVersion commented Oct 23, 2023

> Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

I think making it an individual executor will give us more freedom to place it anywhere in the graph if we want to make a quick fix. I am drafting an RFC explaining what the expected behavior is.

@fuyufjh
Member

fuyufjh commented Oct 23, 2023

> Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

> I think making it an individual executor will give us more freedom to place it anywhere in the graph if we want to make a quick fix. I am drafting an RFC explaining what the expected behavior is.

FYI, this part is perhaps already done in #11919 and #12295.

@tabVersion
Contributor

> Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

> I think making it an individual executor will give us more freedom to place it anywhere in the graph if we want to make a quick fix. I am drafting an RFC explaining what the expected behavior is.

> FYI, this part is perhaps already done in #11919 and #12295.

That's true. The remaining work is on the ChainNode and the frontend, providing more ways to modify the rate limit.

@BugenZhao
Member

BugenZhao commented Oct 24, 2023

I'm unsure if it's really necessary to make it so "general", that is, utilizing the Alter MV approach to achieve the rate modification. From my perspective, we may simply wrap each executor with a RateLimitExecutor without awareness in the persisted stream graph. A (developer) system parameter would encode rate limit settings for all actors (if present), which is subscribed to by those RateLimitExecutors.
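A minimal sketch of this wrapping idea, where a shared atomic cell stands in for the subscribed system parameter (all names are hypothetical, not the actual RisingWave code):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical wrapper around an executor's output. Every wrapped
/// executor shares one rate-limit cell that a (developer) system
/// parameter could update at runtime; 0 is treated as "unlimited".
struct RateLimitExecutor {
    shared_limit: Arc<AtomicU64>,
}

impl RateLimitExecutor {
    /// How many rows of a `chunk_rows`-row chunk may pass through
    /// right now under the current limit.
    fn admit(&self, chunk_rows: u64) -> u64 {
        match self.shared_limit.load(Ordering::Relaxed) {
            0 => chunk_rows, // no limit configured
            limit => chunk_rows.min(limit),
        }
    }
}
```

Because the limit lives outside the persisted stream graph, updating it needs no recovery or plan change, which is the appeal of this approach.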

@tabVersion
Contributor

> I'm unsure if it's really necessary to make it so "general", that is, utilizing the Alter MV approach to achieve the rate modification. From my perspective, we may simply wrap each executor with a RateLimitExecutor without awareness in the persisted stream graph. A (developer) system parameter would encode rate limit settings for all actors (if present), which is subscribed to by those RateLimitExecutors.

I don't think it is necessary in most cases, but it is essential for reducing the data flowing into the cluster to prevent an OOM loop.
Let's make it a dev tool, a risectl function, rather than a common property, in the first version.

@kwannoel
Contributor

Hi, it seems @tabVersion is working on this issue. Could you please share more about the overall direction you will take?

@BugenZhao
Member

BugenZhao commented Oct 26, 2023

> Could you please share more of the overall direction you will take?

+1. So what's the eventual plan for updating the flow control rate online?

@tabVersion
Contributor

> Could you please share more of the overall direction you will take?

> +1. So what's the eventual plan for updating the flow control rate online?

I am proposing a config-change solution: risectl will send a Throttle command to apply throttle args to the actors related to a given table_id or source_id. PR #13166
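The command's payload could look roughly like the following sketch (hypothetical names and shapes; see #13166 for the actual PR):

```rust
use std::collections::HashMap;

/// Hypothetical shape of a Throttle command: each entry maps a job id
/// (a table_id or source_id) to an optional per-actor rate limit;
/// `None` lifts any existing limit for that job.
struct ThrottleCommand {
    rate_limits: HashMap<u32, Option<u32>>,
}

/// Merge the command into the currently effective limits.
fn apply_throttle(current: &mut HashMap<u32, u32>, cmd: &ThrottleCommand) {
    for (&job_id, &limit) in &cmd.rate_limits {
        match limit {
            Some(l) => {
                current.insert(job_id, l);
            }
            None => {
                current.remove(&job_id);
            }
        }
    }
}
```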

@tabVersion
Contributor

Since the FlowControl executor does not require an additional actor_id, I think all Source executors (including the fetch executor) and Chain executors will be followed by a FlowControl executor after #13057.
Thus, the throttle config change proposed in #13166 is also valid for existing tables (though it requires an upgrade).

@kwannoel
Contributor

kwannoel commented Nov 2, 2023

> I'm unsure if it's really necessary to make it so "general", that is, utilizing the Alter MV approach to achieve the rate modification. From my perspective, we may simply wrap each executor with a RateLimitExecutor without awareness in the persisted stream graph. A (developer) system parameter would encode rate limit settings for all actors (if present), which is subscribed to by those RateLimitExecutors.

I'm wondering whether we can make the system parameter granular enough. It seems kind of weird, because a system parameter is a system-wide setting.

Whereas here, we may want to specify a rate limit for a specific stream job rather than the whole stream graph. So a config change seems more reasonable.

@kwannoel
Contributor

kwannoel commented Nov 2, 2023

Actually, I think the two approaches can be compatible? Both have their use cases.

  • System parameter to throttle entire stream graph, if we don't know error source, and just want to make it stable ASAP.
  • Config change once we have narrowed down the source.

Within the flow control executor we have the following behaviour:

  1. A meta endpoint that can temporarily throttle the throughput of the entire stream graph, or stop it, as suggested by @yezizp2012. I don't think we need a system variable per se, because a system variable is persisted and would make the entire stream graph follow some rate_limit permanently.
  2. Config change support as done by @tabVersion in feat: use risectl to throttle Source and Chain by changing FlowControl params via config change #13166

@tabVersion
Contributor

will continue to push forward after closing #14384

@tabVersion tabVersion modified the milestones: release-1.6, release-1.7 Jan 9, 2024
@tabVersion
Contributor

Will continue moving the throttle inside the source exec and backfill exec.

@tabVersion tabVersion removed this from the release-1.7 milestone Mar 6, 2024
@tabVersion tabVersion added this to the release-1.8 milestone Mar 6, 2024
@tabVersion tabVersion modified the milestones: release-1.8, release-1.9 Apr 8, 2024
@tabVersion
Contributor

Closed as completed by #15948.

6 participants