
Support throttling an already created source/table #12997

Closed
hzxa21 opened this issue Oct 23, 2023 · 18 comments
@hzxa21
Collaborator

hzxa21 commented Oct 23, 2023

Currently, users can use SET RW_STREAMING_RATE_LIMIT = <rate_limit_per_actor> to rate-limit a source/table created in the session, but we don't have a way to throttle a specific source/table if it was not created with SET RW_STREAMING_RATE_LIMIT.

This is useful when data has accumulated in the source and processing it all in one barrier without throttling would cause an endless OOM loop.

@github-actions github-actions bot added this to the release-1.4 milestone Oct 23, 2023
@hzxa21
Collaborator Author

hzxa21 commented Oct 23, 2023

@tabVersion and I were thinking about whether we can easily support throttling an already created source by introducing a config to limit the "number of messages allowed" per barrier in the source. It seems that this needs a bigger refactoring, so I think we can adopt the existing FlowControlExecutor approach instead.
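The "number of messages allowed per barrier" idea can be illustrated with a minimal sketch; all names here are hypothetical and not the actual RisingWave implementation:

```rust
/// Hypothetical sketch: split a backlog of `pending` source messages into
/// per-barrier batches of at most `max_per_barrier` messages, so that a
/// single barrier never has to flush the whole backlog at once.
fn batches_per_barrier(pending: usize, max_per_barrier: usize) -> Vec<usize> {
    assert!(max_per_barrier > 0);
    let mut batches = Vec::new();
    let mut remaining = pending;
    while remaining > 0 {
        let take = remaining.min(max_per_barrier);
        batches.push(take);
        remaining -= take;
    }
    batches
}
```

A FlowControlExecutor reaches a similar outcome differently: instead of hard-capping each barrier's batch, it paces the rows flowing between barriers.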

@fuyufjh
Member

fuyufjh commented Oct 23, 2023

Agree.

In my mind, the user interface is altering a property of the materialized view. Previously discussed in #11929.

@chenzl25
Contributor

Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

@tabVersion
Contributor

tabVersion commented Oct 23, 2023

> Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

I think making it an individual executor will give us more freedom to place it anywhere in the graph if we want to make a quick fix. I am drafting an RFC explaining what the expected behavior is.

@fuyufjh
Member

fuyufjh commented Oct 23, 2023

> Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

> I think making it an individual executor will give us more freedom to place it anywhere in the graph if we want to make a quick fix. I am drafting an RFC explaining what the expected behavior is.

FYI, this part is perhaps already done in #11919 and #12295.

@tabVersion
Contributor

> Technically, we can modify the rate_limit property in the ChainNode of the Mv and restart the cluster to make it work. That is the simplest way.

> I think making it an individual executor will give us more freedom to place it anywhere in the graph if we want to make a quick fix. I am drafting an RFC explaining what the expected behavior is.

> FYI, this part is perhaps already done in #11919 and #12295.

That's true. The remaining work is on the ChainNode and the frontend, providing more ways to modify the rate limit.

@BugenZhao
Member

BugenZhao commented Oct 24, 2023

I'm unsure if it's really necessary to make it so "general", that is, utilizing the Alter MV approach to achieve the rate modification. From my perspective, we may simply wrap each executor with a RateLimitExecutor without awareness in the persisted stream graph. A (developer) system parameter would encode rate limit settings for all actors (if present), which is subscribed to by those RateLimitExecutors.
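A minimal sketch of this wrapping idea, where a shared atomic cell stands in for the subscribed system parameter (all names are hypothetical, not the actual RisingWave code):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical wrapper around an executor's output. Every wrapped
/// executor shares one rate-limit cell that a (developer) system
/// parameter could update at runtime; 0 is treated as "unlimited".
struct RateLimitExecutor {
    shared_limit: Arc<AtomicU64>,
}

impl RateLimitExecutor {
    /// How many rows of a `chunk_rows`-row chunk may pass through
    /// right now under the current limit.
    fn admit(&self, chunk_rows: u64) -> u64 {
        match self.shared_limit.load(Ordering::Relaxed) {
            0 => chunk_rows, // no limit configured
            limit => chunk_rows.min(limit),
        }
    }
}
```

Because the limit lives outside the persisted stream graph, updating it needs no recovery or plan change, which is the appeal of this approach.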

@tabVersion
Contributor

> I'm unsure if it's really necessary to make it so "general", that is, utilizing the Alter MV approach to achieve the rate modification. From my perspective, we may simply wrap each executor with a RateLimitExecutor without awareness in the persisted stream graph. A (developer) system parameter would encode rate limit settings for all actors (if present), which is subscribed to by those RateLimitExecutors.

I don't think it is necessary in most cases, but it is essential for reducing the data flowing into the cluster to prevent an OOM loop.
Let's make it a dev tool, a risectl function, rather than a common property, in the first version.

@kwannoel
Contributor

Hi, it seems @tabVersion is working on this issue. Could you please share more about the overall direction you will take?

@BugenZhao
Member

BugenZhao commented Oct 26, 2023

> Could you please share more of the overall direction you will take?

+1. So what's the eventual plan for updating the flow control rate online?

@tabVersion
Contributor

> Could you please share more of the overall direction you will take?

> +1. So what's the eventual plan for updating the flow control rate online?

I am proposing a config-change solution: risectl will send a Throttle command to apply throttle args to the actors related to a given table_id or source_id. PR #13166
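The command's payload could look roughly like the following sketch (hypothetical names and shapes; see #13166 for the actual PR):

```rust
use std::collections::HashMap;

/// Hypothetical shape of a Throttle command: each entry maps a job id
/// (a table_id or source_id) to an optional per-actor rate limit;
/// `None` lifts any existing limit for that job.
struct ThrottleCommand {
    rate_limits: HashMap<u32, Option<u32>>,
}

/// Merge the command into the currently effective limits.
fn apply_throttle(current: &mut HashMap<u32, u32>, cmd: &ThrottleCommand) {
    for (&job_id, &limit) in &cmd.rate_limits {
        match limit {
            Some(l) => {
                current.insert(job_id, l);
            }
            None => {
                current.remove(&job_id);
            }
        }
    }
}
```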

@tabVersion
Contributor

Since the FlowControl executor does not require an additional actor_id, I think all Source executors (including the fetch executor) and Chain executors will be followed by a FlowControl executor after #13057.
Thus, the throttle config change proposed in #13166 is also valid for existing tables (though it requires an upgrade).

@kwannoel
Contributor

kwannoel commented Nov 2, 2023

> I'm unsure if it's really necessary to make it so "general", that is, utilizing the Alter MV approach to achieve the rate modification. From my perspective, we may simply wrap each executor with a RateLimitExecutor without awareness in the persisted stream graph. A (developer) system parameter would encode rate limit settings for all actors (if present), which is subscribed to by those RateLimitExecutors.

I'm wondering whether we can make the system parameter granular enough. It seems kind of weird, because a system parameter is a system-wide setting.

Whereas here, we may want to specify a rate limit for a specific stream job rather than the whole stream graph. So a config change seems more reasonable.

@kwannoel
Contributor

kwannoel commented Nov 2, 2023

Actually, I think the two approaches can be compatible? Both have their use cases.

  • System parameter to throttle entire stream graph, if we don't know error source, and just want to make it stable ASAP.
  • Config change once we have narrowed down the source.

Within the flow control executor we have the following behaviour:

  1. A meta endpoint that can temporarily throttle the throughput of the entire stream graph, or stop it, as suggested by @yezizp2012. I don't think we need a system variable per se, because a system variable is persisted and would make the entire stream graph follow some rate_limit permanently.
  2. Config change support as done by @tabVersion in feat: use risectl to throttle Source and Chain by changing FlowControl params via config change #13166

@tabVersion
Contributor

will continue to push forward after closing #14384

@tabVersion tabVersion modified the milestones: release-1.6, release-1.7 Jan 9, 2024
@tabVersion
Contributor

Will continue moving the throttle inside the source exec and backfill exec.

@tabVersion tabVersion removed this from the release-1.7 milestone Mar 6, 2024
@tabVersion tabVersion added this to the release-1.8 milestone Mar 6, 2024
@tabVersion tabVersion modified the milestones: release-1.8, release-1.9 Apr 8, 2024
@tabVersion
Contributor

Closed as completed by #15948.

6 participants