Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

source_rate_limit doesn't work for file source #19296

Closed
xxchan opened this issue Nov 7, 2024 · 6 comments · Fixed by #19338
Closed

source_rate_limit doesn't work for file source #19296

xxchan opened this issue Nov 7, 2024 · 6 comments · Fixed by #19338
Assignees
Labels
type/bug Something isn't working
Milestone

Comments

@xxchan
Copy link
Member

xxchan commented Nov 7, 2024

Describe the bug

FsFetch is not handled in update_source_rate_limit_by_source_id

Error message/log

No response

To Reproduce

Add source_rate_limit = 0 in e2e_test/source_inline/fs/posix_fs.slt, you can see data still produced

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response

@xxchan xxchan added the type/bug Something isn't working label Nov 7, 2024
@github-actions github-actions bot added this to the release-2.2 milestone Nov 7, 2024
@kwannoel
Copy link
Contributor

kwannoel commented Nov 7, 2024

What's the corresponding streaming source executor for it? Is it the fetch_executor?

Edit: Yes it is.

@fuyufjh
Copy link
Member

fuyufjh commented Nov 11, 2024

Is someone working on this? cc @kwannoel @tabVersion

@kwannoel
Copy link
Contributor

kwannoel commented Nov 11, 2024

Is someone working on this? cc @kwannoel @tabVersion

I'll check with @tabVersion, no bandwidth to work on this recently.

Edit: He will manage this issue. Many thanks 🙏.

@xxchan
Copy link
Member Author

xxchan commented Nov 11, 2024

Add link for original customer message. Please notify them after the fix.

https://risingwave-community.slack.com/archives/C06LMDNJRMK/p1730981695290009?thread_ts=1730973353.631599&cid=C06LMDNJRMK

@tabVersion
Copy link
Contributor

Is someone working on this? cc @kwannoel @tabVersion

I'll check with @tabVersion, no bandwidth to work on this recently.

Edit: He will manage this issue. Many thanks 🙏.

I am doing a quick check on code. Will fix this ASAP.

@tabVersion
Copy link
Contributor

#16472 seems to forget apply the same change as StreamSource to StreamFsFetch

After the simple fix, create table ... source_rate_limit = 0 ... format ... encode ... can work but alter cannot.


Let me explain further on the behavior and why we meet this bug.

take this sql as example

dev=> CREATE TABLE diamonds (
    carat FLOAT,
    cut TEXT,
    color TEXT,
    depth FLOAT,
) WITH (
    connector = 'posix_fs',
    match_pattern = 'data*.csv',
    posix_fs.root = 'e2e_test/source_inline/fs/data',
    source_rate_limit = 0
) FORMAT PLAIN ENCODE CSV ( without_header = 'false', delimiter = ',');
CREATE_TABLE
dev=>
dev=> alter table diamonds set source_rate_limit to default ;
ALTER_SOURCE

obvervations:

  • Fs fetch exec starts with actor_id [9, 10, 11, 12] and fs list exec starts with actor_id 13 as singleton.
  • When handling alter command, meta sends Mutation to actor 13 instead of [9, 10, 11, 12]
    • run command: {3: {13: None}} stands for doing Mutation::Throttle to actor 13 and change rate_limit to None

reason:

  • FS source has a sightly different compute graph as MQ source.
    • Fs list exec in FS source takes the position of source exec in MQ source. We manually add fs fetch exec behind the fs fetch exec.
  • Meta treats the FS source and MQ source the same, so it alwasys tries to alter actor the fs list exec instead of the fetch ones.
  • So we need to tell if the source is FS source or MQ source when handling alter source_rate_limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants