-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem of online scaling for source backfill #18300
Comments
I think there's no special code change needed to disallow online scaling for blocking DDL on source. The existing mechanism is to grab
|
Does this problem exist for a normal |
No, because |
This comment was marked as resolved.
This comment was marked as resolved.
See this PR for more #18112 |
Problem
Currently,
SourceBackfill
has a hacky stuff: It needs toscan()
the whole state table, including splits written by other actors.risingwave/src/stream/src/executor/source/source_backfill_executor.rs
Lines 627 to 642 in 29db1d9
The reason is to handle split migration (i.e., online scaling).
SourceBackfill
has 2 stages (backfill -> forward upstream). After it entered stage 2, it cannot go back to stage 1. So if an unfinished backfill work is migrated to an actor in stage 2, it cannot do backfilling.However, the hack doesn't work correctly now, as shown in #18033 (comment). It's because at the beginning, the actor will only read the state written by itself. It needs to wait until it can read all actors' written data. i.e., wait for the first checkpoint has been available.
Note1: checkpoint is async, so we cannot rely on sth like "wait for N barriers to pass".
Note2: an actor doesn't know the total number of splits, so we cannot rely on the condition like "states.len() == num_splits". This is unlike
WatermarkFilter
executor, which has a similar hack, but it relies on "all vnodes are written". However, source splits are not distributed by vnodes.Solution
There are several solutions to fix this bug:
Patch the hack: Wait until the "inited" state. We can use
try_wait_epoch
.Rewrite the code to allow transition between stage 1 and 2: I don't think there's any technical restrictions to prevent us doing this. The reason why we didn't do it in the first place might just be an overlook. (But single direction transition looks slightly more natural though).
Disallow online scaling: Then the problem will disappear! This is inspired by @BugenZhao: MV backfilling actually is in a similar situation. It also goes from stage 1 to 2 in one direction. And currently we disallow online scaling for it. So we can do the same thing for Source backfilling.
More precisely,
More on this: https://risingwave-labs.slack.com/archives/C05AZUR6Q5P/p1724227403344349
I prefer solution 3 because it can remove the hack and make the logic simpler to understand. In the long term, if we want to allow online scaling for foreground DDL, solution 2 might be the best (both for Source and MV). But it will need larger effort to implement.
edit: Now we did both 1 & 3. 3 is automatically done after we do blocking DDL, because of
reschedule_lock
. But we found it's not enough and still need 1. See #18112edit: Imagine we don't have blocking DDL and don't track progress, it seems there's no way to disallow online scaling. So either 1 or 2 must be needed.
A little more
Relationship of the split migration problem with #18338 (blocking DDL) and #18299 (Finish backfill faster):
Blocking DDL requires finishing backfill faster for better UX. And if we finish backfill faster, the chances we need to handle split migration is lower..
Previously we only considered background backfill, and the backfill can be blocked for a long time, and we have to handle split migration gracefully. (But we forgot about the idea of rebuilding actors.)
The text was updated successfully, but these errors were encountered: