Support separate replicate executor from arrangement backfill #13207
Comments
If we separate the replicate executor from the arrangement backfill, we have a chance to share the no-shuffle backfill executor between these two scenarios. That is the benefit. My concern is that if we separate them, we will also separate the …
What will be the development effort required? That's a good point, I didn't think of it.
Oh I see, like the replicate executor subscribes to the backfill manager, and when it receives the notification to stop, it stops? +1 for it; not coupling it with the barrier is best.
I think we need to provide a way to let different executors refer to the same state table so that they can write and read the same data.
Yes. Not relying on the barrier is best. Using a channel is possible, since these executors are in the same CN.
Is that not currently possible? 🤔 I'm thinking that the replicated state table can already be used as is.
Oh so this …
You're right. The backfill manager may not even be necessary here. I'm wondering if there's some way to directly create a channel between the executors. When the CN builds the actors, I suppose that's when it should instantiate the channel.
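As a rough sketch of the channel idea in this thread (all names here are hypothetical, not the actual RisingWave API), the CN could create a watch channel when it builds the two actors; the backfill side sends a completion signal, and the replicate executor stops replicating once it sees it, without involving the barrier:

```rust
// Hypothetical sketch only; not the real RisingWave executor API.
use tokio::sync::watch;

struct ReplicateExecutor {
    // Becomes `true` once backfill is complete.
    backfill_done: watch::Receiver<bool>,
}

impl ReplicateExecutor {
    async fn run(mut self) {
        // Keep replicating until backfill signals completion
        // (or the sender side is dropped).
        while !*self.backfill_done.borrow() {
            if self.backfill_done.changed().await.is_err() {
                break;
            }
        }
        // From here on, stop writing to the replicated state table and
        // simply forward upstream updates downstream.
    }
}

/// The CN would instantiate the channel when it builds both actors.
fn build_actors() -> (watch::Sender<bool>, ReplicateExecutor) {
    let (backfill_done_tx, backfill_done_rx) = watch::channel(false);
    (
        backfill_done_tx,
        ReplicateExecutor { backfill_done: backfill_done_rx },
    )
}

#[tokio::main]
async fn main() {
    let (backfill_done_tx, replicate) = build_actors();
    let replicate_handle = tokio::spawn(replicate.run());
    // ... arrangement backfill runs; once it finishes:
    backfill_done_tx.send(true).ok();
    replicate_handle.await.unwrap();
}
```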
Oh, I guess there could be some in-memory state that isn't shared. Separate read/write paths should not be an issue; LocalStateStore uses a r/w lock for its read version.
I mean use …
Background of Replication
Replication is required for arrangement backfill. See https://www.notion.so/risingwave-labs/Arrangement-Backfill-31d4d0044726451c9cc77dc84a0f8bb2#5fc32752d4934aaaab6f739519169a42
Suppose backfill will continue from `pos = 3` onwards. This means `pos < 3` has been backfilled. Data at `pos >= 3` will not have been processed in updates, since updates would only process `pos < 3` for backfill, i.e. updates on backfilled data. Backfill expects to read this uncommitted data via snapshot read; otherwise the uncommitted data at `pos >= 3` is lost.
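To make the invariant concrete, here is a contrived sketch (not actual RisingWave code) of how upstream updates relate to a backfill position of `pos = 3`:

```rust
// Rows with pk < pos are already backfilled, so upstream updates on them are
// processed now; rows with pk >= pos are not, so their (possibly uncommitted)
// data must stay visible to the later snapshot read, which is what the
// replicated state provides.
fn already_backfilled(backfill_pos: i64, update_pk: i64) -> bool {
    update_pk < backfill_pos
}

fn main() {
    let backfill_pos = 3;
    for update_pk in 0..6 {
        if already_backfilled(backfill_pos, update_pk) {
            println!("pk {update_pk}: backfilled, process the update now");
        } else {
            println!("pk {update_pk}: not backfilled yet, read later via snapshot read");
        }
    }
}
```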
Motivations
Currently the replication logic is coupled with arrangement backfill.
The main thing is that we also make arrangement backfill much more maintainable. Now it could potentially just reuse the `no_shuffle_backfill` executor. Only the logic from the dispatcher needs to be changed to handle the scaling aspect.
If we decouple it, we could potentially share the state table associated with the replicate executor to do non-checkpoint reads for other executors (not sure about the precise use case yet).
Issues
Stopping replication when backfill is complete
Previously, the key issue was that the replicate executor needs to stop replication once backfill is complete, to avoid consuming memory. However, meta can see when backfill is complete, and at that point it should schedule a config change, which can stop the corresponding replicate executor from replicating if there are no other dependencies on it.
After that, the replicate executor should just forward updates downstream.
No Shuffle Backfill
No-shuffle backfill needs to support vnode-level state to be compatible with arrangement backfill. We can use separate executors first, since that logic is already implemented in arrangement backfill.
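"Vnode-level state" here means tracking backfill progress per virtual node rather than as a single position, the way arrangement backfill does. A rough sketch with illustrative stand-in types (not RisingWave's actual definitions):

```rust
use std::collections::HashMap;

// Illustrative stand-ins, not RisingWave's real types.
type VirtualNode = u16;
type EncodedPk = Vec<u8>;

#[derive(Debug)]
enum VnodeBackfillProgress {
    NotStarted,
    InProgress { current_pos: EncodedPk }, // last backfilled pk within this vnode
    Completed,
}

struct BackfillState {
    per_vnode: HashMap<VirtualNode, VnodeBackfillProgress>,
}

impl BackfillState {
    fn update_progress(&mut self, vnode: VirtualNode, pos: EncodedPk) {
        self.per_vnode
            .insert(vnode, VnodeBackfillProgress::InProgress { current_pos: pos });
    }

    /// The executor is finished only once every vnode it owns has completed.
    fn is_completed(&self) -> bool {
        self.per_vnode
            .values()
            .all(|p| matches!(p, VnodeBackfillProgress::Completed))
    }
}

fn main() {
    let mut state = BackfillState { per_vnode: HashMap::new() };
    state.update_progress(0, b"key-3".to_vec());
    state.per_vnode.insert(1, VnodeBackfillProgress::Completed);
    println!("all vnodes completed: {}", state.is_completed());
}
```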
Compatibility with the table being replicated (target table)
This is actually independent, since the same issue affects the replication logic whether or not it is part of arrangement backfill. I'm just including it here for clarity.
The replicate executor will maintain its shared buffer separately from the target table.
For correctness, the data in the shared buffer needs to follow the same schema as the table being replicated, since the replicated data (from the replicated table) and the committed data (from the target table) get merged.
The backfilling stream job could select a subset of columns. In that case, replicate will need to fill in `NULL` for missing columns.
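A simplified sketch of that padding (plain Rust stand-ins, not RisingWave's actual row or schema types): the replicated write is widened to the target table's full schema, with `NULL` for any column the backfilling stream job did not select.

```rust
// Illustrative only; real datums are typed and rows come from stream chunks.
type Datum = Option<String>; // `None` stands in for NULL

/// `selected_columns[i]` is the index in the target table schema of the i-th
/// column that the backfilling stream job selected.
fn pad_to_target_schema(
    replicated_row: &[Datum],
    selected_columns: &[usize],
    target_column_count: usize,
) -> Vec<Datum> {
    let mut row = vec![None; target_column_count]; // start with all NULLs
    for (value, &target_idx) in replicated_row.iter().zip(selected_columns) {
        row[target_idx] = value.clone();
    }
    row
}

fn main() {
    // Target table has 4 columns; the stream job only selected columns 0 and 2.
    let replicated_row = vec![Some("a".to_string()), Some("c".to_string())];
    let padded = pad_to_target_schema(&replicated_row, &[0, 2], 4);
    assert_eq!(
        padded,
        vec![Some("a".to_string()), None, Some("c".to_string()), None]
    );
    println!("{padded:?}");
}
```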