refactor(meta): clarify the completeness of internal table catalogs #18944
Conversation
Force-pushed with commits: “only notify internal tables”, “add more comments” (Signed-off-by: Bugen Zhao <[email protected]>)
// so we need to notify the frontend to delete them here.
// The frontend will ignore the request if the object does not exist,
// so it's safe to always notify.
self.notify_frontend(Operation::Delete, build_relation_group_for_delete(objs))
As the notification now goes out only after the job has been successfully scheduled and built, a failure here does not necessarily mean that the `Add` notification was ever sent. That's why we changed all `DROP` handlers in the frontend to behave like `DROP .. IF EXISTS`.

An alternative could be introducing a new operation like `Cleanup` for this purpose, ensuring `Delete` remains strictly applied. But it seems too ad-hoc to me, as there's not much other usage for it.
Can there be a race condition, whereby we have a `Delete` of some catalog followed by an `Add` of that catalog?
I think not, because `try_abort_creating_streaming_job` is called only after `create_streaming_job_inner` has returned with an error, at which point the `Add` was either already pushed into the notification queue or never issued at all. After entering this `try_abort` function, no further `Add` operations will occur.
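
A self-contained toy sketch of that ordering argument (the names mirror the ones discussed above, but the signatures and the in-memory queue are simplified placeholders, not the actual meta-service code):

```rust
#[derive(Debug)]
enum Notification {
    Add(u32),
    Delete(u32),
}

fn create_streaming_job_inner(job_id: u32, queue: &mut Vec<Notification>) -> Result<(), String> {
    // The `Add` notification is either pushed into the queue here or not issued at all.
    queue.push(Notification::Add(job_id));
    Err("scheduling or building the job failed".into())
}

fn try_abort_creating_streaming_job(job_id: u32, queue: &mut Vec<Notification>) {
    // Entered only after the inner call has returned with an error;
    // no further `Add` for this job can be produced from here on.
    queue.push(Notification::Delete(job_id));
}

fn main() {
    let mut queue = Vec::new();
    if create_streaming_job_inner(1, &mut queue).is_err() {
        try_abort_creating_streaming_job(1, &mut queue);
    }
    // For a given job, `Add` (if present) always precedes `Delete` in the queue,
    // so the frontend never observes a stale `Add` after the `Delete`.
    println!("{queue:?}");
}
```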
🤔 We ensured that `CREATE` and `DROP` will be properly paired in #18476. Even if the `CREATE` notification fails to send due to an FE or meta node reboot, or a broken connection between them, the creation of the mview and its internal table catalogs will still be synchronized through an initial snapshot.

Oops, I realized that we'll eventually notify the frontend nodes to complete the catalogs for both the job and the internal tables in risingwave/src/meta/src/controller/streaming_job.rs (lines 771 to 797 in 265a7ac).

So this PR is more of a refactor than a fix. Actually, the main motivation of this PR is to enable #18976, where we don't expect the frontend nodes to receive a … However, I think it's also okay to bypass the check and allow …
Have updated the PR to only notify internal tables.
@@ -1643,6 +1645,7 @@ impl DdlController {
    table_parallelism,
    max_parallelism.get(),
);
let internal_tables = table_fragments.internal_tables();
The `internal_tables` field in `CreateStreamingJobContext` will now be complete.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
When creating a streaming job, the internal tables directly scraped from the frontend's `StreamFragmentGraph` are incomplete, because several fields like `fragment_id` or `vnode_count` are not filled yet and will be determined later. Previously, we notified the frontend nodes with these catalogs but did not update them afterward. This causes problems now that variable vnode count support is introduced, as the vnode count of a table is a significant property when scheduling a batch scan over it. This PR changes the flow to notify the internal table catalogs only after a `TableFragments` is built, where all information is final and complete.

This PR also revises the documentation, comments, and naming of related snippets to clarify the completeness of internal table catalogs during various phases of creating a streaming job.
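
To illustrate the incomplete-versus-complete distinction, here is a toy, self-contained sketch; the struct and field names are simplified stand-ins rather than the actual RisingWave catalog types.

```rust
#[derive(Debug)]
struct InternalTableCatalog {
    id: u32,
    name: String,
    // Unknown when the catalog is first scraped from the fragment graph.
    fragment_id: Option<u32>,
    vnode_count: Option<usize>,
}

fn main() {
    // What the frontend's StreamFragmentGraph can provide up front.
    let incomplete = InternalTableCatalog {
        id: 42,
        name: "__internal_mv_42_materialize".into(),
        fragment_id: None,
        vnode_count: None,
    };

    // What becomes available once TableFragments has been built; only this
    // version is broadcast to the frontend nodes under this PR.
    let complete = InternalTableCatalog {
        fragment_id: Some(7),
        vnode_count: Some(256),
        ..incomplete
    };
    assert!(complete.fragment_id.is_some() && complete.vnode_count.is_some());
    println!("notify frontend with complete catalog: {complete:?}");
}
```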
Checklist

- `./risedev check` (or alias, `./risedev c`)

Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.