fix: don't panic and trigger recovery when applying cancel command for created job #19291
// Otherwise our persisted state is dirty.
let mut table_ids = table_fragments.internal_table_ids();
table_ids.push(table_id);
mgr.catalog_manager.assert_tables_deleted(table_ids).await;
cancel_create_materialized_view_procedure returns an error in only two situations: (1) writing to the metastore fails, or (2) the job has already been created successfully. In the second case, this assertion causes a meta panic. Because the cancel command has already stopped the actors on the CNs, we return an error here directly so that recovery rebuilds them.
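A minimal sketch of the idea in this comment, using a simplified model rather than the real meta code: the types `CancelError` and `JobState` and the functions `cancel_job` and `apply_cancel` are hypothetical names invented for illustration. It shows the caller propagating the cancel error (so that recovery can take over) instead of asserting that the tables were deleted.

```rust
// Hypothetical, simplified model; none of these names are RisingWave APIs.
#[derive(Debug, PartialEq)]
enum CancelError {
    MetaStoreWriteFailed,
    JobAlreadyCreated,
}

#[derive(Debug, PartialEq)]
enum JobState {
    Creating,
    Created,
}

/// Stand-in for the cancel procedure: it fails only when the job is
/// already created or when the (simulated) metastore write fails.
fn cancel_job(state: &JobState, store_ok: bool) -> Result<(), CancelError> {
    if *state == JobState::Created {
        return Err(CancelError::JobAlreadyCreated);
    }
    if !store_ok {
        return Err(CancelError::MetaStoreWriteFailed);
    }
    Ok(())
}

/// Propagates the failure to the caller instead of panicking on an
/// assertion; the caller is assumed to trigger recovery on `Err`.
fn apply_cancel(state: &JobState, store_ok: bool) -> Result<(), String> {
    cancel_job(state, store_ok)
        .map_err(|e| format!("cancel failed, recovery required: {e:?}"))
}
```

With this shape, a cancel that races with a job that has just finished creating surfaces as an ordinary error rather than a process panic.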
// It won't clean the tables on failure,
// since the failure could be recoverable.
// As such it needs to be handled here.
self.barrier_manager_context
If the job has already been created, unregistering its tables from hummock leads to data inconsistency: meta will crash-loop during commit epoch because of the missing state table ID. We therefore unregister only after the catalog has been successfully deleted.
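The ordering described here can be sketched with a toy model. The `Meta` struct, its fields, and its methods are hypothetical and stand in for the catalog and hummock registrations; they are not the actual RisingWave types. The point is only that the storage unregister step runs after, and only if, the catalog deletion succeeds.

```rust
use std::collections::HashSet;

// Hypothetical toy model of the meta state; not RisingWave's real types.
struct Meta {
    catalog: HashSet<u32>, // table ids present in the catalog
    hummock: HashSet<u32>, // state table ids registered in storage
    job_created: bool,     // whether the job already finished creating
}

impl Meta {
    /// Simulated catalog deletion: refuses when the job is already created.
    fn delete_from_catalog(&mut self, ids: &[u32]) -> Result<(), String> {
        if self.job_created {
            return Err("job already created".to_string());
        }
        for id in ids {
            self.catalog.remove(id);
        }
        Ok(())
    }

    /// Unregisters from storage only after the catalog delete succeeded;
    /// on failure the `?` returns early and storage is left untouched.
    fn cancel(&mut self, ids: &[u32]) -> Result<(), String> {
        self.delete_from_catalog(ids)?;
        for id in ids {
            self.hummock.remove(id);
        }
        Ok(())
    }
}
```

If the two steps were swapped, a failed catalog delete would leave tables in the catalog with no matching storage registration, which is the inconsistency the comment warns about.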
Is this change needed in main/2.1?
No, the behavior has already changed for the SQL backend in main/2.1.
Graphite Automations: "release branch request review" took an action on this PR (11/07/24). 1 reviewer was added to this PR based on xxchan's automation.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.