-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
main, view: Pair view builder drain with its start #21909
base: master
Are you sure you want to change the base?
Conversation
The patch solved a problem related to an initialization order, but we ran into another one -- after moving the initialization of group0_service, it ended up being destroyed AFTER the CDC generation service would. Since CDC generations are accessed in `storage_service::topology_state_load()`: ``` for (const auto& gen_id : _topology_state_machine._topology.committed_cdc_generations) { rtlogger.trace("topology_state_load: process committed cdc generation {}", gen_id); co_await _cdc_gens.local().handle_cdc_generation(gen_id); ``` we started getting the following failure: ``` Service &seastar::sharded<cdc::generation_service>::local() [Service = cdc::generation_service]: Assertion `local_is_initialized()' failed. ``` We're reverting the patch to go back to a more stable version of Scylla and in the following commit, we'll solve the original issue in a more systematic way. This reverts commit 7bad837.
🔴 CI State: FAILUREBuild Details:
|
🔴 CI State: FAILURE✅ - Build Failed Tests (1/36079):Build Details:
|
Looks like #21628. Rerunning CI |
🟢 CI State: SUCCESS✅ - Build Build Details:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I have only one comment.
db/view/view.cc
Outdated
future<> view_builder::stop() { | ||
vlogger.info("Stopping view builder"); | ||
return drain(); | ||
co_return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: put an assertion here which checks that drain()
was indeed run (it needs to be run in order for the view builder to be properly stopped) - for example, you could check if _as.abort_requested()
is true
Before these changes, we only drained the view builder when stopping it, which was paired with its *construction*. However, since the view builder depends on group0_service, which is initialized AFTER the view builder is constructed, that could lead to segmentation faults. To amend it, we pair draining the view builder with its start, while leaving the lambda destroying it paired with its construction. This way we should have a proper initialization order, avoid bugs, and reinforce our understanding of how Scylla starts up. Fixes scylladb#20772
Version 2:
|
🔴 CI State: FAILURE✅ - Build Failed Tests (2/36177):Build Details:
|
Rerunning CI |
🔴 CI State: FAILURE✅ - Build Failed Tests (1/36344):Build Details:
|
I believe it's #15715. Retriggering CI |
🟢 CI State: SUCCESS✅ - Build Build Details:
|
This PR reverts a patch that solved a problem related to an initialization
order. Unfortunately, after merging it, we ran into another one -- after
changing the place where group0_service was initialized, it ended up
being destroyed AFTER the CDC generation service would. Since CDC
generations are accessed in
storage_service::topology_state_load()
:we started getting the following failure:
To solve the issue, we revert the changes and approach the problem
in another way: before that patch, we only drained the view builder
when stopping it, which was paired with its construction. However,
since the view builder depends on group0_service, which is initialized
AFTER the view builder is constructed, that could lead to segmentation
faults. To amend it, we pair draining the view builder with its start, while
leaving the lambda destroying it paired with its construction. This way
we should have a proper initialization order, avoid bugs, and reinforce
our understanding of how Scylla starts up.
Backport: The patch I'm reverting made it to 6.2, so we want to backport this one there too.
Fixes #20772
Fixes #21534