-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-24.1: sql/stats: fix panic catching in the stats cache, show histogram #136040
Conversation
8a626bc
to
4846dc8
Compare
Thanks for opening a backport. Please check the backport criteria before merging:
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
Also, please add a brief release justification to the body of your PR to justify this |
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rytaft)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @blathers-crl[bot] and @rytaft)
pkg/sql/show_histogram.go
line 73 at r1 (raw file):
// manipulate locks. if ok, e := errorutil.ShouldCatch(r); ok { err = e
See my comment here: #136041 (review)
This commit removes a top-level panic catcher from GetTableStats in the stats cache, and instead adds a couple of lower level panic catchers. The problems with the top-level catcher are: - There are other entry points into the stats cache besides GetTableStats, so it missed some cases. - Because the stats cache contains shared state, condition variables, mutexes, etc, it was possible to get into a scenario where a panic would bypass important logic (such as broadcasting when a resource is available), but the panic would be caught at the top level and leave the cache in a bad state. The new panic catchers are below the level of this shared state manipulation, but also ensure that all paths into the stats cache that might panic are covered. This commit also adds panic catching to SHOW HISTOGRAM so that decoding errors do not crash the process when running SHOW HISTOGRAM. Fixes #135940 Release note (bug fix): Fixed an issue where corrupted table statistics could cause the cockroach process to crash.
4846dc8
to
51a5f2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @michae2)
pkg/sql/show_histogram.go
line 73 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
See my comment here: #136041 (review)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @blathers-crl[bot])
Backport 1/1 commits from #135944 on behalf of @rytaft.
/cc @cockroachdb/release
This commit removes a top-level panic catcher from
GetTableStats
in the statscache, and instead adds a couple of lower level panic catchers. The problems
with the top-level catcher are:
GetTableStats
, soit missed some cases.
etc, it was possible to get into a scenario where a panic would bypass
important logic (such as broadcasting when a resource is available), but the
panic would be caught at the top level and leave the cache in a bad state.
The new panic catchers are below the level of this shared state manipulation,
but also ensure that all paths into the stats cache that might panic are
covered.
This commit also adds panic catching to
SHOW HISTOGRAM
so that decoding errorsdo not crash the process when running
SHOW HISTOGRAM
.Fixes #135940
Release note (bug fix): Fixed an issue where corrupted table statistics could cause the cockroach process to crash.
Release justification: