-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Throttle bulkhead from the circuit breaker on state transition #247
base: mkipper/global-circuit-breaker-simple-integer
Are you sure you want to change the base?
Throttle bulkhead from the circuit breaker on state transition #247
Conversation
…imple-integer Force-enable host-based circuits
5e9b561
to
13a7241
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you had any success with semsim to show the capacity before and after?
I'm also interested in how this behaves with a large error_timeout
. In particular, we may have capacity improvements, but that would be at the expense for a slower recovery. In the worst case scenario, where all workers had their circuit open at the same time, n - 1
workers will wait (edit: at least) an additional error_timeout
seconds after recovery. I think I got that right 🤔
void test_sliding_window() | ||
{ | ||
semian_simple_sliding_window_shared_t window; | ||
assert_le_int(sizeof(window), 4096, "window size is greater than a page"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about sysconf(_SC_PAGESIZE)
instead of 4096?
No, I don't think this works at all... |
What
This PR attempts to address #244. It throttles the number of bulkhead tickets to 1 when the circuit opens, throttles to
success_threshold
when the circuit moves tohalf_open
, and removes the throttle when the circuit finally closes.Why
When the circuit is in an open state, we still allow all the workers that can acquire tickets to attempt the transition to
half_open
. When the circuit is half open, we allow all those workers to attempt the transition toclosed
, which can sap much needed capacity during a sustained outage.Note: This PR is still a work in progress, and there are several corner cases that need to be addressed. I'm looking for a high-level review of the approach.
Issues
SEM_UNDO
works when one process applies the throttle, and another removes it. Or what happens when the process that applied the throttle terminates. Needs to be thought through more.cc: @Shopify/servcomm