[rescheduling] Add mutex #188
base: stable/yoga-m3
Changes from all commits
6c75f06
Any reason not to use the `provisioning_status`?
So far my reasoning for not using `provisioning_status` was that it is bound to specific values. There are in fact possible values that aren't used for load balancers so far (e.g. `ALLOCATED`). However, we can't use that field at all, for the following reason: whatever locking mechanism we use, the `controller_worker` has to use it as well when updating a load balancer. It calls `status_manager.update_status` after syncing the LBs, and that function determines the status to set by looking at the current value of `provisioning_status`. If we let the controller worker change that field, the value it was set to before would be lost. The controller worker could remember the value of that field for every load balancer, but that would not be crash-safe.

It turns out there is a simple solution, however: I'll use the amphora `status` field (which is bound to the same range of values as `provisioning_status`) and set it to `ALLOCATED` (a value so far only used for amphora entries representing devices) while the associated load balancer is locked by either the rescheduling arbiter or the controller worker. `update_status` would then reset that field (unlock the LB). This can be made crash-resistant by having workers reset the field at startup to the value of the associated LB's `provisioning_status`, for every amphora whose `compute_flavor` matches the host the worker is assigned to.
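A minimal sketch of that locking scheme at the database level. All names here (the `amphora`/`load_balancer` tables and the `try_lock_load_balancer`/`unlock_load_balancer` helpers) are illustrative assumptions, not the actual Octavia schema or the code in this PR; the point is that a conditional UPDATE makes acquiring the mutex atomic:

```python
# Illustrative sketch only: table and column names mirror the discussion
# (amphora.status, load_balancer.provisioning_status), not the real schema.
from sqlalchemy import text

LOCKED = "ALLOCATED"  # so far only used for amphora rows representing devices


def try_lock_load_balancer(conn, lb_id):
    """Try to take the mutex by setting the amphora status to ALLOCATED.

    Returns the previous status on success (the caller needs it to
    unlock later), or None if someone else already holds the lock.
    """
    row = conn.execute(
        text("SELECT status FROM amphora WHERE load_balancer_id = :lb"),
        {"lb": lb_id},
    ).first()
    if row is None or row.status == LOCKED:
        return None
    # The conditional UPDATE acts as an atomic compare-and-set: it only
    # matches if the status is still what we just read, so the row count
    # tells us whether we won a potential race.
    won = conn.execute(
        text(
            "UPDATE amphora SET status = :locked "
            "WHERE load_balancer_id = :lb AND status = :old"
        ),
        {"locked": LOCKED, "lb": lb_id, "old": row.status},
    ).rowcount == 1
    return row.status if won else None


def unlock_load_balancer(conn, lb_id, new_status):
    """Release the mutex by writing a regular status value back."""
    conn.execute(
        text(
            "UPDATE amphora SET status = :new "
            "WHERE load_balancer_id = :lb AND status = :locked"
        ),
        {"new": new_status, "lb": lb_id, "locked": LOCKED},
    )
```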
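The crash-recovery part would then be a startup sweep, again only a sketch under the same assumptions: each worker resets any row it is responsible for (matching `compute_flavor`) from the lock value back to the owning load balancer's `provisioning_status`:

```python
def recover_stale_locks(conn, this_host):
    """Startup sweep: release locks a crashed worker left behind by
    resetting the status to the owning LB's provisioning_status, but
    only for amphorae whose compute_flavor matches this worker's host.
    """
    conn.execute(
        text(
            "UPDATE amphora SET status = ("
            "    SELECT lb.provisioning_status FROM load_balancer lb"
            "    WHERE lb.id = amphora.load_balancer_id"
            ") WHERE status = :locked AND compute_flavor = :host"
        ),
        {"locked": LOCKED, "host": this_host},
    )
```

In the scheme described above, `update_status` would write the freshly computed status rather than restore the old one; the conditional-UPDATE pattern stays the same either way.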
This seems pretty over-engineered to me. So you want to introduce locks into the status_manager as well as the controller_worker? The status is considered a user-facing property, not something to act on.
It's not over-engineered; it's basically exactly what you proposed, just with all the necessary considerations laid out.
Unless I misunderstood your initial proposal (the one with `provisioning_status`).