Fixed cluster affinity scheduling evaluation order when scheduling is triggered via WorkloadRebalancer #5425
base: master
Conversation
Welcome @bharathguvvala! It looks like this is your first PR to karmada-io/karmada 🎉
hi @bharathguvvala, really glad to see your PR submission! By the way, there may be some minor problems with the CI right now; could you please fix the CI and squash the commit records into one? thanks~
@chaosi-zju Sure, will do that.
@@ -301,7 +301,7 @@ func (c *EndpointsliceDispatchController) cleanOrphanDispatchedEndpointSlice(ctx
	return nil
}

-func (c *EndpointsliceDispatchController) dispatchEndpointSlice(ctx context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {
+func (c *EndpointsliceDispatchController) dispatchEndpointSlice(_ context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {
hi @bharathguvvala, this line may have been changed by mistake; ctx is required~
Suggested change:
-func (c *EndpointsliceDispatchController) dispatchEndpointSlice(_ context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {
+func (c *EndpointsliceDispatchController) dispatchEndpointSlice(ctx context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {
yea this seems like a mistake. will correct it.
pkg/scheduler/scheduler.go
Outdated
@@ -657,7 +661,7 @@ func (s *Scheduler) scheduleClusterResourceBindingWithClusterAffinity(crb *workv
	return err
}

-func (s *Scheduler) scheduleClusterResourceBindingWithClusterAffinities(crb *workv1alpha2.ClusterResourceBinding) error {
+func (s *Scheduler) scheduleClusterResourceBindingWithClusterAffinities(crb *workv1alpha2.ClusterResourceBinding, performFreshScheduling bool) error {
hi, adding a parameter performFreshScheduling does work, but it may not be the best practice~
It just occurred to me that we have similar logic in:
karmada/pkg/scheduler/core/assignment.go
Lines 109 to 115 in 673a603
// the assignment mode is defaults to Steady to minimizes disruptions and preserves the balance across clusters.
expectAssignmentMode := Steady
// when spec.rescheduleTriggeredAt is updated, it represents a rescheduling is manually triggered by user, and the
// expected behavior of this action is to do a complete recalculation without referring to last scheduling results.
if util.RescheduleRequired(spec.RescheduleTriggeredAt, status.LastScheduledTime) {
	expectAssignmentMode = Fresh
}
Maybe we can also use util.RescheduleRequired(spec.RescheduleTriggeredAt, status.LastScheduledTime) to judge whether the affinityIndex needs to be reset, since a fresh reschedule now resets the index of ClusterAffinities by default, so we could do something like:
if util.RescheduleRequired(crb.Spec.RescheduleTriggeredAt, crb.Status.LastScheduledTime) {
affinityIndex = 0
}
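For illustration only, a minimal sketch of how that check could drive the affinity-index selection without threading an extra performFreshScheduling parameter through the call chain. The helper name startingAffinityIndex and the import paths are assumptions for this sketch, not code from the PR; the util.RescheduleRequired call mirrors the snippet above.

package scheduler

import (
	workv1alpha2 "github.com/karmada-io/karmada/pkg/apis/work/v1alpha2"
	"github.com/karmada-io/karmada/pkg/util"
)

// startingAffinityIndex is a hypothetical helper: it returns the ClusterAffinities
// index to start scheduling from, resetting to 0 when a reschedule was manually
// triggered (for example by a WorkloadRebalancer), so the affinities are evaluated
// from the beginning on a fresh reschedule.
func startingAffinityIndex(crb *workv1alpha2.ClusterResourceBinding, observedIndex int) int {
	if util.RescheduleRequired(crb.Spec.RescheduleTriggeredAt, crb.Status.LastScheduledTime) {
		return 0
	}
	return observedIndex
}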
I did this to keep the status quo functionality. I will change it based on your suggestion.
thanks~
By the way, I'd appreciate it if you could rebase onto the latest master code and squash the commit log into a single record~
Done incorporating these changes. Currently there's only one scheduler worker, which I think will continue to be the case. If there were more, that could cause a race condition where util.RescheduleRequired first evaluates to true and then evaluates to false while deciding the scheduled cluster(s).
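To make that concern concrete, a rough sketch of the two check points that could disagree if another worker refreshed the status in between. This is illustrative only, not the actual scheduler code; the helper name illustrateRescheduleRace is hypothetical, and the package and imports are the same as in the sketch above.

// illustrateRescheduleRace shows the concern: with a single scheduler worker both
// checks observe the same status and always agree; with concurrent workers,
// status.LastScheduledTime could be updated between them, so the second check may
// return false even though the first returned true.
func illustrateRescheduleRace(crb *workv1alpha2.ClusterResourceBinding) (first, second bool) {
	// Checked once when choosing the assignment mode (Fresh vs Steady).
	first = util.RescheduleRequired(crb.Spec.RescheduleTriggeredAt, crb.Status.LastScheduledTime)

	// ...scheduling work happens here; another worker could refresh the status...

	// Checked again when deciding whether to reset the ClusterAffinities index.
	second = util.RescheduleRequired(crb.Spec.RescheduleTriggeredAt, crb.Status.LastScheduledTime)
	return first, second
}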
Codecov Report
Attention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## master #5425 +/- ##
==========================================
- Coverage 48.18% 48.17% -0.02%
==========================================
Files 664 664
Lines 54799 54805 +6
==========================================
- Hits 26405 26402 -3
- Misses 26680 26686 +6
- Partials 1714 1717 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry.
/retest
Hi, there are still two nits:
thanks~
@chaosi-zju I've tried to squash the commits and sort of messed it up. Those squashed commits aren't showing up in the branch. Can we do "squash and merge" while merging this PR?
can you try one more time with the following commands:
git remote add upstream https://github.com/karmada-io/karmada.git
git fetch upstream
git rebase upstream/master
then, if you successfully rebased, you can directly push your commit, but you may also encounter some rebase conflicts; in that case you should manually resolve the conflicts and continue the rebase
Signed-off-by: Bharath Raghavendra Reddy Guvvala <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
@chaosi-zju can you check now?
What type of PR is this?
/kind bug
What this PR does / why we need it:
As discussed in the community meeting on 2024-07-09, this adds a fix to evaluate the cluster affinities afresh when scheduling is triggered via WorkloadRebalancer.
Which issue(s) this PR fixes:
Fixes #5070
Progress
Special notes for your reviewer:
Raising this PR to get early feedback on the implementation as it is still work in progress and has only been tested manually by building and deploying the scheduler. There may be unit test failures, which I will fix, so please don't merge the PR until that's done.
Does this PR introduce a user-facing change?:
Yes. [Rest of the section TBA]