Fixed cluster affinity scheduling evaluation order when scheduling is triggered via WorkloadRebalancer #5425

bharathguvvala · 2024-08-26T06:08:19Z

What type of PR is this?

/kind bug

What this PR does / why we need it:
As discussed in the community meeting on 2024-07-09, this PR adds a fix so that cluster affinities are evaluated afresh when scheduling is triggered via WorkloadRebalancer.

Which issue(s) this PR fixes:
Fixes #5070

Progress

Special notes for your reviewer:
Raising this PR to get early feedback on the implementation, as it is still a work in progress and has only been tested manually by building and deploying the scheduler. There may be unit test failures, which I will fix, so please don't merge the PR until that's done.

Does this PR introduce a user-facing change?:
Yes. [Rest of the section TBA]

karmada-bot · 2024-08-26T06:08:30Z

Welcome @bharathguvvala! It looks like this is your first PR to karmada-io/karmada 🎉

chaosi-zju · 2024-08-26T07:10:07Z

hi @bharathguvvala, really glad to see your PR submission!

By the way, there seem to be some minor problems with the CI right now. Could you please fix the CI and squash the commits into one? Thanks~

bharathguvvala · 2024-08-28T06:16:54Z

@chaosi-zju Sure, will do that.

chaosi-zju · 2024-09-21T10:24:59Z

pkg/controllers/multiclusterservice/endpointslice_dispatch_controller.go

@@ -301,7 +301,7 @@ func (c *EndpointsliceDispatchController) cleanOrphanDispatchedEndpointSlice(ctx
 	return nil
 }

-func (c *EndpointsliceDispatchController) dispatchEndpointSlice(ctx context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {
+func (c *EndpointsliceDispatchController) dispatchEndpointSlice(_ context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {


hi @bharathguvvala, this line may have been changed by mistake; ctx is required~

Suggested change

-func (c *EndpointsliceDispatchController) dispatchEndpointSlice(_ context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {
+func (c *EndpointsliceDispatchController) dispatchEndpointSlice(ctx context.Context, work *workv1alpha1.Work, mcs *networkingv1alpha1.MultiClusterService) error {

bharathguvvala · 2024-09-25

Yes, this looks like a mistake. I will correct it.

chaosi-zju · 2024-09-21T10:44:09Z

pkg/scheduler/scheduler.go

@@ -657,7 +661,7 @@ func (s *Scheduler) scheduleClusterResourceBindingWithClusterAffinity(crb *workv
 	return err
 }

-func (s *Scheduler) scheduleClusterResourceBindingWithClusterAffinities(crb *workv1alpha2.ClusterResourceBinding) error {
+func (s *Scheduler) scheduleClusterResourceBindingWithClusterAffinities(crb *workv1alpha2.ClusterResourceBinding, performFreshScheduling bool) error {


hi, adding a performFreshScheduling parameter does work, but it may not be the best practice~

It just occurred to me that we have similar logic in:

karmada/pkg/scheduler/core/assignment.go

Lines 109 to 115 in 673a603

	// the assignment mode is defaults to Steady to minimizes disruptions and preserves the balance across clusters.
	expectAssignmentMode := Steady
	// when spec.rescheduleTriggeredAt is updated, it represents a rescheduling is manually triggered by user, and the
	// expected behavior of this action is to do a complete recalculation without referring to last scheduling results.
	if util.RescheduleRequired(spec.RescheduleTriggeredAt, status.LastScheduledTime) {
		expectAssignmentMode = Fresh
	}

Maybe we can also use util.RescheduleRequired(spec.RescheduleTriggeredAt, status.LastScheduledTime) to decide whether the affinityIndex needs to be reset. Since a fresh reschedule should by default reset the index of ClusterAffinities, we can do something like:

	if util.RescheduleRequired(crb.Spec.RescheduleTriggeredAt, crb.Status.LastScheduledTime) {
		affinityIndex = 0
	}

bharathguvvala · 2024-09-25

I did this to preserve the existing behavior. I will change it based on your suggestion.

chaosi-zju · 2024-09-26

thanks~
By the way, I'd appreciate it if you could rebase onto the latest master and squash the commits into a single one~

bharathguvvala · 2024-10-21

Done incorporating these changes. Currently there is only one scheduler worker, and I think it will stay that way. If there were more, a race condition could arise: util.RescheduleRequired could evaluate to true at first and then to false later, while the scheduled cluster(s) are being decided.

codecov-commenter · 2024-10-21T06:08:54Z

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.

Project coverage is 48.17%. Comparing base (cf7ac41) to head (762f992).

Files with missing lines	Patch %	Lines
pkg/scheduler/scheduler.go	0.00%	4 Missing and 2 partials ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5425      +/-   ##
==========================================
- Coverage   48.18%   48.17%   -0.02%     
==========================================
  Files         664      664              
  Lines       54799    54805       +6     
==========================================
- Hits        26405    26402       -3     
- Misses      26680    26686       +6     
- Partials     1714     1717       +3

Flag	Coverage Δ
unittests	`48.17% <0.00%> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

chaosi-zju · 2024-11-04T02:54:51Z

/retest

chaosi-zju · 2024-11-04T03:08:55Z

Hi, there are still two nits:

1. The CI of this PR failed because the commit wasn't signed off. Please use git commit -s -m 'your message ' or git commit -m ' Signed-off-by: AuthorName <[email protected]> \n <other message> ' to pass the DCO check (for detailed guidelines, see https://probot.github.io/apps/dco/).
2. This PR has 26 commits. Could you please squash them into one? You can refer to: https://stackoverflow.com/questions/5189560/how-do-i-squash-my-last-n-commits-together

thanks~

karmada-bot · 2024-12-13T07:34:39Z

Adding label do-not-merge/contains-merge-commits because PR contains merge commits, which are not allowed in this repository.
Use git rebase to reapply your commits on top of the target branch. Detailed instructions for doing so can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

bharathguvvala · 2024-12-13T08:32:16Z

@chaosi-zju I've tried to squash the commits and sort of messed it up. Those squashed commits aren't showing up in the branch. Can we do "squash and merge" while merging this PR?

chaosi-zju · 2024-12-13T10:30:20Z

> I've tried to squash the commits and sort of messed it up. Those squashed commits aren't showing up in the branch. Can we do "squash and merge" while merging this PR?

Can you try one more time with the following commands:

git remote add upstream https://github.com/karmada-io/karmada.git
git fetch upstream
git rebase upstream/master

Then, if the rebase succeeds, you can push your commit directly.

You may also run into rebase conflicts. In that case, resolve them manually and run git rebase --continue.

Signed-off-by: Bharath Raghavendra Reddy Guvvala <[email protected]>

karmada-bot · 2025-01-14T16:27:02Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign garrybest for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

bharathguvvala · 2025-01-14T16:27:35Z

@chaosi-zju can you check now?

karmada-bot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. labels Aug 26, 2024

karmada-bot requested review from chaosi-zju, chaunceyjiang, Garrybest, jwcesign, liangyuanpeng, whitewindmills, XiShanYongYe-Chang and zhzhuang-zju August 26, 2024 06:08

karmada-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 26, 2024

chaosi-zju reviewed Sep 21, 2024

karmada-bot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 21, 2024

bharathguvvala force-pushed the rebalancefix branch from 0f4c0df to 292904e Compare December 13, 2024 07:25

karmada-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 13, 2024

bharathguvvala force-pushed the rebalancefix branch from 292904e to 9e9d837 Compare December 13, 2024 07:34

karmada-bot added the do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. label Dec 13, 2024

bharathguvvala force-pushed the rebalancefix branch from 9e9d837 to 30d242e Compare December 13, 2024 07:46

fixed rebalancer logic

a37e17d

Signed-off-by: Bharath Raghavendra Reddy Guvvala <[email protected]>

bharathguvvala force-pushed the rebalancefix branch from 762f992 to a37e17d Compare January 14, 2025 16:25

bharathguvvala force-pushed the rebalancefix branch from d17b81b to a37e17d Compare January 14, 2025 16:26

karmada-bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 14, 2025
