fixed rebalancer logic
Signed-off-by: Bharath Raghavendra Reddy Guvvala <[email protected]>
karmada-bot authored and bharathguvvala committed Jan 14, 2025
2 parents 43f2953 + b743a3b commit a37e17d
Showing 2 changed files with 88 additions and 0 deletions.
@@ -0,0 +1,82 @@
---
title: Support Priority Class Configuration for Karmada Control Plane Components
authors:
- "@jabellard"
reviewers:
- "@RainbowMango"
approvers:
- "@RainbowMango"

creation-date: 2025-01-01

---

# Support Priority Class Configuration for Karmada Control Plane Components

## Summary

This proposal aims to extend the Karmada operator by introducing support for configuring the priority class name for Karmada control plane components.
By enabling users to configure a custom priority class name, this feature ensures critical components are scheduled with appropriate priority, enhancing overall system reliability and stability.

## Motivation

Currently, some Karmada components hardcode the priority class name to `system-node-critical`, while others do not specify a priority class at all. Users therefore cannot adjust scheduling priority, which can compromise the reliability and stability of the system in environments where scheduling of critical components is essential.

By allowing users to configure the priority class name, this feature ensures:

- Greater control over scheduling of critical Karmada control plane components, enhancing system reliability and stability.
- Alignment with organizational policies for resource prioritization and workload management.
- Flexibility to adapt priority classes for specific operational environments and use cases.

### Goals

- Provide a mechanism for configuring the scheduling priority of all in-cluster Karmada control plane components.
- Ensure the feature integrates seamlessly with existing deployments while maintaining backward compatibility.

### Non-Goals

- Address scheduling priorities for components outside the Karmada control plane.

## Proposal

Introduce a new optional `priorityClassName` field in the `CommonSettings` struct, which is used across all Karmada components.

### API Changes

```go
// CommonSettings describes the common settings of all Karmada Components.
type CommonSettings struct {
	// PriorityClassName specifies the priority class name for the component.
	// If not specified, it defaults to "system-node-critical".
	// +kubebuilder:default="system-node-critical"
	// +optional
	PriorityClassName string `json:"priorityClassName,omitempty"`

	// Other, existing fields omitted for brevity...
}
```

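To illustrate how the optional field would serialize, here is a minimal sketch. It assumes a trimmed-down copy of `CommonSettings` containing only the new field (the real struct has more), and `marshalSettings` is a hypothetical helper, not part of the operator. With `omitempty`, an unset value is left out of the serialized object, which should leave room for the declared default to apply.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CommonSettings is a trimmed-down stand-in for the operator's struct;
// only the proposed field is reproduced here.
type CommonSettings struct {
	PriorityClassName string `json:"priorityClassName,omitempty"`
}

// marshalSettings is a hypothetical helper that renders the settings as JSON.
// It returns an empty string if marshaling fails.
func marshalSettings(s CommonSettings) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Field left unset: omitempty drops it from the output.
	fmt.Println(marshalSettings(CommonSettings{}))
	// Field set explicitly: it appears under the priorityClassName key.
	fmt.Println(marshalSettings(CommonSettings{PriorityClassName: "system-cluster-critical"}))
}
```

The second call uses `system-cluster-critical` only as an example value; any priority class that exists in the host cluster could be supplied.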
### User Stories

#### Story 1
As an infrastructure engineer, I need to configure the priority class for Karmada control plane components so that critical components are reliably scheduled and the system remains stable.

#### Story 2
As an infrastructure engineer managing a multi-tenant cluster, I want the ability to override the default priority class for Karmada control plane components with a custom priority class that aligns with my organization’s policies, ensuring reliable resource allocation and system stability across workloads.

### Risks and Mitigations

1. *Backward Compatibility*: Existing deployments might rely on the current hardcoded `system-node-critical` priority class for some components.
   - *Mitigation*: The `priorityClassName` field defaults to `system-node-critical` when not explicitly specified, preserving the current behavior.

## Design Details

During the reconciliation process, the Karmada operator will:

- Check if `priorityClassName` is specified in the component’s `CommonSettings`.
- If specified:
  - Apply the specified priority class to the component’s Pod spec.
- If not specified:
  - Default to `system-node-critical` to maintain backward compatibility.
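The reconciliation steps above can be sketched as follows. This is only an illustration under stated assumptions: `CommonSettings` and `PodSpec` are simplified local stand-ins rather than the real operator and Kubernetes types, and `applyPriorityClass` is a hypothetical function name.

```go
package main

import "fmt"

// defaultPriorityClassName mirrors the default named in this proposal.
const defaultPriorityClassName = "system-node-critical"

// CommonSettings is a simplified stand-in holding only the proposed field.
type CommonSettings struct {
	PriorityClassName string
}

// PodSpec is a simplified stand-in for the Pod spec of a component.
type PodSpec struct {
	PriorityClassName string
}

// applyPriorityClass (hypothetical) copies the configured priority class
// into the Pod spec, falling back to the default when none is configured.
func applyPriorityClass(settings *CommonSettings, spec *PodSpec) {
	name := defaultPriorityClassName
	if settings != nil && settings.PriorityClassName != "" {
		name = settings.PriorityClassName
	}
	spec.PriorityClassName = name
}

func main() {
	var spec PodSpec

	// Not specified: the default is applied.
	applyPriorityClass(&CommonSettings{}, &spec)
	fmt.Println(spec.PriorityClassName)

	// Specified: the user's value is applied ("karmada-critical" is an example name).
	applyPriorityClass(&CommonSettings{PriorityClassName: "karmada-critical"}, &spec)
	fmt.Println(spec.PriorityClassName)
}
```

Treating a nil settings pointer the same as an empty field keeps the fallback in one place, so components that never populated `CommonSettings` would still receive the default.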
6 changes: 6 additions & 0 deletions pkg/scheduler/scheduler.go
@@ -546,6 +546,9 @@ func (s *Scheduler) scheduleResourceBindingWithClusterAffinities(rb *workv1alpha
 	)

 	affinityIndex := getAffinityIndex(rb.Spec.Placement.ClusterAffinities, rb.Status.SchedulerObservedAffinityName)
+	if util.RescheduleRequired(rb.Spec.RescheduleTriggeredAt, rb.Status.LastScheduledTime) {
+		affinityIndex = 0
+	}
 	updatedStatus := rb.Status.DeepCopy()
 	for affinityIndex < len(rb.Spec.Placement.ClusterAffinities) {
 		klog.V(4).Infof("Schedule ResourceBinding(%s/%s) with clusterAffiliates index(%d)", rb.Namespace, rb.Name, affinityIndex)
@@ -684,6 +687,9 @@ func (s *Scheduler) scheduleClusterResourceBindingWithClusterAffinities(crb *wor
 	)

 	affinityIndex := getAffinityIndex(crb.Spec.Placement.ClusterAffinities, crb.Status.SchedulerObservedAffinityName)
+	if util.RescheduleRequired(crb.Spec.RescheduleTriggeredAt, crb.Status.LastScheduledTime) {
+		affinityIndex = 0
+	}
 	updatedStatus := crb.Status.DeepCopy()
 	for affinityIndex < len(crb.Spec.Placement.ClusterAffinities) {
 		klog.V(4).Infof("Schedule ClusterResourceBinding(%s) with clusterAffiliates index(%d)", crb.Name, affinityIndex)
