
leader_balancer: implement and use even_node_load_constraint #24403

Merged
merged 5 commits into redpanda-data:dev on Dec 4, 2024

Conversation

@ztlpn ztlpn (Contributor) commented Dec 2, 2024

The leader balancer treats all CPU cores available in the cluster as independent and balances leadership among cores rather than among nodes (i.e. the objective is that each core, not each node, ends up with the same number of leaders). This leads to unintuitive results, especially when the number of raft groups is comparable to the number of available shards.

Add a low-priority node balancing constraint to fix that.

Fixes https://redpandadata.atlassian.net/browse/CORE-8237
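
For illustration only, here is a minimal standalone sketch of what an even-node-load constraint can look like; the class shares its name with the one in this PR, but the types, the error metric, and the method bodies below are simplified assumptions, not the actual redpanda implementation. The idea is to score a candidate leadership transfer by how much it reduces node-level imbalance: a positive value means the transfer improves node balance, a negative value means it regresses it (the same sign convention as the node_load_diff check discussed in the review thread below).

    // Hypothetical, simplified sketch (not the redpanda implementation):
    // score a leadership transfer by the change in node-level imbalance.
    #include <cmath>
    #include <unordered_map>
    #include <utility>

    struct shard_id {
        int node;
        int core;
    };

    // A candidate leadership transfer between two shards.
    struct reassignment {
        shard_id from;
        shard_id to;
    };

    class even_node_load_constraint {
    public:
        explicit even_node_load_constraint(
          std::unordered_map<int, int> leaders_per_node)
          : _leaders_per_node(std::move(leaders_per_node)) {}

        // Node-level error: total absolute deviation from the mean leader count.
        double error() const {
            double mean = 0;
            for (const auto& [node, count] : _leaders_per_node) {
                mean += count;
            }
            mean /= static_cast<double>(_leaders_per_node.size());

            double err = 0;
            for (const auto& [node, count] : _leaders_per_node) {
                err += std::abs(count - mean);
            }
            return err;
        }

        // Positive: the transfer reduces node-level imbalance.
        // Negative: it makes the node-level distribution worse.
        double evaluate(const reassignment& r) const {
            auto counts = _leaders_per_node;
            counts[r.from.node] -= 1;
            counts[r.to.node] += 1;
            even_node_load_constraint after{std::move(counts)};
            return error() - after.error();
        }

    private:
        std::unordered_map<int, int> _leaders_per_node;
    };

Used as a low-priority constraint, a score like this only acts as a tie-breaker: a transfer that already helps the primary per-core objective would still be rejected if it made node-level balance significantly worse.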

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.3.x
  • v24.2.x
  • v24.1.x

Release Notes

Improvements

  • Leader balancer: balance the total number of leaders on each node as well, instead of treating each core as fully independent.

@ztlpn ztlpn force-pushed the leader-balancer-even-node-load branch 2 times, most recently from d555bc2 to 866fcaa on December 3, 2024 23:14
@vbotbuildovich vbotbuildovich (Collaborator) commented Dec 4, 2024

Retry command for Build#59199

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/license_enforcement_test.py::LicenseEnforcementTest.test_license_enforcement@{"clean_node_after_recovery":true,"clean_node_before_recovery":false}
tests/rptest/tests/license_enforcement_test.py::LicenseEnforcementTest.test_escape_hatch_license_variable@{"clean_node_before_upgrade":false}
tests/rptest/tests/license_enforcement_test.py::LicenseEnforcementTest.test_license_enforcement@{"clean_node_after_recovery":false,"clean_node_before_recovery":true}
tests/rptest/tests/license_enforcement_test.py::LicenseEnforcementTest.test_license_enforcement@{"clean_node_after_recovery":false,"clean_node_before_recovery":false}
tests/rptest/tests/license_enforcement_test.py::LicenseEnforcementTest.test_escape_hatch_license_variable@{"clean_node_before_upgrade":true}
tests/rptest/tests/license_enforcement_test.py::LicenseEnforcementTest.test_license_enforcement@{"clean_node_after_recovery":true,"clean_node_before_recovery":true}

@vbotbuildovich vbotbuildovich (Collaborator) commented Dec 4, 2024

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59199#01938f10-c757-49b3-b82e-bc73aab958ca:

"rptest.tests.license_enforcement_test.LicenseEnforcementTest.test_license_enforcement.clean_node_before_recovery=True.clean_node_after_recovery=False"

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59199#01938f16-88ee-4361-8861-e4873a4b5470:

"rptest.tests.license_enforcement_test.LicenseEnforcementTest.test_license_enforcement.clean_node_before_recovery=True.clean_node_after_recovery=False"

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59199#01938f16-88f0-4745-84ab-85b9672d2dea:

"rptest.tests.license_enforcement_test.LicenseEnforcementTest.test_escape_hatch_license_variable.clean_node_before_upgrade=True"

non flaky failures in https://buildkite.com/redpanda/redpanda/builds/59199#01938f10-c758-48ed-bddd-0a3728aa65bd:

"rptest.tests.license_enforcement_test.LicenseEnforcementTest.test_escape_hatch_license_variable.clean_node_before_upgrade=True"

@dotnwat dotnwat (Member) commented Dec 4, 2024

What's an example of the imbalance that occurs when only taking cores into account?

@ztlpn ztlpn (Contributor, Author) commented Dec 4, 2024

@dotnwat It's in the unit test - something like this:

  • 3-node cluster, 10 shards each
  • 20 partitions with rf=3
  • all leaders on nodes 0 and 1

From the shard POV the distribution is balanced - each shard has 0 or 1 leaders. But node-wise distribution is very imbalanced.
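
(Worked out: 20 partitions with rf=3 give 20 leaders spread over 3 × 10 = 30 shards, so any placement in which no shard holds more than one leader already satisfies the per-shard objective, including the one where nodes 0 and 1 hold 10 leaders each and node 2 holds none; a node-level objective would instead aim for roughly 6-7 leaders per node.)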

Review thread on the candidate evaluation loop:

    return reassignment_opt;
    auto node_load_diff = _enlc.evaluate(reassignment);
    if (node_load_diff < -error_jitter) {
        continue;

Contributor:

This check doesn't make a difference, since we continue the loop anyway if the check below it does not hold. Is it here for the future, when we evaluate more constraints of even lower priority below?

ztlpn (Contributor, Author):

Yeah, it is just for symmetry. Although I guess one more constraint and I'm going to put them in a vector :)
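
A rough sketch of the "put them in a vector" idea mentioned above, purely for illustration; the function name, signature, and types are hypothetical and not redpanda's API. Tie-breaker constraints of decreasing priority are walked in order, and a candidate reassignment is dropped as soon as any of them regresses beyond the jitter tolerance:

    // Hypothetical illustration (not the actual redpanda code): each entry
    // returns the improvement a candidate reassignment would bring for one
    // constraint; a negative value means that constraint gets worse.
    #include <functional>
    #include <vector>

    template<typename Reassignment>
    bool acceptable(
      const Reassignment& candidate,
      const std::vector<std::function<double(const Reassignment&)>>& tie_breakers,
      double error_jitter) {
        for (const auto& evaluate : tie_breakers) {
            // Reject the candidate if this constraint regresses beyond the
            // allowed jitter; the caller would then `continue` to the next
            // candidate, mirroring the check in the diff above.
            if (evaluate(candidate) < -error_jitter) {
                return false;
            }
        }
        return true;
    }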

@dotnwat dotnwat (Member) commented Dec 4, 2024

> @dotnwat It's in the unit test - something like this:
>
>   • 3-node cluster, 10 shards each
>   • 20 partitions with rf=3
>   • all leaders on nodes 0 and 1
>
> From the shard POV the distribution is balanced - each shard has 0 or 1 leaders. But node-wise distribution is very imbalanced.

Ahh, I see. Too bad we can't have half a leader. Missed the unit test explanation, thanks!

ztlpn added 3 commits December 4, 2024 17:52

  • Leader balancer treats all CPU cores available in the cluster as independent and tries to balance leadership among them and not among nodes (i.e. the objective is that each core, not each node, has the same number of leaders). This leads to unintuitive results, especially when the number of raft groups is comparable to the number of available shards. Add a low-priority node balancing constraint to fix that.
  • Previously it didn't copy empty shards in the index.
  • Makes it easier to read the stats.
@ztlpn ztlpn force-pushed the leader-balancer-even-node-load branch from 866fcaa to 1305cdd on December 4, 2024 16:52
@ztlpn ztlpn force-pushed the leader-balancer-even-node-load branch from 1305cdd to 049e258 on December 4, 2024 16:54
@ztlpn ztlpn requested a review from bashtanov December 4, 2024 16:55
@ztlpn ztlpn merged commit ef1c094 into redpanda-data:dev Dec 4, 2024
17 checks passed
@vbotbuildovich vbotbuildovich (Collaborator) commented:

/backport v24.3.x
