Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Add an option that allows users to choose not to generate the root queue capacity based on the actual cluster resources. #3947

Open
bogo-y opened this issue Dec 31, 2024 · 9 comments
Labels
area/scheduling kind/feature Categorizes issue or PR as related to a new feature.

Comments

@bogo-y
Copy link
Contributor

bogo-y commented Dec 31, 2024

What is the problem you're trying to solve

The scheduler determines the root queue's real-capability by summing the available resources of all nodes, which may causes vc-scheduler panic here or beging failed here. It is difficult to ensure the real-capability of a cluster which enabled cluster-autoscaler always greater than total deserved and guarantee manually, and now there is not an effective method to manage queues that declared excessive resources automatically.

Describe the solution you'd like

I think we can change the starting point from root to root's children when we need calculating real-capability, and just let the children's real-capability be there capacity. That can lead to the decoupling of the queues' quota from physical resources, so that it will change the result from crash to pod pending when the physical resources less that total guarantee or total deserved.
Of cause we should make it configurable in order to provide users with two modes.

Additional context

related to #3929

@bogo-y bogo-y added the kind/feature Categorizes issue or PR as related to a new feature. label Dec 31, 2024
@bogo-y
Copy link
Contributor Author

bogo-y commented Dec 31, 2024

https://github.com/volcano-sh/volcano/blob/master/pkg/scheduler/plugins/capacity/capacity.go#L485
I wrote demo code to briefly describe the core changes we will make:
image
I am glad to apply a pull request after we reaching an agreement.

@bogo-y
Copy link
Contributor Author

bogo-y commented Dec 31, 2024

/area scheduling
/PTAL @Monokaix @JesseStutler @lowang-bh

@kingeasternsun
Copy link
Contributor

We could not assume that the parent's realCapacity is greater than the total guarantee of children either, so should we avoid panic here by checking before? If check failed, return error?
As for the root.realCapability, we can fix it like this attr.realCapability = helpers.Max(cp.totalResource, totalGuarantee)?

@bogo-y
Copy link
Contributor Author

bogo-y commented Jan 2, 2025

We could not assume that the parent's realCapacity is greater than the total guarantee of children either, so should we avoid panic here by checking before? If check failed, return error? As for the root.realCapability, we can fix it like this attr.realCapability = helpers.Max(cp.totalResource, totalGuarantee)?

Sounds good ideas.
I am not sure that if you think we need a new mode to make the decoupling of the queues' quota from physical resources? Or just modify the existing features?

@JesseStutler
Copy link
Member

So the key to the problem you want to express is that when combined with the cluster autoscaler, we cannot panic at that place. I agree with @kingeasternsun 's solution

@bogo-y
Copy link
Contributor Author

bogo-y commented Jan 2, 2025

So the key to the problem you want to express is that when combined with the cluster autoscaler, we cannot panic at that place. I agree with @kingeasternsun 's solution

I think we may let the root queue's real-capability be the max of total-resource and total-deserved.
Do we need an argument to switch the logic? What's your opinion about it?

@Monokaix
Copy link
Member

Monokaix commented Jan 3, 2025

From the user's perspective, they should not be aware too much of the code detail, so it's no necessary to add another param or something to add the burden to them, we should provide a best practice and make sure the code is robust enough to cover as many as possible unexpected cases, and also give trouble shooting methods to make users use queue features more easily, and I think we should add some notices to our queue related docs. @JesseStutler
This #3106 has made changes to avoid panic, we can also do some check here.

@Monokaix
Copy link
Member

Monokaix commented Jan 3, 2025

So the key to the problem you want to express is that when combined with the cluster autoscaler, we cannot panic at that place. I agree with @kingeasternsun 's solution

I think we may let the root queue's real-capability be the max of total-resource and total-deserved. Do we need an argument to switch the logic? What's your opinion about it?

If so, we will get a wrong real-capability if total-deserved > total-resource, the real-capability should reflect the real resource view.

@JesseStutler
Copy link
Member

I think as mentioned in #3929, just don’t panic when checking realCapability and Guarantee is enough. We shouldn’t delve into overly complex edge cases. I even thought that with autoscaler, we should consider scaling up the cluster resource rather than preempting/reclaiming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/scheduling kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

5 participants