KEP-4943 #4947

bernot-dev · 2024-11-04T18:15:40Z

One-line PR description: Adding new KEP

Issue link: Vertical Pod Autoscaling for Workloads with Heterogeneous Resource Requirements #4943

Other comments:

k8s-ci-robot · 2024-11-04T18:15:48Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bernot-dev
Once this PR has been reviewed and has the lgtm label, please assign maciekpytel for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

keps/sig-autoscaling/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2024-11-04T18:15:50Z

Hi @bernot-dev. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

adrianmoisey · 2024-11-05T17:21:44Z

I know this is very much still in draft, but I thought I'd ask now. Does this work depend on in-place pod resizing?

bernot-dev · 2024-11-06T15:52:51Z

@adrianmoisey This does not depend on in-place pod resizing any more than existing VPA. However, it will benefit from it when that feature lands.

raywainman · 2024-11-07T19:38:09Z

This is super cool! I think this could definitely tackle a gap that has existed in vertically scaling daemon sets.

Some questions from me:

What is your high-level thinking on how to configure this?
Would we keep a histogram of usage per replica in this case?
What do we do about new pods being added to the daemonset?

(Feel free to also tell me to wait until this is captured in the KEP :), I realize this is still a draft proposal!)

bernot-dev · 2024-11-08T14:25:28Z

@raywainman Thanks for the questions! These are closely aligned with the first questions I am looking to tackle as I start filling in the details. I am going to be looking for ways to leverage existing code to whatever extent makes sense. I will have to make myself a bit more familiar with existing code before I can find those opportunities.

I'm hoping to sidestep the question of, "What do we do about new pods being added to the daemonset?"

Retaining existing behavior seems to be the best way forward for now. I can imagine reasonable arguments for a variety of behaviors in different situations. Perhaps it could become configurable in the future, but I would rather keep that out of scope for now.

adrianmoisey · 2024-11-10T14:20:16Z

> @adrianmoisey This does not depend on in-place pod resizing any more than existing VPA. However, it will benefit from it when that feature lands.

Good to know. I may be jumping the gun here, so apologies for that; I know you are still working on the KEP.

I asked about in-place resizing because I had assumed which solution you would use, so let me take a step back.

At the moment the VPA admission-controller receives a Pod before it's scheduled to a node (to my understanding, please correct me if I am wrong), meaning that it won't know which node a specific Pod will land on.
Do you have any ideas on how to get around this limitation for this KEP?

EDIT: I'm entirely wrong! Turns out that DaemonSets use node affinity to tell the scheduler where to schedule a Pod. I didn't know that!
See https://github.com/kubernetes/kubernetes/blob/v1.31.2/pkg/controller/daemon/daemon_controller.go#L1014-L1019
and
https://github.com/kubernetes/kubernetes/blob/v1.31.2/pkg/controller/daemon/util/daemonset_util.go#L173-L221

So I guess that's the answer: the node can be fetched from there.
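
A minimal sketch of that lookup, assuming the admission controller sees the Pod as plain JSON and that the DaemonSet controller has set a required node affinity term matching the `metadata.name` field, as the linked v1.31.2 code does. The function name and the sample pods are hypothetical; this is not VPA code.

```python
from typing import Optional


def target_node_from_affinity(pod: dict) -> Optional[str]:
    """Return the node name pinned by a DaemonSet-style node affinity, or None.

    Looks for a required node affinity term whose matchFields entry is
    key == "metadata.name", operator == "In", with exactly one value.
    """
    spec = pod.get("spec") or {}
    affinity = spec.get("affinity") or {}
    node_affinity = affinity.get("nodeAffinity") or {}
    required = (
        node_affinity.get("requiredDuringSchedulingIgnoredDuringExecution") or {}
    )
    for term in required.get("nodeSelectorTerms") or []:
        for req in term.get("matchFields") or []:
            values = req.get("values") or []
            if (
                req.get("key") == "metadata.name"
                and req.get("operator") == "In"
                and len(values) == 1
            ):
                return values[0]
    return None


# Hypothetical pod as created for a DaemonSet: pinned to "node-1" via affinity.
daemonset_pod = {
    "metadata": {"name": "node-agent-abcde"},
    "spec": {
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchFields": [
                                {
                                    "key": "metadata.name",
                                    "operator": "In",
                                    "values": ["node-1"],
                                }
                            ]
                        }
                    ]
                }
            }
        }
    },
}

# Hypothetical pod with no node affinity at all.
unpinned_pod = {"metadata": {"name": "web-xyz"}, "spec": {}}

print(target_node_from_affinity(daemonset_pod))
print(target_node_from_affinity(unpinned_pod))
```

A pod without such a term (for example, one created by a Deployment) yields `None`, since nothing pins it to a node before scheduling.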

SECOND EDIT: What about other workloads, besides DaemonSets?

omerap12 · 2024-11-24T07:11:37Z

We're only talking about DaemonSets, right? Because if we're dealing with Deployments, it could cause some issues. For example, when you change the resources of a pod, the new pod might be scheduled on a different node or even in a different zone, depending on how the Deployment is configured. This could impact the load distribution across pods. If Deployments are also in scope, we might need to consider these factors.

k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) Nov 4, 2024

k8s-ci-robot requested review from gjtempleton and MaciekPytel November 4, 2024 18:15

k8s-ci-robot added the kind/kep label (categorizes KEP tracking issues and PRs modifying the KEP directory) Nov 4, 2024

k8s-ci-robot added the sig/autoscaling label (categorizes an issue or PR as relevant to SIG Autoscaling) and the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) Nov 4, 2024

k8s-ci-robot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) Nov 4, 2024

bernot-dev marked this pull request as draft November 4, 2024 18:16

k8s-ci-robot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Nov 4, 2024

KEP-4943

5f9c2c9

bernot-dev force-pushed the vpa branch from 3c7650e to 5f9c2c9 November 4, 2024 19:32
