[SERVE] Allow adjustment of scaling policies without redeployment #4442

Open
JGSweets opened this issue Dec 5, 2024 · 1 comment

Comments

@JGSweets
Contributor

JGSweets commented Dec 5, 2024

Currently, when the replica_policy is altered, update runs a pseudo blue-green deployment, in the sense that it launches all new resources.

Preferably, if only the replica_policy is changing, update would alter the policy in place and launch or tear down instances only where the new policy requires it.


Example 1:
Init: Currently, 2 resources are running, but min_replicas is set to 3.
Result: Only 1 instance is launched.

Example 2:
Init: Currently, 2 resources are running, but min_replicas is set to 1 and qps would not be met if scaled down.
Result: 1 instance is torn down.
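
To make the two examples concrete, here is a rough sketch of the in-place reconciliation I have in mind. It is not SkyPilot code: `replica_delta` and its parameters are hypothetical names, and it assumes the desired count is the QPS-driven demand clamped between min_replicas and max_replicas. It also reads Example 2 as the case where current traffic does not need the second replica.

```python
import math


def replica_delta(running, min_replicas, max_replicas=None,
                  current_qps=None, target_qps_per_replica=None):
    """Return how many replicas to launch (>0) or tear down (<0).

    Hypothetical helper: only the difference between the running count
    and the count the new policy asks for is acted on.
    """
    desired = min_replicas
    # If traffic information is available, demand may raise the floor.
    if current_qps is not None and target_qps_per_replica:
        demand = math.ceil(current_qps / target_qps_per_replica)
        desired = max(desired, demand)
    # Never exceed the policy's ceiling, if one is set.
    if max_replicas is not None:
        desired = min(desired, max_replicas)
    return desired - running


# Example 1: 2 running, min_replicas raised to 3 -> launch 1.
print(replica_delta(running=2, min_replicas=3))
# Example 2: 2 running, min_replicas lowered to 1, low traffic -> tear down 1.
print(replica_delta(running=2, min_replicas=1, max_replicas=2,
                    current_qps=4, target_qps_per_replica=5))
```

With higher traffic in the second call (say `current_qps=8`), the delta would be 0 and nothing would be torn down.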


Solution Options:

  1. Have update use a hash to check that only the replica_info has changed.
  2. Use a flag with update that allows altering just the replica policy.
  3. Have a separate update endpoint which allows updating the replica policy.
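
Option 1 could look roughly like the sketch below. This is only an illustration, not how update works today: the function names are hypothetical, the task config is assumed to be a plain dict loaded from the service YAML, and it assumes the part allowed to differ is the `replica_policy` section under `service`.

```python
import copy
import hashlib
import json


def spec_fingerprint(task_config):
    """Hash the task config with the replica_policy section removed."""
    stripped = copy.deepcopy(task_config)
    # Drop the one section that is allowed to differ between versions.
    stripped.get("service", {}).pop("replica_policy", None)
    # sort_keys makes the hash independent of key order.
    payload = json.dumps(stripped, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def only_replica_policy_changed(old_config, new_config):
    """True if the configs differ, and differ only in replica_policy."""
    same_rest = spec_fingerprint(old_config) == spec_fingerprint(new_config)
    old_policy = old_config.get("service", {}).get("replica_policy")
    new_policy = new_config.get("service", {}).get("replica_policy")
    return same_rest and old_policy != new_policy


old = {"run": "python serve.py",
       "service": {"readiness_probe": "/",
                   "replica_policy": {"min_replicas": 1}}}
new = copy.deepcopy(old)
new["service"]["replica_policy"]["min_replicas"] = 3
print(only_replica_policy_changed(old, new))  # policy-only change
```

If the check passes, update could skip the blue-green rollout and just hand the new policy to the autoscaler. Otherwise it would fall back to the current behavior.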

Version & Commit info:
skypilot, version 0.7.0
skypilot, commit 3f62588

@Michaelvll
Collaborator

cc'ing @cblmemo
