[UX] Launch on existing cluster should be very fast #4157

Open
Michaelvll opened this issue Oct 23, 2024 · 1 comment

Michaelvll (Collaborator) commented Oct 23, 2024

A user reported that sky launch on an existing cluster is very slow. The expected behavior is:

  1. If the cluster does not exist, provision the cluster and run the job.
  2. If the cluster exists, only run the job (like exec) and skip all the time-consuming steps, including SkyPilot runtime setup, waiting for SSH, and user setup.
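
To make the expected behavior concrete, here is a minimal sketch. `plan_launch` and the step names are hypothetical and are not SkyPilot APIs; the set of existing clusters is passed in as an argument rather than queried from SkyPilot.

```python
from typing import List, Set

# Hypothetical step names for illustration, not SkyPilot identifiers.
PROVISION_STEPS = ["provision", "wait_for_ssh", "runtime_setup", "user_setup"]


def plan_launch(cluster_name: str, existing_clusters: Set[str]) -> List[str]:
    """Return the steps a launch would run under the expected behavior."""
    if cluster_name in existing_clusters:
        # Cluster exists: behave like exec and only run the job.
        return ["run_job"]
    # Cluster does not exist: provision and set up, then run the job.
    return PROVISION_STEPS + ["run_job"]
```

With this sketch, a launch on a known cluster plans only the job step, while a launch on an unknown cluster plans the full provisioning path first.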

Two ways to achieve this:

  1. Make sky launch very fast on an existing cluster by caching the cluster's current state and re-running setup only when the runtime is stale.
  2. Add an option to automatically use sky.exec when sky launch is run on an existing cluster.
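
A rough sketch of how the two options could differ, under stated assumptions: `CachedClusterState`, `steps_for_launch`, the `use_exec_on_existing` switch, and the step names are all hypothetical, and staleness is modeled as a simple runtime version mismatch. It does not model deciding when user setup needs to be re-run.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CachedClusterState:
    """Hypothetical cached record of what is installed on a cluster."""
    runtime_version: str


def steps_for_launch(
    cluster_name: str,
    cache: Dict[str, CachedClusterState],
    current_runtime_version: str,
    use_exec_on_existing: bool = False,
) -> List[str]:
    """Return the steps to run for a launch on `cluster_name`."""
    state = cache.get(cluster_name)
    if state is None:
        # No cached state: treat the cluster as new and do everything.
        return ["provision", "wait_for_ssh", "runtime_setup",
                "user_setup", "run_job"]
    if use_exec_on_existing:
        # Option 2: behave like exec on an existing cluster.
        return ["run_job"]
    # Option 1: re-run runtime setup only when the cached runtime is stale.
    steps: List[str] = []
    if state.runtime_version != current_runtime_version:
        steps.append("runtime_setup")
    steps.append("run_job")
    return steps
```

In this sketch, option 1 still repairs a stale runtime before running the job, whereas option 2 skips every check and relies on the cluster already being usable.
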

Michaelvll added the P0 label Oct 23, 2024
Michaelvll changed the title from "[UX] Launch should skip the provisioner path when calling again" to "[UX] Launch on existing cluster should be very fast" Oct 23, 2024

cg505 (Collaborator) commented Nov 15, 2024

This is mostly solved by sky launch --fast.
It is not turned on by default because it is very hard to tell when setup should be re-run.
We could probably turn on the provisioning short-circuit from #4289 by default.
