Make existing node into a cluster #2542
Comments
Hi @wisechat-eng - we currently do not support using SkyPilot to manage externally created VMs in GCP. Can you share a bit more about your use case and why you want to manage existing nodes instead of creating new ones with SkyPilot? |
Hi, we have a lot of code and environment already set up on the existing node. I want to migrate to having SkyPilot manage the node as a cluster. Instead of creating a new one, I'm wondering if I can just reuse the old one and have SkyPilot connect to it as a cluster. |
Hey @wisechat-eng, one quick way to get around this is to clone your original node with the following steps:
sky launch --cloud gcp -c <cluster-name> \
--image-id projects/<your-project-id>/global/machineImages/<your-machine-image-name> \
--instance-type <instance-type-of-your-original-instance>
With the command above, you will clone your previous node into a new cluster managed by SkyPilot, with all the code and environment preserved. You can then safely delete your old node. |
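As a rough sketch of the command above, the helper below assembles the `--image-id` path and prints the resulting `sky launch` invocation without running it. The function names and the example values (`my-proj`, `my-img`, `n1-standard-8`) are hypothetical, and the machine image is assumed to exist already; creating one from the original instance (for example with `gcloud compute machine-images create`) is an assumption that is not covered in this thread.

```shell
# Hypothetical helper: build the machine-image path in the form
# passed to --image-id in the command above.
machine_image_path() {
  local project_id="$1" image_name="$2"
  printf 'projects/%s/global/machineImages/%s\n' "$project_id" "$image_name"
}

# Hypothetical helper: print (do not execute) the sky launch command,
# so it can be reviewed before anything is provisioned.
print_launch_command() {
  local cluster="$1" project_id="$2" image_name="$3" instance_type="$4"
  printf 'sky launch --cloud gcp -c %s --image-id %s --instance-type %s\n' \
    "$cluster" \
    "$(machine_image_path "$project_id" "$image_name")" \
    "$instance_type"
}

# Example with placeholder values:
print_launch_command my-cluster my-proj my-img n1-standard-8
```

Printing the command first is just a dry-run convenience; once the values look right, the printed line can be pasted into a terminal.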
Makes sense, thanks Michael. Another use case is that my colleague and I may want to use SkyPilot to control the same node. But if the cluster is created from my side, he will not see it as a cluster. |
We don't officially support sharing a SkyPilot cluster across users, but a workaround will be that you share your Hopefully, both of you will be able to see the same cluster in your |
@wisechat-eng To understand more, what's the reason that you desire to share a cluster? Is it due to cost/quota reasons (where launching a new node is not ideal), or is it for collaboration/debugging? |
@concretevitamin I'm not wisechat, but honestly both use cases are worth considering. Another use case could be having a Databricks cluster with already provisioned GPUs and wanting to use it with SkyPilot's cluster capabilities instead of creating standalone VMs to run tasks. SkyPilot would run as an agent on the target runtime and be usable from the local command line this way. |
@concretevitamin I'm working with a couple of startups that use bare metal machines, and they would like to be able to use them through the SkyPilot interface. Right now they either launch using Slurm or log into each node separately, but Slurm is quite bloated for this use case and is annoying to maintain, so something that only requires existing SSH access to launch the SkyPilot runtime would be really attractive, I think. |
@asaiacai - that's interesting. If the bare metal machines are self-owned/long-term rentals, have you considered deploying Kubernetes on them and then using SkyPilot + Kubernetes support? Kubernetes' extensive support for devops and observability tooling keeps the ops teams happy, while SkyPilot can support ML engineers who do not need to deal with Kubernetes APIs. |
This comment was marked as outdated.
Re-open this issue. Related to #3926 |
I have started a few nodes in GCP, and I want to turn them into a cluster so that I can control them with SkyPilot. However, I'm not sure what the best way to configure the existing nodes is.