[Docs] Update k8s docs (#4352)
* Update docs

* Update docs
romilbhardwaj authored Nov 14, 2024
1 parent fa5c1ba commit 1bcc08e
Showing 2 changed files with 18 additions and 16 deletions.
14 changes: 0 additions & 14 deletions docs/source/reference/kubernetes/index.rst
@@ -103,17 +103,3 @@ Table of Contents
Getting Started <kubernetes-getting-started>
kubernetes-setup
kubernetes-troubleshooting


Features and Roadmap
--------------------

Kubernetes support is under active development. Some features are in progress and will be released soon:

* CPU and GPU Tasks - ✅ Available
* Auto-down - ✅ Available
* Storage mounting - ✅ Available on x86_64 clusters
* Multi-node tasks - ✅ Available
* Custom images - ✅ Available
* Opening ports and exposing services - ✅ Available
* Multiple Kubernetes Clusters - 🚧 In progress
20 changes: 18 additions & 2 deletions docs/source/reference/kubernetes/kubernetes-getting-started.rst
@@ -261,9 +261,25 @@ After launching the cluster with :code:`sky launch -c myclus task.yaml`, you can
FAQs
----

* **Can I use multiple Kubernetes clusters with SkyPilot?**

SkyPilot can work with multiple Kubernetes contexts set in your kubeconfig file. By default, SkyPilot will use the current active context. To use a different context, change your current context using :code:`kubectl config use-context <context-name>`.

If you would like to use multiple contexts seamlessly during failover, check out the :code:`allowed_contexts` feature in :ref:`config-yaml`.
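
A minimal sketch of what such a configuration might look like, assuming your kubeconfig contains two contexts named :code:`my-context-1` and :code:`my-context-2` (both names are placeholders, not taken from this commit). Treat :ref:`config-yaml` as the authoritative schema:

.. code-block:: yaml

    # ~/.sky/config.yaml
    kubernetes:
      allowed_contexts:   # contexts SkyPilot may consider, e.g. during failover
        - my-context-1    # placeholder context name
        - my-context-2    # placeholder context name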

* **Are autoscaling Kubernetes clusters supported?**

To run on an autoscaling cluster, you may need to adjust the resource provisioning timeout (:code:`Kubernetes.TIMEOUT` in `clouds/kubernetes.py`) to a large value to give enough time for the cluster to autoscale. We are working on a better interface to adjust this timeout - stay tuned!
To run on autoscaling clusters, set the :code:`provision_timeout` key in :code:`~/.sky/config.yaml` to a large value to give enough time for the cluster autoscaler to provision new nodes.
This will direct SkyPilot to wait for the cluster to scale up before failing over to the next candidate resource (e.g., next cloud).

If you are using GPUs in a scale-to-zero setting, you should also set the :code:`autoscaler` key to the autoscaler type of your cluster. More details in :ref:`config-yaml`.

.. code-block:: yaml

    # ~/.sky/config.yaml
    kubernetes:
      provision_timeout: 900  # Wait 15 minutes for nodes to get provisioned before failover. Set to -1 to wait indefinitely.
      autoscaler: gke  # [gke, karpenter, generic]; required if using GPUs in scale-to-zero setting

* **Can SkyPilot provision a Kubernetes cluster for me? Will SkyPilot add more nodes to my Kubernetes clusters?**

@@ -280,7 +296,7 @@ FAQs
* **How can I specify custom configuration for the pods created by SkyPilot?**

You can override the pod configuration used by SkyPilot by setting the :code:`pod_config` key in :code:`~/.sky/config.yaml`.
The value of :code:`pod_config` should be a dictionary that follows the `Kubernetes Pod API <https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#pod-v1-core>`_.
The value of :code:`pod_config` should be a dictionary that follows the `Kubernetes Pod API <https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/#pod-v1-core>`_.

For example, to set custom environment variables and attach a volume on your pods, you can add the following to your :code:`~/.sky/config.yaml` file:
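
As an illustrative sketch only, a :code:`pod_config` along the following lines would set one environment variable and mount a host directory. The field names follow the Kubernetes Pod API; the variable name, volume name, and paths are placeholders. It is an assumption here that entries under :code:`containers` are merged into the container SkyPilot creates, so check :ref:`config-yaml` before relying on it:

.. code-block:: yaml

    # ~/.sky/config.yaml
    kubernetes:
      pod_config:
        spec:
          containers:
            - env:
                - name: MY_ENV_VAR         # placeholder variable name
                  value: MY_ENV_VALUE      # placeholder value
              volumeMounts:
                - mountPath: /foo          # placeholder path inside the pod
                  name: example-volume
          volumes:
            - name: example-volume         # must match the volumeMounts name above
              hostPath:
                path: /tmp                 # placeholder path on the node
                type: Directory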
