Issues: skypilot-org/skypilot
[Docs] Update docs with latest distributed training examples
#4468
opened Dec 12, 2024 by
Michaelvll
[k8s] Fail to ssh into the head node on k8s
help wanted
Extra attention is needed
k8s
Kubernetes related items
#4461
opened Dec 11, 2024 by
alex000kim
[Core] Launching on a just launched existing cluster with `--fast` does not skip the provision
#4460
opened Dec 11, 2024 by
Michaelvll
[UX] `gh repo clone` fail to work after gh auth login on cluster
#4459
opened Dec 11, 2024 by
Michaelvll
[Dev] Automatically source the sky environment for dev mode
good first issue
Good for newcomers
#4453
opened Dec 10, 2024 by
cblmemo
[UX] Additional message from OCI even though not enabled
good first issue
Good for newcomers
#4450
opened Dec 9, 2024 by
Michaelvll
[k8s] `sky check k8s` fails when current context is not available, even if that context is not in allowed_contexts
help wanted
Extra attention is needed
k8s
Kubernetes related items
#4449
opened Dec 9, 2024 by
cg505
Latest skypilot image does not support azure accelerated networking and nccl
P0
#4448
opened Dec 8, 2024 by
visatish
[SERVE][AUTOSCALERS] Replica scaling sampling period and stability.
#4444
opened Dec 5, 2024 by
JGSweets
[SERVE] Allow adjustment of scaling policies without redeployment
#4442
opened Dec 5, 2024 by
JGSweets
Azure image-id from marketplace with :latest fails
good first issue
Good for newcomers
#4435
opened Dec 3, 2024 by
cg505
[DeepSpeed Example] Fail on AWS T4 due to package import issue
good first issue
Good for newcomers
#4434
opened Dec 3, 2024 by
yika-luo
[Core] Failure in pytorch distributed training code failed to get a job into FAILED state
triage
#4421
opened Nov 27, 2024 by
Michaelvll
[Storage] Support disable exclude .gitignore
good first issue
Good for newcomers
#4416
opened Nov 26, 2024 by
cblmemo
Pylint comments appear in the API documentation (e.g., `# pylint: disable=line-too-long`)
#4405
opened Nov 24, 2024 by
andylizf
[k8s] L40 GPUs get detected as L4s
help wanted
Extra attention is needed
k8s
Kubernetes related items
#4404
opened Nov 24, 2024 by
romilbhardwaj
[k8s] Support exec based auth kubeconfigs on controllers
help wanted
Extra attention is needed
k8s
Kubernetes related items
#4379
opened Nov 17, 2024 by
romilbhardwaj
[k8s] Default image cannot install conda package in base env
k8s
Kubernetes related items
#4374
opened Nov 15, 2024 by
Michaelvll