Inference on EKS - Trouble Deploying Ray Serve Cluster on EKS #446
Comments
@sanjeevrg89 can you please take a look? My guess is Karpenter is not able to provision the inf2 instance.
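If Karpenter is the culprit, the usual cause is that the NodePool/Provisioner does not allow inf2 instance types. As a hedged sketch (not the blueprint's actual config; resource names and the v1beta1 API version are assumptions that may differ in your Karpenter version), a NodePool permitting inf2 instances could look like:

```yaml
# Illustrative Karpenter NodePool (v1beta1 API) allowing Inferentia inf2 nodes.
# Adjust names, limits, and the nodeClassRef to match your cluster.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: inferentia
spec:
  template:
    spec:
      requirements:
        # Restrict provisioning to the inf2 sizes used in this thread
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["inf2.8xlarge", "inf2.24xlarge", "inf2.48xlarge"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        name: default   # assumed EC2NodeClass name
  limits:
    cpu: "1000"
```

Note that inf2 capacity is region-dependent, so even a correct NodePool can fail to provision if the instance type is unavailable or your account's quota is zero.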
@AditModi It's not reproducible on the latest blueprint. Please try again and let us know.
This issue has been automatically marked as stale because it has been open for 30 days with no activity.
I have the same error. I'm following along with the Gen AI examples but I'm unable to run even a single Llama2 model. I'm trying to run the models with inf2.8xlarge (two instances) and inf2.24xlarge (one instance). The kubectl output shows only column headers with no resources listed:

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
NAME DESIRED WORKERS AVAILABLE WORKERS CPUS MEMORY GPUS STATUS AGE
NAME SERVICE STATUS NUM SERVE ENDPOINTS

Observations:
Hi, I'm encountering the same issue on the latest blueprint. I hit the same error with every one of the following models.
P.S.: Is it possible to run those models on smaller inf2.8xlarge instances instead of inf2.48xlarge?
I am encountering an issue while deploying Ray Serve Cluster on an EKS cluster following the documentation provided here.
The EKS cluster is created successfully using Terraform, but the Ray Serve Cluster remains in a pending state. When I check the describe output, the root error is as follows:
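For anyone hitting the same pending state, the root cause usually surfaces in the pod's scheduling events and in Karpenter's logs. A hedged sketch of the diagnostic commands (pod and namespace names in angle brackets are placeholders, and the Karpenter deployment location assumes the default `karpenter` namespace):

```shell
# List Ray pods across namespaces to find the one stuck in Pending
kubectl get pods -A | grep -i ray

# Inspect scheduling events on the pending pod (placeholder names)
kubectl describe pod <ray-head-pod> -n <namespace>

# Check whether Karpenter attempted (and failed) to provision a node
kubectl logs -n karpenter deployment/karpenter | grep -iE "inf2|provision|error"
```

If the events show "0/N nodes are available" and Karpenter logs show no matching NodePool or an instance-type/quota failure, the problem is node provisioning rather than the Ray Serve manifest itself.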