Releases: substratusai/kubeai
kubeai 0.11.0
What's Changed
- improve caching docs by @samos123 in #295
- Update kubernetes api reference by @samos123 in #290
- Deep Chat integration by @nstogner in #294
- Add gh200 support and model by @happytreees in #300
- update README by @samos123 in #296
- Update README.md by @samos123 in #305
- add llama 3.1 70b fp8 model on 1 x gh200 by @samos123 in #302
- Llama 3.1 70b with pipeline parallelism by @samos123 in #307
- add k8s device plugin / GPU operator values file by @samos123 in #308
- Add Lambda's tutorial and video to the README's table of adopters by @cbrownstein-lambda in #309
- update vllm image for GPU and TPU to v0.6.4.post1 by @samos123 in #310
- add a generic K8s install guide by @samos123 in #312
- LoRA Adapters for vLLM & support for s3, gs, oss for pulling adapters and models (to cache) from buckets by @nstogner in #304
- Add Configure Text Generation Models guide by @samos123 in #313
New Contributors
- @happytreees made their first contribution in #300
- @cbrownstein-lambda made their first contribution in #309
Full Changelog: v0.10.0...v0.11.0
helm-chart-models-0.9.0
A Helm chart for Kubernetes
helm-chart-kubeai-0.9.0
Private Open AI Platform for Kubernetes.
kubeai 0.10.0
What's Changed
- Add build workflow timeout to address stuck workflows by @Sudhamsh in #281
- Add support for HTTP X-Label-Selector headers to support Multitenancy by @nstogner in #282
- add kubeai metrics service endpoint by @kaiehrhardt in #284
- increase caching e2e test timeout by @samos123 in #288
- Add EKS Installation Guide by @samos123 in #287
- add caching models with EFS guide by @samos123 in #289
New Contributors
- @Sudhamsh made their first contribution in #281
- @kaiehrhardt made their first contribution in #284
Full Changelog: v0.9.0...v0.10.0
helm-chart-models-0.8.0
A Helm chart for Kubernetes
helm-chart-kubeai-0.8.0
Private Open AI Platform for Kubernetes.
kubeai 0.9.0
Highlights
- Autoscaling now works for any engine including Ollama and FasterWhisper
- Add ability to cache models using shared filesystems (Filestore, EFS, etc.)
What's Changed
- Autoscale based on KubeAI OpenTelemetry active requests metrics by @nstogner in #261
- add resourceProfiles and 405b on A100 80GB by @samos123 in #264
- Refactor e2e tests by @nstogner in #263
- Add Autoscaler State ConfigMap by @nstogner in #268
- add tpu quota to GKE install guide and use values-gke.yaml by @samos123 in #271
- update vllm images to 0.6.3 by @samos123 in #273
- Shared filesystem caching by @nstogner in #272
- add manual test of vLLM on GPU and TPU by @samos123 in #279
Full Changelog: v0.8.0...v0.9.0
helm-chart-models-0.7.0
A Helm chart for Kubernetes
helm-chart-kubeai-0.7.0
Private Open AI Platform for Kubernetes.
kubeai 0.8.0
What's Changed
- fix huggingface secret helm template issue by @samos123 in #246
- fix #235: utilize standard k8s labels by @samos123 in #240
- Initial TPU support by @nstogner in #249
- Add runtimeClassName as optional field in resource profile by @nstogner in #253
- Add example: python Models client by @nstogner in #255
- add llama 3.1 405b model by @samos123 in #254
- Llama 3.2 11B Instruct vision on 1 x L4 GPU by @samos123 in #258
Full Changelog: v0.7.0...v0.8.0