Run open source AI apps in an auto-scaling Kubernetes cluster.
This blog post describes the motivation for this project. Here is a brief summary of the features, capabilities, and intended use:
- Configure and run applications with various GPU requirements.
- Auto-scale GPU nodes in app-specific node pools.
- User account authentication and authorization powered by `keycloak` and `keycloak-gatekeeper`, including support for OpenID Connect, SAML, SSO, etc.
- Individual applications do not need to implement auth. (Requests are routed through gatekeepers running as sidecar containers; see the example after this list.)
- Traefik for reverse proxying and SSL termination.
- Shared object storage, so that users can bring their data to each app.
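With the sidecar approach, auth is handled in front of each app rather than inside it. As a rough illustration (not taken from this repo's manifests; the label selector below is a placeholder), you can confirm the pattern on a running cluster by listing the containers of an app pod and checking that a gatekeeper container sits alongside the app container:

```sh
# List the containers in the first pod matching a (hypothetical) app label.
# With the gatekeeper sidecar pattern you should see the app container plus
# its keycloak-gatekeeper container.
kubectl get pods -l app=some-app \
  -o jsonpath='{range .items[0].spec.containers[*]}{.name}{"\n"}{end}'
```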
These are the most frequently used tools in the development environment:
- `gcloud`
- `kubectl`
- `docker`
Additional dependencies will vary based on the specific apps configured to run.
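For orientation, a typical session with these tools might look like the sketch below; the project, cluster, region, and image names are placeholders, not values from this repo.

```sh
# Authenticate and point kubectl at the cluster
gcloud auth login
gcloud container clusters get-credentials my-cluster --region us-central1
kubectl get nodes

# Build an app image and push it to Artifact Registry
gcloud auth configure-docker us-central1-docker.pkg.dev
docker build -t us-central1-docker.pkg.dev/my-project/my-repo/my-app:latest .
docker push us-central1-docker.pkg.dev/my-project/my-repo/my-app:latest
```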
You'll need a GCP project with GKE, Object Storage, and Artifact Registry enabled.
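One way to set this up on a fresh project is to enable the relevant APIs with `gcloud` (the project ID below is a placeholder):

```sh
gcloud config set project my-project
gcloud services enable \
  container.googleapis.com \
  storage.googleapis.com \
  artifactregistry.googleapis.com
```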
Copy `./etc/example/config.ini` to `./etc/live/config.ini` and configure it.
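That is:

```sh
cp ./etc/example/config.ini ./etc/live/config.ini
# then fill in values for your environment
$EDITOR ./etc/live/config.ini
```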
Copy `./kubernetes-manifests/examples-secrets/*` to `./kubernetes-manifests/secrets/` and configure each secret. Secrets must be base64-encoded. Some helpers are available in `./scripts` (e.g. `./scripts/write-gatekeeper-doodle-secret.sh`).
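The exact keys each secret needs depend on the app, but the encoding step is the same everywhere; a minimal sketch (the secret value shown is a placeholder):

```sh
cp ./kubernetes-manifests/examples-secrets/* ./kubernetes-manifests/secrets/
# base64-encode a single value before placing it in a secret manifest
printf '%s' 'my-client-secret' | base64
# or use one of the repo's helpers
./scripts/write-gatekeeper-doodle-secret.sh
```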
A step-by-step log of what I did to set things up on a fresh GCP project is in `./log/up-and-running.md`. A word of caution: as the project changes, that log will not be updated.