| Architecture | Prerequisites | Getting Started | Pipeline Requests | Uninstall | Learn more |
Kubernetes Pipeline Server lets users deploy multiple instances of Pipeline Server and distributes the workload by routing requests across them. It supports media analytics processing on CPU and/or GPU, visual output via RTSP or WebRTC, and uses Persistent Volumes backed by NFS for model storage.
| Prerequisite | Description |
|---|---|
| Kubernetes | To run this deployment, access to a Kubernetes cluster is required. Instructions for installing a Kubernetes cluster can be found here. |
| Helm | This deployment uses Helm as the package manager to ship Pipeline Server. Instructions to install Helm can be found here. |
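Before proceeding, you can quickly confirm both prerequisites are in place. This is a minimal sanity check, assuming kubectl is already configured to reach your cluster.

# Confirm the cluster is reachable and list its nodes
kubectl get nodes
# Confirm Helm is installed and print its version
helm version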
Step 1: If you have GPU-capable nodes, install the Intel GPU Plugin and Node Feature Discovery (NFD) to enable GPU use in your cluster.
Note: Currently the Intel GPU Plugin is installed via manual scripts, as Helm support for it has not been released yet.
You can set the number of shared devices, which determines how many containers can share the same GPU device. Leaving this blank defaults to 2.
./samples/kubernetes/dependencies/deploy-gpu-plugin.sh <number of shared devices>
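After the script completes, you can optionally verify that the plugin registered the GPU with the kubelet. The resource name gpu.intel.com/i915 below is what the Intel GPU Plugin advertises on GPU nodes; replace <gpu-node-name> with one of your GPU-capable nodes.

# The GPU should show up as an allocatable extended resource on GPU nodes
kubectl describe node <gpu-node-name> | grep gpu.intel.com/i915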
Step 2: Build the Helm dependencies.
cd samples/kubernetes
helm dep up
Step 3: Install pipeline-server into your cluster using the default values.yaml. In this example, dlstreamer is the release name; you can change it to any value you like to denote the release name for this deployment. We will store the release name in the RELEASE_NAME variable, as we will need it in later steps.
RELEASE_NAME=dlstreamer
cd samples/kubernetes
helm install $RELEASE_NAME .
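To verify the installation, you can inspect the release and its pods. The label selector below assumes the chart sets the standard app.kubernetes.io/instance label, the same label used for the HAProxy lookup later in this guide.

# Show release status and check that all pods of this release reach Running
helm status $RELEASE_NAME
kubectl get pods --namespace default -l "app.kubernetes.io/instance=$RELEASE_NAME"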
Once the pods have been deployed, clients can send Pipeline Server requests to the cluster via HAProxy (Ingress Controller). HAProxy is currently configured to use a round-robin algorithm.
HAProxy routes HTTP requests to the appropriate pods. Use kubectl to either port-forward port 80 or set HAProxy to NodePort (NodePort is the default). In this example, we will forward port 80 to port 8080.
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=haproxy,app.kubernetes.io/instance=$RELEASE_NAME" -o jsonpath="{.items[0].metadata.name}")
export CONTAINER_PORT=$(kubectl get pod --namespace default $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
kubectl --namespace default port-forward $POD_NAME 8080:$CONTAINER_PORT
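Alternatively, since NodePort is the default, you can reach HAProxy directly on a node port instead of port-forwarding. This is a sketch: the service name is assumed to follow the $RELEASE_NAME-haproxy convention and may differ in your deployment.

# Look up the NodePort assigned to the HAProxy service (service name is an assumption)
export NODE_PORT=$(kubectl get svc --namespace default $RELEASE_NAME-haproxy -o jsonpath="{.spec.ports[0].nodePort}")
export NODE_IP=$(kubectl get nodes -o jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
echo "http://$NODE_IP:$NODE_PORT"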
As an example, the following curl request starts processing the homes_00425.mkv media file with the object_detection/person_vehicle_bike pipeline. This command can be issued multiple times to start multiple concurrent pipelines on the cluster. Open a new terminal and run this command.
In this example, we use localhost because we port-forwarded to localhost in the previous step.
curl http://localhost:8080/pipelines/object_detection/person_vehicle_bike -X POST -H \
'Content-Type: application/json' -d \
'{
  "source": {
    "uri": "https://lvamedia.blob.core.windows.net/public/homes_00425.mkv",
    "type": "uri"
  }
}'
b85bcc1c4ae711ed8c79aa43cc2acc79
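The response body is the id of the new pipeline instance. For scripting, you can capture it in a shell variable when issuing the request, for example:

# Same request as above, storing the returned instance id for later use
INSTANCE_ID=$(curl -s -X POST http://localhost:8080/pipelines/object_detection/person_vehicle_bike \
  -H 'Content-Type: application/json' \
  -d '{"source": {"uri": "https://lvamedia.blob.core.windows.net/public/homes_00425.mkv", "type": "uri"}}')
echo $INSTANCE_ID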
To check on pipelines, you can use the /pipelines/status endpoint, which queries the status of all running pipelines in the cluster.
curl http://localhost:8080/pipelines/status
[
  {
    "avg_fps": 27.59946173763987,
    "elapsed_time": 1.4492945671081543,
    "id": "b85bcc1c4ae711ed8c79aa43cc2acc79",
    "message": "",
    "start_time": 1665659462.319232,
    "state": "RUNNING"
  }
]
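A running pipeline can also be stopped through the same routed API. The DELETE route below follows the Pipeline Server REST API convention of addressing an instance by pipeline name, version, and instance id; adjust it if your Pipeline Server version exposes a different route.

# Stop the pipeline instance started earlier (INSTANCE_ID as captured above)
curl -X DELETE http://localhost:8080/pipelines/object_detection/person_vehicle_bike/$INSTANCE_ID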
The Kubernetes cluster is compatible with the Pipeline Server REST API; see the examples in the Running a Pipeline section of README.md. REST requests are routed to nodes using the round-robin algorithm.
When an inference device is specified (see Change Inference Accelerator Device), the request is routed to a node with that capability.
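For example, a request like the following would be routed to a GPU-capable node. The detection-device parameter name is an illustrative assumption; the parameters a pipeline accepts depend on its pipeline definition.

# Hypothetical request targeting GPU inference; the parameter name depends on the pipeline definition
curl http://localhost:8080/pipelines/object_detection/person_vehicle_bike -X POST -H \
'Content-Type: application/json' -d \
'{
  "source": {
    "uri": "https://lvamedia.blob.core.windows.net/public/homes_00425.mkv",
    "type": "uri"
  },
  "parameters": {
    "detection-device": "GPU"
  }
}'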
Step 1: Uninstall the deployment with Helm.
helm uninstall dlstreamer
Step 2: Uninstall Intel GPU Plugin and NFD
./samples/kubernetes/dependencies/remove-gpu-plugin.sh
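To confirm the cleanup, you can check that the release is no longer listed and that its pods are gone:

# Neither the release nor its pods should appear after uninstalling
helm list --namespace default
kubectl get pods --namespace default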
There are various examples and documentation under examples and docs to help you understand how Kubernetes works with Pipeline Server.
| Examples & Tutorials | Definition |
|---|---|
| Sharing Models, Pipelines & Extensions between Pods | Tutorial on using a Persistent Volume to share models, pipelines, and extensions between Pods |
| Visualizing Inference Output | Tutorial on viewing the inference output from pipelines |
| Values.yaml | Documentation explaining values.yaml |
| Securing Kubernetes with HTTPS | Demo of running Kubernetes with HTTPS |
| Stream Density example | Running various streams using Pipeline Client |