Autoscale Sample

A demonstration of the autoscaling capabilities of a Knative Serving Revision.

Prerequisites

A Kubernetes cluster with Knative Serving installed.
A metrics installation for viewing scaling graphs (optional).
Install Docker.
Check out the code:

go get -d github.com/knative/docs/serving/samples/autoscale-go

Setup

Build the application container and publish it to a container registry:

Move into the root of the docs repository (the commands below use paths relative to it):
```
cd $GOPATH/src/github.com/knative/docs
```
Set your preferred container registry:
```
export REPO="gcr.io/<YOUR_PROJECT_ID>"
```
- This example uses Google Container Registry (GCR). You need a Google Cloud project with the Google Container Registry API enabled.

Use Docker to build your application container:

docker build \
  --tag "${REPO}/serving/samples/autoscale-go" \
  --file=serving/samples/autoscale-go/Dockerfile .

Push your container to a container registry:

docker push "${REPO}/serving/samples/autoscale-go"

Replace the image reference in service.yaml with your published image:

perl -pi -e \
"s@github.com/knative/docs/serving/samples/autoscale-go@${REPO}/serving/samples/autoscale-go@g" \
serving/samples/autoscale-go/service.yaml

Deploy the Service

Deploy the Knative Serving sample:

kubectl apply -f serving/samples/autoscale-go/service.yaml

Find the IP address of the ingress gateway and export it as an environment variable:

export IP_ADDRESS=$(kubectl get svc knative-ingressgateway -n istio-system -o jsonpath="{.status.loadBalancer.ingress[*].ip}")

View the Autoscaling Capabilities

Make a request to the autoscale app to see it consume some resources.

curl --header "Host: autoscale-go.default.example.com" "http://${IP_ADDRESS?}?sleep=100&prime=1000000&bloat=50"

Allocated 50 Mb of memory.
The largest prime less than 1000000 is 999983.
Slept for 100.13 milliseconds.
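
The curl command above sends the request to the ingress IP address and uses the Host header to select the service. A minimal Go sketch of the same request follows. The function name `newAutoscaleRequest` and the IP address `203.0.113.10` are placeholders introduced for illustration; they are not part of the sample.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// newAutoscaleRequest builds a GET request equivalent to the curl command:
// the URL points at the ingress IP and the Host header names the service.
// Hypothetical helper for illustration; not part of the sample code.
func newAutoscaleRequest(ip, host string, sleepMs, prime, bloatMb int) (*http.Request, error) {
	q := url.Values{}
	q.Set("sleep", strconv.Itoa(sleepMs)) // milliseconds to sleep
	q.Set("prime", strconv.Itoa(prime))   // upper bound for the prime search
	q.Set("bloat", strconv.Itoa(bloatMb)) // megabytes of memory to allocate

	u := url.URL{Scheme: "http", Host: ip, RawQuery: q.Encode()}
	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	// Overriding req.Host sets the Host header without changing the
	// address the client connects to.
	req.Host = host
	return req, nil
}

func main() {
	// 203.0.113.10 is a placeholder; use the value of IP_ADDRESS instead.
	req, err := newAutoscaleRequest("203.0.113.10", "autoscale-go.default.example.com", 100, 1000000, 50)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Host, req.URL.String())
}
```

Passing the request to `http.DefaultClient.Do(req)` would send it; the sketch stops before that so it runs without a cluster.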

Ramp up traffic to maintain 10 in-flight requests.

go run serving/samples/autoscale-go/test/test.go -sleep 100 -prime 1000000 -bloat 50 -qps 9999 -concurrency 10

REQUEST STATS:
Total: 34       Inflight: 10    Done: 34        Success Rate: 100.00%   Avg Latency: 0.2584 sec
Total: 69       Inflight: 10    Done: 35        Success Rate: 100.00%   Avg Latency: 0.2750 sec
Total: 108      Inflight: 10    Done: 39        Success Rate: 100.00%   Avg Latency: 0.2598 sec
Total: 148      Inflight: 10    Done: 40        Success Rate: 100.00%   Avg Latency: 0.2565 sec
Total: 185      Inflight: 10    Done: 37        Success Rate: 100.00%   Avg Latency: 0.2624 sec
...

Note: Use CTRL+C to exit the load test.

Watch the Knative Serving deployment pod count increase.
```
kubectl get deploy --watch
```
Note: Use CTRL+C to exit watch mode.

Analysis

Algorithm

Knative Serving autoscaling is based on the average number of in-flight requests per pod (concurrency). The system has a default target concurrency of 1.0.

For example, if a Revision receives 35 requests per second, each of which takes about 0.25 seconds, Knative Serving determines that the Revision needs about 9 pods:

35 * .25 = 8.75
ceil(8.75) = 9
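
The arithmetic above can be sketched as a small Go function. Dividing by the target concurrency is an assumption that generalizes the example beyond the default target of 1.0, and `desiredPods` is a hypothetical name; this is not the autoscaler's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// desiredPods estimates the pod count from the request rate, the average
// request latency, and the per-pod concurrency target.
// Hypothetical helper for illustration; not the Knative autoscaler's code.
func desiredPods(requestsPerSecond, avgLatencySeconds, targetConcurrency float64) int {
	// Average in-flight requests = arrival rate * time each request takes.
	inFlight := requestsPerSecond * avgLatencySeconds
	// Spread the in-flight requests so each pod stays at or below the target.
	return int(math.Ceil(inFlight / targetConcurrency))
}

func main() {
	// 35 requests per second at 0.25 seconds each, target concurrency 1.0.
	fmt.Println(desiredPods(35, 0.25, 1.0)) // prints 9
}
```

Under the same assumptions, 100 requests per second at 0.5 seconds each would call for about 50 pods.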

Dashboards

View the Knative Serving Scaling and Request dashboards (if configured).

kubectl port-forward -n monitoring $(kubectl get pods -n monitoring --selector=app=grafana --output=jsonpath="{.items..metadata.name}") 3000

Other Experiments

Maintain 100 concurrent requests.

go run serving/samples/autoscale-go/test/test.go -qps 9999 -concurrency 100

Maintain 100 qps with fast requests.

go run serving/samples/autoscale-go/test/test.go -qps 100 -concurrency 9999

Maintain 100 qps with slow requests.

go run serving/samples/autoscale-go/test/test.go -qps 100 -concurrency 9999 -sleep 500

Heavy CPU usage.

go run serving/samples/autoscale-go/test/test.go -qps 9999 -concurrency 10 -prime 40000000

Heavy memory usage.

go run serving/samples/autoscale-go/test/test.go -qps 9999 -concurrency 5 -bloat 1000

Cleanup

kubectl delete -f serving/samples/autoscale-go/service.yaml
