Document servingruntime constraint introduced by kserve/kserve#3181 (#320)

* Document serving runtime constraint introduced by kserve/kserve#3181

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Set content type for predict/explainer curl requests

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>

* Update docs/modelserving/servingruntimes.md

Signed-off-by: Dan Sun <[email protected]>

---------

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Dan Sun <[email protected]>
sivanantha321 and yuzisun authored Nov 26, 2023
1 parent f8e777d commit e485542
Showing 22 changed files with 57 additions and 36 deletions.
2 changes: 1 addition & 1 deletion docs/admin/serverless/kourier_networking/README.md
@@ -130,7 +130,7 @@ Send a prediction request to the InferenceService and check the output.
MODEL_NAME=pmml-demo
INPUT_PATH=@./pmml-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice pmml-demo -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
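
The reason every one of these curl commands gains `-H "Content-Type: application/json"`: when given `-d`, curl defaults to form encoding rather than JSON, which JSON-expecting model servers can reject or misparse. A minimal sketch of the default behavior (the local endpoint and model name "demo" are hypothetical, assuming one of the predictors above is port-forwarded to localhost:8080):

```bash
# With -d and no explicit Content-Type, curl falls back to form encoding.
# Endpoint and model name are made up, for illustration only.
curl -sv http://localhost:8080/v1/models/demo:predict \
  -d '{"instances": [[1.0, 2.0]]}' 2>&1 | grep -i '> content-type'
# > Content-Type: application/x-www-form-urlencoded
```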
4 changes: 2 additions & 2 deletions docs/modelserving/explainer/alibi/income/README.md
@@ -57,7 +57,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice income -o jsonpath='{.status.url
Test the predictor:

```bash
curl -H "Host: $SERVICE_HOSTNAME" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d '{"instances":[[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]]}'
curl -H "Host: $SERVICE_HOSTNAME" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d '{"instances":[[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]]}'
```

You should receive the response showing the prediction is for low salary:
@@ -71,7 +71,7 @@ You should receive the response showing the prediction is for low salary:
Now let's get an explanation for this:

```bash
curl -H "Host: $SERVICE_HOSTNAME" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:explain -d '{"instances":[[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]]}'
curl -H "Host: $SERVICE_HOSTNAME" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:explain -d '{"instances":[[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]]}'
```

The returned explanation will be like:
8 changes: 4 additions & 4 deletions docs/modelserving/explainer/alibi/moviesentiment/README.md
@@ -56,7 +56,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice moviesentiment -o jsonpath='{.st
Test the predictor on an example sentence:

```bash
curl -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d '{"instances":["a visually flashy but narratively opaque and emotionally vapid exercise ."]}'
curl -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d '{"instances":["a visually flashy but narratively opaque and emotionally vapid exercise ."]}'
```

You should receive the response showing negative sentiment:
@@ -69,7 +69,7 @@ You should receive the response showing negative sentiment:
Test on another sentence:

```bash
curl -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d '{"instances":["a touching , sophisticated film that almost seems like a documentary in the way it captures an italian immigrant family on the brink of major changes ."]}'
curl -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d '{"instances":["a touching , sophisticated film that almost seems like a documentary in the way it captures an italian immigrant family on the brink of major changes ."]}'
```

You should receive the response showing positive sentiment:
@@ -83,7 +83,7 @@ Now let's get an explanation for the first sentence:


```bash
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:explain -d '{"instances":["a visually flashy but narratively opaque and emotionally vapid exercise ."]}'
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:explain -d '{"instances":["a visually flashy but narratively opaque and emotionally vapid exercise ."]}'
```

!!! success "Expected Output"
@@ -234,7 +234,7 @@ kubectl create -f moviesentiment2.yaml
and then ask for an explanation:

```bash
curl -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:explain -d '{"instances":["a visually flashy but narratively opaque and emotionally vapid exercise ."]}'
curl -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:explain -d '{"instances":["a visually flashy but narratively opaque and emotionally vapid exercise ."]}'
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/inference_graph/image_pipeline/README.md
@@ -146,7 +146,7 @@ The first step is to [determine the ingress IP and ports](../../../get_started/f
Now, you can test the inference graph by sending the [cat](cat.json) and [dog image data](dog.json).
```bash
SERVICE_HOSTNAME=$(kubectl get inferencegraph dog-breed-pipeline -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT} -d @./cat.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT} -d @./cat.json
```
!!! success "Expected Output"
```{ .json .no-copy }
4 changes: 2 additions & 2 deletions docs/modelserving/logger/logger.md
@@ -77,7 +77,7 @@ MODEL_NAME=sklearn-iris
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
@@ -283,7 +283,7 @@ MODEL_NAME=sklearn-iris
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
21 changes: 21 additions & 0 deletions docs/modelserving/servingruntimes.md
@@ -230,6 +230,27 @@ will be used for model deployment.
- A serving runtime with a specified priority takes precedence over a serving runtime with no priority specified.
- Two model formats with the same name and same model version cannot have the same priority.
- If more than one serving runtime supports the model format and none of them specifies a priority, there is no guarantee _which_ runtime will be selected.
- If a serving runtime supports multiple versions of a modelFormat, each version should have the same priority.
  For example, the serving runtime shown below supports two versions of sklearn, so both entries carry the same priority.
```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: mlserver
spec:
  protocolVersions:
    - v2
  supportedModelFormats:
    - name: sklearn
      version: "0"
      autoSelect: true
      priority: 2
    - name: sklearn
      version: "1"
      autoSelect: true
      priority: 2
  ...
```
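
For contrast, here is a sketch of a runtime that violates this rule (the name `mlserver-mixed` is hypothetical, purely for illustration): the two sklearn entries carry different priorities, which the validation introduced by kserve/kserve#3181 is meant to reject.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: mlserver-mixed  # hypothetical runtime, for illustration only
spec:
  protocolVersions:
    - v2
  supportedModelFormats:
    - name: sklearn
      version: "0"
      autoSelect: true
      priority: 1  # differs from the sklearn "1" entry below: invalid
    - name: sklearn
      version: "1"
      autoSelect: true
      priority: 2
```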

!!! warning
    If multiple runtimes list the same format and/or version as auto-selectable and no priority is specified, the runtime is selected based on the `creationTimestamp`, i.e. the most recently created runtime is selected, so there is no guarantee _which_ runtime will be selected.
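
To see which cluster-scoped runtimes would compete in this tie-breaker, a quick check along these lines may help (a sketch, assuming the KServe CRDs are installed):

```bash
# List cluster serving runtimes with their creation timestamps; when
# priorities tie or are missing, the most recently created runtime wins
# auto-selection.
kubectl get clusterservingruntimes \
  -o custom-columns=NAME:.metadata.name,CREATED:.metadata.creationTimestamp
```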
2 changes: 1 addition & 1 deletion docs/modelserving/storage/azure/azure.md
@@ -108,7 +108,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-azure -o jsonpath='{.sta

MODEL_NAME=sklearn-azure
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/storage/pvc/pvc.md
@@ -134,7 +134,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-pvc -o jsonpath='{.statu

MODEL_NAME=sklearn-pvc
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/storage/s3/s3.md
@@ -131,7 +131,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice mnist-s3 -o jsonpath='{.status.u

MODEL_NAME=mnist-s3
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
4 changes: 2 additions & 2 deletions docs/modelserving/storage/uri/uri.md
@@ -137,7 +137,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-from-uri -o jsonpath='{.

MODEL_NAME=sklearn-from-uri
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
@@ -265,7 +265,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice tensorflow-from-uri-gzip -o json

MODEL_NAME=tensorflow-from-uri-gzip
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/v1beta1/amd/README.md
@@ -100,7 +100,7 @@ Assuming that `INGRESS_HOST`, `INGRESS_PORT`, and `SERVICE_HOSTNAME` have been d
```bash
export MODEL_NAME=mnist
export INPUT_DATA=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d ${INPUT_DATA}
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d ${INPUT_DATA}
```

This shows the response from the server in KServe's v2 API format.
4 changes: 2 additions & 2 deletions docs/modelserving/v1beta1/custom/custom_model/README.md
@@ -81,7 +81,7 @@ docker run -ePORT=8080 -p8080:8080 ${DOCKER_USER}/custom-model:v1

Send a test inference request locally with [input.json](./input.json)
```bash
- curl localhost:8080/v1/models/custom-model:predict -d @./input.json
+ curl -H "Content-Type: application/json" localhost:8080/v1/models/custom-model:predict -d @./input.json
```
!!! success "Expected Output"
```{ .json .no-copy }
@@ -146,7 +146,7 @@ MODEL_NAME=custom-model
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d $INPUT_PATH
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/v1beta1/lightgbm/README.md
@@ -101,7 +101,7 @@ To test the deployed model the first step is to [determine the ingress IP and po
MODEL_NAME=lightgbm-iris
INPUT_PATH=@./iris-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice lightgbm-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/v1beta1/pmml/README.md
@@ -63,7 +63,7 @@ You can see an example payload below. Create a file named `iris-input.json` with
MODEL_NAME=pmml-demo
INPUT_PATH=@./iris-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice pmml-demo -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
4 changes: 2 additions & 2 deletions docs/modelserving/v1beta1/rollout/canary-example.md
@@ -249,13 +249,13 @@ MODEL_NAME=sklearn-iris
curl the latest revision

```bash
curl -v -H "Host: latest-${MODEL_NAME}-predictor-default.kserve-test.example.com" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d @./iris-input.json
curl -v -H "Host: latest-${MODEL_NAME}-predictor-default.kserve-test.example.com" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d @./iris-input.json

```

or curl the previous revision

```bash
curl -v -H "Host: prev-${MODEL_NAME}-predictor-default.kserve-test.example.com" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d @./iris-input.json
curl -v -H "Host: prev-${MODEL_NAME}-predictor-default.kserve-test.example.com" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d @./iris-input.json

```
2 changes: 1 addition & 1 deletion docs/modelserving/v1beta1/spark/README.md
@@ -109,7 +109,7 @@ You can see an example payload below. Create a file named `iris-input.json` with
MODEL_NAME=spark-pmml
INPUT_PATH=@./iris-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice spark-pmml -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/v1beta1/tensorflow/README.md
@@ -60,7 +60,7 @@ file can be downloaded [here](./input.json).
MODEL_NAME=flower-sample
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice ${MODEL_NAME} -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```

!!! success "Expected Output"
12 changes: 6 additions & 6 deletions docs/modelserving/v1beta1/torchserve/README.md
@@ -158,7 +158,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve -o jsonpath='{.status
You can use [image converter](https://github.com/kserve/kserve/tree/master/docs/samples/v1beta1/torchserve/v1/imgconv) to convert the images to base64 byte array, for other models please refer to [input request](https://github.com/pytorch/serve/tree/master/kubernetes/kserve/kf_request_json).

```bash
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json
```

!!! success "Expected Output"
@@ -194,7 +194,7 @@ curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1
To get model explanation:

```bash
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/mnist:explain -d @./mnist.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/mnist:explain -d @./mnist.json
```

!!! success "Expected Output"
@@ -261,7 +261,7 @@ SERVICE_HOSTNAME=$(kubectl get inferenceservice torchserve-mnist-v2 -o jsonpath=
You can send both **byte array** and **tensor** with v2 protocol, for byte array use [image converter](https://github.com/kserve/kserve/tree/master/docs/samples/v1beta1/torchserve/v2/bytes_conv) to convert the image to byte array input. Here we use the [mnist_v2_bytes.json](./mnist_v2_bytes.json) file to run an example inference.

```bash
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2_bytes.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2_bytes.json
```

!!! success "Expected Output"
@@ -273,7 +273,7 @@ For tensor input use the [tensor image converter](https://github.com/kserve/kser
tensor input and here we use the [mnist_v2.json](./mnist_v2.json) file to run an example inference.

```bash
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/${MODEL_NAME}/infer -d @./mnist_v2.json
```

!!! success "Expected Output"
@@ -287,7 +287,7 @@ To get the model explanation with v2 explain endpoint:

```bash
MODEL_NAME=mnist
curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/mnist/explain -d @./mnist_v2.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/mnist/explain -d @./mnist_v2.json
```

!!! success "Expected Output"
@@ -564,7 +564,7 @@ kubectl patch isvc torchserve --type='json' -p '[{"op": "replace", "path": "/spe
```

```bash
curl -v -H "Host: latest-torchserve-predictor-default.default.example.com" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json
curl -v -H "Host: latest-torchserve-predictor-default.default.example.com" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict -d @./mnist.json
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/v1beta1/transformer/feast/README.md
@@ -275,7 +275,7 @@ MODEL_NAME=sklearn-driver-transformer
INPUT_PATH=@./driver-input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice $SERVICE_NAME -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
```

!!! success "Expected output"
@@ -182,7 +182,7 @@ MODEL_NAME=mnist
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice $SERVICE_NAME -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
```

!!! success "Expected Output"
@@ -306,7 +306,7 @@ MODEL_NAME=cifar10
INPUT_PATH=@./image.json
SERVICE_HOSTNAME=$(kubectl get inferenceservice $SERVICE_NAME -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
```

!!! success "Expected Output"
2 changes: 1 addition & 1 deletion docs/modelserving/v1beta1/triton/bert/README.md
@@ -170,7 +170,7 @@ MODEL_NAME=bert-v2
INPUT_PATH=@./input.json
SERVICE_HOSTNAME=$(kubectl get inferenceservices bert-v2 -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" -d $INPUT_PATH http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict
```

!!! success "Expected output"