triton-server-learning

Triton Inference Server, provided by NVIDIA, offers good examples and reference material for GPU coding and TensorRT engine building.

This repo is used to record my understanding and questions while reading the source code.

To load the ensemble example models:

  • start the tritonserver build container (my build, for r20.07)
# nvidia-docker  run -e NVIDIA_VISIBLE_DEVICES=0 --privileged -it -p8000:8000 -p8001:8001 -p8002:8002 -v $(pwd)/src:/workspace/src tritonserver_build:r20.07 bash
  • download example models
# cd /workspace/docs/examples/
# ./fetch_models.sh
# mv model_repository/resnet50_netdef/1/ ensemble_model_repository/resnet50_netdef/
# apt-get update && apt-get install -y libopencv-highgui-dev
  • create the version directory for the ensemble model
# mkdir -p ./ensemble_model_repository/preprocess_resnet50_ensemble/1

  • the working directory looks like this:

# tree ./ensemble_model_repository/
./ensemble_model_repository/
|-- image_preprocess_nchw_3x224x224_inception
|   |-- 1
|   |   `-- libimagepreprocess.so
|   `-- config.pbtxt
|-- preprocess_resnet50_ensemble
|   |-- 1
|   `-- config.pbtxt
`-- resnet50_netdef
    |-- 1
    |   |-- init_model.netdef
    |   `-- model.netdef
    |-- config.pbtxt
    `-- resnet50_labels.txt

6 directories, 7 files
  • start triton server
# /opt/tritonserver/bin/tritonserver --model-repository=./ensemble_model_repository/

To do inference with cURL against the triton-inference-server URL v2/models/<model_name>/infer:

  • prepend the image file size 1005970 (0x000f5992) as a 4-byte prefix
# printf "\x00\x0f\x59\x92" | cat - ./images/mug.jpg > stuff_mug
# ls -l stuff_mug
-rw-r--r-- 1 root root 1005974 Feb  3 01:35 stuff_mug
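
The 4-byte prefix written by printf above is just the image size packed as a 4-byte unsigned integer. A minimal Python sketch of the same computation, using the file size reported above:

```python
import struct

# Size of ./images/mug.jpg as reported above.
image_size = 1005970  # 0x000f5992

# Pack the size as a 4-byte big-endian unsigned int, matching the
# "\x00\x0f\x59\x92" prefix produced by printf in the step above.
prefix = struct.pack(">I", image_size)
print(prefix.hex())  # 000f5992
```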
  • prepare the inference header in JSON format, then append the size-prefixed image data to it:
# cat postdata.json
{"inputs":[{"name":"INPUT","shape":[1,1],"datatype":"BYTES","parameters":{"binary_data_size":<size of file stuff_mug, 1005974 here>}}],"outputs":[{"name":"OUTPUT","parameters":{"classification":3,"binary_data":true}}]}
# ls -l postdata.json
-rw-r--r-- 1 root root 188 Feb  3 01:35 postdata.json
# printf "\x00\x0f\x59\x92" | cat - ./images/mug.jpg >> postdata.json
# ls -l postdata.json
-rw-r--r-- 1 root root 1006161 Jan  6 05:50 postdata.json
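
The printf/cat steps above can also be sketched in Python. This is a sketch only, not an official client: the field names mirror the JSON above, the length prefix uses the same byte order as the printf step, and the key point is that `binary_data_size` in the JSON header must equal the size of the binary payload (prefix plus image bytes):

```python
import json
import struct

def build_request_body(image_bytes: bytes):
    """Return (body, header_len) for the binary-tensor request above.

    body is the JSON inference header followed immediately by the
    length-prefixed image bytes; header_len is the value to pass as
    Inference-Header-Content-Length.
    """
    # 4-byte size prefix plus raw image, as built with printf/cat above.
    payload = struct.pack(">I", len(image_bytes)) + image_bytes
    header = json.dumps({
        "inputs": [{
            "name": "INPUT",
            "shape": [1, 1],
            "datatype": "BYTES",
            "parameters": {"binary_data_size": len(payload)},
        }],
        "outputs": [{
            "name": "OUTPUT",
            "parameters": {"classification": 3, "binary_data": True},
        }],
    }).encode()
    # Content-Length of the whole request is len(header) + len(payload).
    return header + payload, len(header)
```

With the real image, `header_len` is the 188 passed to `Inference-Header-Content-Length` in the curl call below, and the full body is what `--data-binary "@postdata.json"` sends.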
  • do inference
# curl -X POST -H "Content-Type: application/octet-stream" -H "Inference-Header-Content-Length: <sizeof original postdata.json, 188 here>" -H "Content-Length: <size of final postdata.json, 1006161 here>" -H "Accept: */*" localhost:8000/v2/models/<model_name>/infer --data-binary "@postdata.json" -vv -o /workspace/myoutput
# cat /workspace/myoutput
{"model_name":"<model_name>","model_version":"1","outputs":[{"name":"OUTPUT","datatype":"BYTES","shape":[1,3],"parameters":{"binary_data_size":72}}]}0.723992:504:COFFEE MUG0.270952:968:CUP0.001160:967:ESPRESSO#
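
The response above is the JSON header followed by 72 bytes of binary output, and 72 is exactly the three result strings (23 + 16 + 21 bytes) plus a 4-byte length prefix per element. So the binary section can be decoded by walking length-prefixed strings. A sketch, assuming the same byte order as the request prefix above (check the binary tensor extension docs for your Triton version):

```python
import struct

def decode_bytes_tensor(buf: bytes, byte_order: str = ">"):
    """Decode a binary BYTES tensor: each element is a 4-byte length
    followed by that many bytes. The byte order is an assumption here,
    chosen to match the request example above."""
    out, pos = [], 0
    while pos < len(buf):
        (n,) = struct.unpack_from(byte_order + "I", buf, pos)
        pos += 4
        out.append(buf[pos:pos + n].decode())
        pos += n
    return out
```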

To generate non binary JSON file for inference

  • to generate a non-binary JSON file, refer to the Python script in this repository
  • do inference
# curl --location -X POST 'http://127.0.0.1:8000/v2/models/resnet50_netdef/versions/1/infer' -d @mypostdata.json
{"model_name":"resnet50_netdef","model_version":"1","outputs":[{"name":"gpu_0/softmax","datatype":"BYTES","shape":[1,3],"data":["0.686651:504:COFFEE MUG","0.308505:968:CUP","0.001335:505:COFFEEPOT"]}]}
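
Each classification string returned above has the form `<score>:<class-index>:<label>`; splitting on the first two colons recovers the fields (labels such as "COFFEE MUG" may themselves contain spaces, so only the first two colons are separators):

```python
def parse_classification(s: str):
    """Split a Triton classification result "score:index:label"
    into (float score, int index, str label)."""
    score, index, label = s.split(":", 2)
    return float(score), int(index), label

print(parse_classification("0.686651:504:COFFEE MUG"))
# (0.686651, 504, 'COFFEE MUG')
```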
