test: Client-side input shape/element validation #7427
base: main
Conversation
* Fix gRPC test failure and refactor * Add gRPC AsyncIO cancellation tests * Better check if a request is cancelled * Use f-string
* Fixing torch version for vllm
* Switch Jetson model TensorRT models generation to container * Adding missed file * Fix typo * Fix typos * Remove extra spaces * Fix typo
* Ensure notify_state_ gets properly destructed * Fix inflight state tracking to properly erase states * Prevent the notify_state from being erased * Wrap notify_state_ object within unique_ptr
* TRTLLM backend post release * TRTLLM backend post release * Update submodule url for permission issue * Update submodule url * Fix up * Not using postbuild function to workaround submodule url permission issue
Co-authored-by: Neelay Shah <[email protected]>
* Minor fix for L0_model_config
* Test with different sizes of CUDA memory pool * Check the server log for error message * Improve debugging * Fix syntax
Co-authored-by: dyastremsky <[email protected]> Co-authored-by: Ryan McCormick <[email protected]>
* Update README and versions for 23.10 branch (#6399)
* Cherry-picking vLLM backend changes (#6404)
* Update build.py to build vLLM backend (#6394)
* Add Python backend when vLLM backend built (#6397)
---------
Co-authored-by: dyastremsky <[email protected]>
* Add documentation on request cancellation (#6403) (#6407)
* Add documentation on request cancellation
* Include python backend
* Update docs/user_guide/request_cancellation.md
* Update docs/user_guide/request_cancellation.md
* Update docs/README.md
* Update docs/user_guide/request_cancellation.md
* Remove inflight term from the main documentation
* Address review comments
* Fix
* Update docs/user_guide/request_cancellation.md
* Fix
---------
Co-authored-by: Iman Tabrizian <[email protected]>
Co-authored-by: Neelay Shah <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
Co-authored-by: Jacky <[email protected]>
* Fixes in request cancellation doc (#6409) (#6410)
* TRT-LLM backend build changes (#6406) (#6430)
* Update url
* Debugging
* Debugging
* Update url
* Fix build for TRT-LLM backend
* Remove TRTLLM TRT and CUDA versions
* Fix up unused var
* Fix up dir name
* Fix cmake patch
* Remove previous TRT version
* Install required packages for example models
* Remove packages that are only needed for testing
* Fixing vllm build (#6433) (#6437)
* Fixing torch version for vllm
Co-authored-by: Olga Andreeva <[email protected]>
* Update TRT-LLM backend url (#6455) (#6460)
* TRTLLM backend post release
* TRTLLM backend post release
* Update submodule url for permission issue
* Update submodule url
* Fix up
* Not using postbuild function to workaround submodule url permission issue
* Remove redundant lines
* Revert "remove redundant lines" (this reverts commit 86be7ad)
* Restore missed lines
* Update build.py
Co-authored-by: Olga Andreeva <[email protected]>
* Update build.py
Co-authored-by: Olga Andreeva <[email protected]>
---------
Co-authored-by: Tanmay Verma <[email protected]>
Co-authored-by: dyastremsky <[email protected]>
Co-authored-by: Iman Tabrizian <[email protected]>
Co-authored-by: Neelay Shah <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
Co-authored-by: Jacky <[email protected]>
Co-authored-by: Kris Hung <[email protected]>
Co-authored-by: Katherine Yang <[email protected]>
Co-authored-by: Olga Andreeva <[email protected]>
…ycle) (#6490) * Test torch allocator gpu memory usage directly rather than global gpu memory for more consistency
* Add testing backend and test * Add test to build / CI. Minor fix on L0_http * Format. Update backend documentation * Fix up * Address comment * Add negative testing * Fix up
…n from test script. (#6499)
* Use postbuild function * Remove updating submodule url
* Added testing for python_backend autocomplete: optional input and model_transaction_policy
Co-authored-by: Francesco Petrini <[email protected]>
* Fixing L0_io
* Bumped vllm version
* Add python-based backends testing
* Add python-based backends CI
* Fix errors
* Add vllm backend
* Fix pre-commit
* Modify test.sh
* Remove vllm_opt qa model
* Remove vLLM backend tests
* Resolve review comments
* Fix pre-commit errors
* Update qa/L0_backend_python/python_based_backends/python_based_backends_test.py
Co-authored-by: Tanmay Verma <[email protected]>
* Remove collect_artifacts_from_subdir function call
---------
Co-authored-by: oandreeva-nv <[email protected]>
Co-authored-by: Tanmay Verma <[email protected]>
… pairs (similar to gRPC)
* Add boost-filesystem
Left a couple questions
chore: PA Migration From Client
force-pushed from f432d41 to 3863c39
force-pushed from 17053e9 to 482409e
qa/L0_input_validation/test.sh (outdated)
}
EOL

cp -r $DATADIR/qa_model_repository/graphdef_object_int32_int32 models/.
Why removed?
Use model "simple_identity" instead to test string inputs.
if client_type == "http":
    triton_client = tritonhttpclient.InferenceServerClient("localhost:8000")
else:
    triton_client = tritongrpcclient.InferenceServerClient("localhost:8001")

# Example using BYTES input tensor with utf-8 encoded string that
# has an embedded null character.
null_chars_array = np.array(
    ["he\x00llo".encode("utf-8") for i in range(16)], dtype=np.object_
)
null_char_data = null_chars_array.reshape([1, 16])
identity_inference(triton_client, null_char_data, True)  # Using binary data
identity_inference(triton_client, null_char_data, False)  # Using JSON data

# Example using BYTES input tensor with 16 elements, where each
# element is a 4-byte binary blob with value 0x00010203. Can use
# dtype=np.bytes_ in this case.
bytes_data = [b"\x00\x01\x02\x03" for i in range(16)]
np_bytes_data = np.array(bytes_data, dtype=np.bytes_)
np_bytes_data = np_bytes_data.reshape([1, 16])
identity_inference(triton_client, np_bytes_data, True)  # Using binary data
identity_inference(triton_client, np_bytes_data, False)  # Using JSON data
What is this testing?
Copied from client/src/python/examples/simple_http_string_infer_client.py. It looks like the example demonstrated two ways of preparing string input data. I'll remove one of them.
inputs[0].set_shape([2, 8])
inputs[1].set_shape([2, 8])

with self.assertRaises(InferenceServerException) as e:
Suggested change:
# If number of elements (volume) is correct but shape is wrong, the core will return an error.
with self.assertRaises(InferenceServerException) as e:
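For context, roughly what that scenario looks like end to end (a minimal sketch, not the exact test code; the model name "simple_identity", the input name "INPUT0", and the wiring are assumptions for illustration):

import numpy as np
import tritonclient.http as httpclient
from tritonclient.utils import InferenceServerException

# Assumed setup: "simple_identity" takes a BYTES input "INPUT0" of shape [1, 16].
triton_client = httpclient.InferenceServerClient("localhost:8000")
data = np.array([b"data" for _ in range(16)], dtype=np.object_).reshape([1, 16])

inputs = [httpclient.InferInput("INPUT0", [1, 16], "BYTES")]
inputs[0].set_data_from_numpy(data)

# Volume is unchanged (2 * 8 == 1 * 16), so the client-side byte-size check
# passes and it is the server core that reports the shape mismatch.
inputs[0].set_shape([2, 8])
try:
    triton_client.infer(model_name="simple_identity", inputs=inputs)
except InferenceServerException as e:
    print(e)  # error comes from the server, not the client-side check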
force-pushed from dff25f4 to 48c9b25
What does the PR do?
Adds a client-side input size check to make sure the byte size implied by the input shape matches the byte size of the input data.
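In rough terms, the check amounts to the following (a minimal sketch with a hypothetical helper; the actual validation lives in the client library changes in triton-inference-server/client#742):

import numpy as np

def validate_input_byte_size(shape, element_byte_size, data_byte_size):
    # Hypothetical helper: the byte size implied by the declared shape
    # must match the byte size of the attached data.
    expected = int(np.prod(shape)) * element_byte_size
    if expected != data_byte_size:
        raise ValueError(
            f"input byte size {data_byte_size} does not match "
            f"expected byte size {expected} for shape {list(shape)}"
        )

# A [2, 8] FP32 tensor must carry 2 * 8 * 4 = 64 bytes.
validate_input_byte_size([2, 8], 4, 64)      # passes
# validate_input_byte_size([2, 8], 4, 60)    # would raise ValueError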
Checklist
<commit_type>: <Title>
Commit Type: check the conventional commit type box here and add the label to the GitHub PR.
Related PRs:
triton-inference-server/client#742
Where should the reviewer start?
Should look at triton-inference-server/client#742 first.
Test plan:
n/a
CI Pipeline ID: 17202351
Caveats:
Shared memory byte size checks for string inputs are not implemented.
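For background (my summary, not text from the PR): the wire size of a BYTES/string tensor cannot be computed from shape and dtype alone, because each element is serialized as a 4-byte length prefix followed by its variable-length bytes. A minimal sketch:

def serialized_bytes_size(elements):
    # Each BYTES element is a 4-byte length prefix plus the raw bytes.
    return sum(4 + len(e) for e in elements)

# Same shape [3], different byte sizes:
print(serialized_bytes_size([b"a", b"bb", b"ccc"]))  # 18
print(serialized_bytes_size([b"", b"", b""]))        # 12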
Background
Stop malformed input requests on the client side before they are sent to the server.
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Relates to #7171