Add setup instructions for TensorRT-LLM #789
Conversation
LGTM, will leave approval for Daya or Megha
1. Convert an MPT HuggingFace checkpoint into the FasterTransformer format.
2. Build a TensorRT engine with the FasterTransformer weights.

Using this engine, you can utilize TensorRT-LLM for fast inference. If you would like to use TensorRT-LLM as an end-to-end solution for an inference service, you can utilize the built engine with an NVIDIA Triton server backend: an example server can be found in [this repository](https://github.com/triton-inference-server/tensorrtllm_backend/tree/v0.6.1) accompanying the most recent release.
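For concreteness, the two steps above might look like the following sketch. The script paths and flag names are assumptions based on the `examples/mpt` directory of the TensorRT-LLM repository around v0.6.x; they change between releases, so check the example README for the version you have checked out.

```bash
# NOTE: script paths and flags below are assumptions (examples/mpt layout
# circa TensorRT-LLM v0.6.x); consult the example README for your release.

# 1. Convert an MPT HuggingFace checkpoint into the FasterTransformer format.
python examples/mpt/convert_hf_mpt_to_ft.py \
    -i mosaicml/mpt-7b \
    -o ./ft_ckpts/mpt-7b \
    --data_type fp16

# 2. Build a TensorRT engine from the converted FasterTransformer weights.
python examples/mpt/build.py \
    --model_dir ./ft_ckpts/mpt-7b/1-gpu \
    --output_dir ./trt_engines/mpt-7b \
    --dtype float16
```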
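Serving the built engine through the linked Triton backend could then look roughly like this; the model-repository layout (`all_models/inflight_batcher_llm`) is an assumption taken from the tensorrtllm_backend examples, and that repository's README is the authoritative reference.

```bash
# Sketch of serving the built engine with the Triton TensorRT-LLM backend.
# The model-repository layout below is an assumption based on the
# tensorrtllm_backend examples and may differ between releases.
git clone -b v0.6.1 https://github.com/triton-inference-server/tensorrtllm_backend.git
cd tensorrtllm_backend

# Place the built engine where the example model repository expects it.
cp ../trt_engines/mpt-7b/* all_models/inflight_batcher_llm/tensorrt_llm/1/

# Launch Triton pointed at the example model repository.
tritonserver --model-repository=all_models/inflight_batcher_llm
```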
was "built engine" supposed to be "built-in engine"?
I'd rephrase it as "built TRT engine". Also, here again we should drop "most recent release" as suggested by Daniel above.
@linden-li can you pls make the suggested changes here? also, update TRT LLM link to v0.7.1?
@megha95 can we merge this?