From ebe8821aca532e927b78e76dcbb0ae6e6d91f585 Mon Sep 17 00:00:00 2001
From: Sindhu Somasundaram <56774226+sindhuvahinis@users.noreply.github.com>
Date: Mon, 20 Nov 2023 17:26:18 -0800
Subject: [PATCH] [doc] Placeholder for TrtLLM tutorial and tuning guide (#1333)
---
 .../lmi/configurations_large_model_inference_containers.md | 2 +-
 serving/docs/lmi/tuning_guides/trtllm_tuning_guide.md | 3 +++
 serving/docs/lmi/tutorials/trtllm_aot_tutorial.md | 3 +++
 3 files changed, 7 insertions(+), 1 deletion(-)
 create mode 100644 serving/docs/lmi/tuning_guides/trtllm_tuning_guide.md
 create mode 100644 serving/docs/lmi/tutorials/trtllm_aot_tutorial.md
diff --git a/serving/docs/lmi/configurations_large_model_inference_containers.md b/serving/docs/lmi/configurations_large_model_inference_containers.md
index fa2255757..4f625b6dd 100644
--- a/serving/docs/lmi/configurations_large_model_inference_containers.md
+++ b/serving/docs/lmi/configurations_large_model_inference_containers.md
@@ -3,7 +3,7 @@
 There are a number of shared configurations for python models running large language models. They are also available through the [Large Model Inference Containers](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#large-model-inference-containers).
-### Common ([doc](https://github.com/deepjavalibrary/djl-serving/blob/521e0edadec35b04ec9e1d51b9e406119efd0235/serving/docs/configurations_large_model_inference_containers.md#common-doc)) +### Common ([doc](https://docs.aws.amazon.com/sagemaker/latest/dg/large-model-inference-configuration.html)) | Item | Required | Description | Example value | |----------------------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------| diff --git a/serving/docs/lmi/tuning_guides/trtllm_tuning_guide.md b/serving/docs/lmi/tuning_guides/trtllm_tuning_guide.md new file mode 100644 index 000000000..c56749430 --- /dev/null +++ b/serving/docs/lmi/tuning_guides/trtllm_tuning_guide.md @@ -0,0 +1,3 @@ +# TensorRT LLM Tuning guide + +This doc recommends the configurations based on your model and instance type. \ No newline at end of file diff --git a/serving/docs/lmi/tutorials/trtllm_aot_tutorial.md b/serving/docs/lmi/tutorials/trtllm_aot_tutorial.md new file mode 100644 index 000000000..31533f14c --- /dev/null +++ b/serving/docs/lmi/tutorials/trtllm_aot_tutorial.md @@ -0,0 +1,3 @@ +# TensorRT LLM ahead of time compilation of models + +This doc helps you to convert your HuggingFace model to Tensorrt-LLM LMI model format to load and run inference with Tensorrt-LLM. 
\ No newline at end of file