
[TrtLLM] Python backend support for T5 model #1680

Merged
merged 4 commits into from
Mar 29, 2024

Conversation

sindhuvahinis
Contributor

@sindhuvahinis sindhuvahinis commented Mar 27, 2024

Description

JIT compilation and other changes are in a separate PR (#1678); they were split out for easier review.

Changes in this PR:

  • Handler support for inference with the python backend. tensorrt_llm_toolkit recognizes that the python backend should be used based on the "use_python_backend" parameter sent with init_inference.
  • If rolling_batch is disabled, the handler automatically assumes the python backend and initializes inference. If the model is not supported by the python backend, tensorrt_llm_toolkit throws an error saying the model is not supported.
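The selection logic described above could be sketched roughly as follows. This is a minimal illustration, not the actual handler code: the function name `select_backend`, the `"triton"` fallback label, and the supported-model set are all hypothetical; only `use_python_backend` and `rolling_batch` come from the PR description.

```python
# Hypothetical sketch of the backend-selection behavior described in this PR.
# Only "use_python_backend" and "rolling_batch" are real parameter names from
# the description; everything else here is illustrative.
SUPPORTED_PYTHON_BACKEND_MODELS = {"t5"}  # per this PR, T5 gains python-backend support


def select_backend(properties: dict) -> str:
    """Decide which TensorRT-LLM backend to use, mirroring the described flow."""
    # If rolling_batch is disabled, the handler assumes the python backend.
    if properties.get("rolling_batch", "disable") == "disable":
        properties["use_python_backend"] = True

    if properties.get("use_python_backend"):
        model_type = properties.get("model_type", "")
        if model_type not in SUPPORTED_PYTHON_BACKEND_MODELS:
            # Mirrors the toolkit raising an error for unsupported models.
            raise ValueError(f"Model '{model_type}' is not supported by the python backend")
        return "python"

    return "triton"  # hypothetical name for the non-python path
```

For example, a T5 model with rolling batch disabled would take the python-backend path, while a model served with rolling batch enabled would bypass it.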

@sindhuvahinis sindhuvahinis merged commit 27b70f1 into deepjavalibrary:master Mar 29, 2024
8 checks passed
@sindhuvahinis sindhuvahinis deleted the trt branch April 4, 2024 17:06