[TENSORRT-LLM] - Implement new looper thread based backend #2357

Merged
Narsil merged 63 commits into main from trtllm-executor-thread
Oct 25, 2024

Conversation

mfuntowicz
Member

The current backend implementation relies on a locking mechanism to access the executor on the C++ side from within each tokio request context thread.

This locking causes heavy contention for every client except the one that has already acquired the lock in write mode.
This is inherent to tokio's RwLock implementation, which gives priority to writers.

This PR implements a new TensorRT-LLM backend based on a background thread.
Everything related to the executor happens within this single background thread, so neither Sync nor an RwLock is needed to make it work.

The only required trait is Send, which allows the backend to be moved from the creator thread to the background thread so that backend creation issues can be caught.
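
For readers unfamiliar with the pattern, here is a minimal sketch of a looper-thread design using only the Rust standard library. It is not the code from this PR: `MockExecutor`, `Request`, `LooperHandle`, `spawn_looper` and `generate` are hypothetical names, the executor is a stand-in for the C++ one, and constructing it on the creator thread before moving it is an assumption about how creation errors are surfaced.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the C++ executor. It is only required to be
/// `Send`, since it is moved once into the background thread.
struct MockExecutor {
    generated: u64,
}

impl MockExecutor {
    /// `fail` simulates a backend creation issue (illustrative only).
    fn new(fail: bool) -> Result<Self, String> {
        if fail {
            Err("executor creation failed".to_string())
        } else {
            Ok(MockExecutor { generated: 0 })
        }
    }

    /// Needs `&mut self`: with a single owning thread, no lock is required.
    fn submit(&mut self, prompt: &str) -> String {
        self.generated += 1;
        format!("#{}: {}", self.generated, prompt.to_uppercase())
    }
}

/// A request carries its own reply channel back to the caller.
struct Request {
    prompt: String,
    reply: Sender<String>,
}

struct LooperHandle {
    requests: Sender<Request>,
    thread: JoinHandle<u64>,
}

/// Assumption: build the executor on the creator thread so that a creation
/// error is returned to the caller, then move it into the looper thread.
fn spawn_looper(fail: bool) -> Result<LooperHandle, String> {
    let mut executor = MockExecutor::new(fail)?;
    let (tx, rx) = mpsc::channel::<Request>();

    let thread = thread::spawn(move || {
        // The loop ends once every sender has been dropped.
        for req in rx {
            let out = executor.submit(&req.prompt);
            // The caller may have gone away; ignore a failed send.
            let _ = req.reply.send(out);
        }
        executor.generated
    });

    Ok(LooperHandle { requests: tx, thread })
}

/// Client side: send a request and block until the looper replies.
fn generate(handle: &LooperHandle, prompt: &str) -> Option<String> {
    let (reply_tx, reply_rx) = mpsc::channel();
    let request = Request { prompt: prompt.to_string(), reply: reply_tx };
    handle.requests.send(request).ok()?;
    reply_rx.recv().ok()
}
```

Clients only ever touch the channel sender, which can be cloned freely, so they queue requests instead of contending on a lock around the executor. In an async context the blocking `recv` would typically be replaced by an async channel; that part is left out of this sketch.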

mfuntowicz and others added 29 commits October 21, 2024 09:57
# Conflicts:
#	backends/trtllm/src/backend.rs
@mfuntowicz mfuntowicz force-pushed the trtllm-executor-thread branch from 7f02f49 to fb00f98 Compare October 21, 2024 10:31
@mfuntowicz mfuntowicz marked this pull request as ready for review October 21, 2024 10:31
Narsil
Narsil previously approved these changes Oct 25, 2024
Collaborator

@Narsil Narsil left a comment

LGTM.

@Narsil Narsil merged commit 43df056 into main Oct 25, 2024
8 checks passed
@Narsil Narsil deleted the trtllm-executor-thread branch October 25, 2024 05:17
2 participants