How to use two models in same inference code #2991
Unanswered · siddhantwaghjale asked this question in Q&A
Replies: 1 comment
-
I don't think this is currently feasible with vLLM (please correct me if I'm wrong). But you can try searching for "LLM gateway" on GitHub.
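A common workaround is to give each model its own operating-system process, so each one initializes its tensor-parallel state from scratch. Below is a minimal sketch of that pattern. The commented-out `LLM`/`SamplingParams` usage is an assumption about vLLM's Python API and requires a GPU; a stand-in `print` is used so the sketch runs anywhere:

```python
import subprocess
import sys

def run_in_fresh_process(code: str) -> str:
    """Run a snippet in a brand-new Python interpreter, so vLLM's
    tensor-parallel group is initialized at most once per process."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# In a real script, each snippet would load a different model, e.g.
# (hypothetical vLLM usage -- adjust to your vLLM version):
#   from vllm import LLM, SamplingParams
#   llm = LLM(model="facebook/opt-125m")
#   out = llm.generate(["Hello"], SamplingParams(max_tokens=8))
#   print(out[0].outputs[0].text)
# Stand-ins are used here so the sketch runs without a GPU:
snippet_a = 'print("output from model A")'
snippet_b = 'print("output from model B")'

print(run_in_fresh_process(snippet_a))  # each call gets a clean process
print(run_in_fresh_process(snippet_b))
```

Alternatively, each model can be served from its own vLLM server process and queried over HTTP, which is effectively what the "LLM gateway" projects mentioned above coordinate for you.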
-
I'm trying to run inference with two models in the same script using vLLM, but loading the second model fails with this error:
AssertionError: tensor model parallel group is already initialized.
Any help would be appreciated.