- airllm+petals
-
I have several macOS machines lying around.
With the layered model approach, it should be possible to preload different layers on different machines, which ought to give faster inference response times.
Are there already plans to implement this?
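A rough sketch of the idea, in case it helps: partition the model's layers into contiguous slices, preload each slice on a different machine, and pipe activations machine-to-machine so no layer is ever loaded from disk mid-inference. This is not AirLLM's actual API; the `Machine`/`partition` names are hypothetical, and each "machine" below is simulated in-process, where in practice each stage would be a separate macOS host passing activations over the network.

```python
from dataclasses import dataclass
from typing import Callable, List

Layer = Callable[[float], float]

@dataclass
class Machine:
    name: str
    layers: List[Layer]  # layers preloaded into this machine's memory

    def forward(self, x: float) -> float:
        for layer in self.layers:
            x = layer(x)
        return x

def partition(layers: List[Layer], hosts: List[str]) -> List[Machine]:
    """Assign contiguous slices of layers to hosts, as evenly as possible."""
    n, k = len(layers), len(hosts)
    machines, start = [], 0
    for i, host in enumerate(hosts):
        size = n // k + (1 if i < n % k else 0)
        machines.append(Machine(host, layers[start:start + size]))
        start += size
    return machines

def pipeline_forward(machines: List[Machine], x: float) -> float:
    # Activations flow machine-to-machine; every layer is already resident
    # in some machine's memory, which is where the latency win comes from.
    for m in machines:
        x = m.forward(x)
    return x

# Toy model: 6 "layers", each a simple affine function.
layers = [lambda x, i=i: 2 * x + i for i in range(6)]
machines = partition(layers, ["mac-mini-1", "mac-mini-2", "mac-studio"])
print([len(m.layers) for m in machines])   # two layers per machine
print(pipeline_forward(machines, 1.0))
```

The single-machine AirLLM approach pays a disk-load cost per layer per token; with the layers pinned in memory across hosts, that cost is replaced by one network hop per stage, which is typically much cheaper.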