Added local llm support to cypher-core. #2

Open
skillsharer wants to merge 7 commits into main

Conversation

skillsharer

Summary of Changes:

Added comprehensive support for Huggingface models, including server setup and model implementations. Currently, Qwen/Qwen2.5-7B-Instruct is supported. Further information is in src/huggingface/README.md.

Introduced new TypeScript adapters and clients for the Qwen model.
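For context, the client/adapter split looks roughly like this (a simplified sketch, not the actual PR code: the endpoint path, payload fields, and class names below are placeholders):

```typescript
// Minimal sketch of a local Huggingface server client plus a thin Qwen
// adapter. The '/generate' route and { text } response shape are assumptions.

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface LocalLLMAdapter {
  chat(messages: ChatMessage[]): Promise<string>;
}

class QwenAdapter implements LocalLLMAdapter {
  constructor(
    private baseUrl = 'http://localhost:8000', // hypothetical local server address
    private model = 'Qwen/Qwen2.5-7B-Instruct',
  ) {}

  async chat(messages: ChatMessage[]): Promise<string> {
    // POST the chat messages to the local Huggingface server.
    const res = await fetch(`${this.baseUrl}/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: this.model, messages }),
    });
    if (!res.ok) throw new Error(`Local LLM server error: ${res.status}`);
    const data = (await res.json()) as { text: string };
    return data.text;
  }
}
```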

@kingbootoshi
Owner

yo qwen is sick. does qwen need its own adapter? or can the adapter be hugging face in general?

just tryna figure out how the hugging face interface works

@skillsharer
Author

That was one of the questions on my mind as well. There are multiple factors we need to consider:

  • LLMs, especially local ones, respond in different ways because they have fewer parameters and are trained on less data, and sometimes on more specific data (e.g. see the difference between instruct-tuned and general LLMs).
  • Where do you want to handle the output? If you want to keep the output handling on the TypeScript side, it would be consistent with the API adapters, so multiple adapters are needed for the Huggingface LLMs as well (see the sketch after this list). If not, the output handling could go into the server side, where it would be consistent with those classes, but that can result in a big (probably hard-to-maintain) adapter later on if we use multiple LLMs.
  • We need to consider image and video input/output handling as well.
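To make the first option concrete, here is one way per-model output handling could stay on the TypeScript side without one huge adapter. This is only a sketch; the endpoint, payload shape, and the Qwen stop-token handling are illustrative assumptions, not existing cypher-core code:

```typescript
// A generic Huggingface adapter that takes a per-model output parser, so
// model-specific quirks stay in small functions instead of one big class.

type OutputParser = (raw: string) => string;

// Qwen's chat template ends turns with <|im_end|>; trim anything after it.
const qwenInstructParser: OutputParser = (raw) =>
  raw.split('<|im_end|>')[0].trim();

const defaultParser: OutputParser = (raw) => raw.trim();

class HuggingfaceAdapter {
  constructor(
    private model: string,
    private parse: OutputParser = defaultParser,
    private baseUrl = 'http://localhost:8000', // hypothetical local server
  ) {}

  async complete(prompt: string): Promise<string> {
    const res = await fetch(`${this.baseUrl}/generate`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: this.model, prompt }),
    });
    const { text } = (await res.json()) as { text: string };
    return this.parse(text); // model-specific cleanup happens on the TS side
  }
}

// Usage (illustrative):
// const qwen = new HuggingfaceAdapter('Qwen/Qwen2.5-7B-Instruct', qwenInstructParser);
```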

@kingbootoshi
Owner

https://huggingface.co/Qwen/QVQ-72B-Preview

would this support this reasoning model from qwen? it has vision tech. looks insane

we should handle output on the baseAgent typescript side so it's consistent

for LLM models that don't have image support, i wanted to route them through a fireworks vision model and just get the returned text added to the base agent chat history so it knows the context of the image (explain this image in deep detail)
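roughly what i'm imagining (just a sketch, not cypher-core's actual API; describeImageWithFireworks and ChatHistory are made-up names):

```typescript
// If the active model can't take images, describe the image with a separate
// vision model and append the description to the agent's chat history.

interface ChatHistory {
  push(entry: { role: 'user' | 'assistant' | 'system'; content: string }): void;
}

// Hypothetical helper that calls a Fireworks-hosted vision model and returns
// a detailed text description of the image.
declare function describeImageWithFireworks(imageUrl: string): Promise<string>;

async function addImageContext(
  history: ChatHistory,
  imageUrl: string,
  modelSupportsImages: boolean,
): Promise<void> {
  if (modelSupportsImages) {
    // Vision-capable model: pass the image through directly (not shown here).
    return;
  }
  // Text-only model: get a description and put it in the history so the
  // agent knows the context of the image.
  const description = await describeImageWithFireworks(imageUrl);
  history.push({
    role: 'system',
    content: `Image context (described by a vision model): ${description}`,
  });
}
```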

@skillsharer
Author

Routing images is a great idea! I checked the 72B model. Naively, I hoped I could run it on my MacBook Pro, but despite the 64 GB of RAM and the M2 architecture, I was not able to run inference. I created a branch on my cypher-core fork where the server side is done for this model: https://github.com/skillsharer/cypher-core/tree/feature/integrate-qwen-vision-model
If you have enough resources, go and try it out! However, these models are quite large for everyday local usage. We could distribute the workload and decide what runs locally and when we should call APIs instead.
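A rough sketch of how that local-vs-API routing decision could look (the criteria, thresholds, and field names are purely illustrative assumptions):

```typescript
// Route each request either to a local model or a hosted API based on
// simple criteria: vision needs, whether a local model is loaded, and size.

type Route = 'local' | 'api';

interface RoutingInput {
  promptTokens: number;
  needsVision: boolean;
  localModelLoaded: boolean;
}

// Thresholds here are illustrative, not tuned values.
function chooseRoute(input: RoutingInput): Route {
  if (input.needsVision) return 'api';         // large vision models won't fit locally
  if (!input.localModelLoaded) return 'api';   // nothing running locally
  if (input.promptTokens > 4000) return 'api'; // long contexts go to the hosted model
  return 'local';
}
```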
