
setup llama_cpp for direct inference (support MacOS) #93

Merged · 1 commit merged into MeetKai:main on Jan 24, 2024

Conversation

@rgbkrk (Contributor) commented Jan 17, 2024

This sets up functionary on llama_cpp in a reusable way, at least for macOS (a rough sketch of the flow follows the list below).

  • The GGUF file for 7b-v2.2 is downloaded from HuggingFace and cached for repeated runs
  • Inference is set up to take messages and tools in the OpenAI format
  • A FunctionRegistry from ChatLab can accept any function with a valid docstring and type hints, keeping function declaration simple
  • An example request asks for the weather in multiple locations
  • The resulting tool calls are executed via chatlab and the results returned to the model for final inference
  • The model synthesizes information from all of the function calls into a final response for the user
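
Roughly, the flow looks like the sketch below. This is not the PR's actual script: the HuggingFace repo id, the GGUF filename, and the reliance on llama-cpp-python's built-in OpenAI-style tool-call parsing are assumptions for illustration, and in the real example ChatLab's FunctionRegistry generates the tool schema from the docstring/type hints and dispatches the calls.

```python
import json

from huggingface_hub import hf_hub_download  # caches the file for repeated runs
from llama_cpp import Llama

# 1. Download (and cache) the 7b-v2.2 GGUF weights from HuggingFace.
#    Repo id and filename are assumptions; check the example script for the real ones.
model_path = hf_hub_download(
    repo_id="meetkai/functionary-7b-v2.2-GGUF",
    filename="functionary-7b-v2.2.q4_0.gguf",
)
llm = Llama(model_path=model_path, n_ctx=4096, n_gpu_layers=-1)  # Metal offload on macOS

# 2. Declare a tool in the OpenAI format. In the PR, ChatLab's FunctionRegistry
#    builds this schema from the function's docstring and type hints.
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return json.dumps({"location": location, "temperature_f": 72})

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Brooklyn and in Oslo?"}]

# 3. First pass: the model emits one tool call per requested location.
response = llm.create_chat_completion(messages=messages, tools=tools, tool_choice="auto")
assistant_message = response["choices"][0]["message"]
tool_calls = assistant_message.get("tool_calls") or []

# 4. Execute each call and append the results as tool messages
#    (in the PR, chatlab handles this dispatch).
messages.append(assistant_message)
for call in tool_calls:
    args = json.loads(call["function"]["arguments"])
    messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": get_weather(**args),
    })

# 5. Second pass: the model synthesizes the tool results into a final response.
final = llm.create_chat_completion(messages=messages, tools=tools)
print(final["choices"][0]["message"]["content"])
```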

[screenshot: example run showing the multi-location weather tool calls and final response]

A few liberties I took in my local setup:

  • My local environment uses Pydantic v2, as that's what is baked into ChatLab. I don't see anything in the current functionary code requiring it to stay on v1. I'd love to update that in requirements.txt.
  • termcolor is required for this example script, as is chatlab 1.3.0. These could be added to requirements.txt.

@musabgultekin (Contributor) commented Jan 17, 2024

Thank you for the PR!

It looks good to me; maybe we can move this example to a top-level examples folder.

@musabgultekin (Contributor) commented

Also, I think there was a small problem with pydantic v2. I cannot remember it exactly; @khai-meetkai mentioned it earlier.

@rgbkrk changed the title from "setup llama_cpp server for use on MacOS" to "setup llama_cpp for direct inference on MacOS" on Jan 17, 2024
@rgbkrk (Contributor, Author) commented Jan 17, 2024

> Thank you for the PR!
>
> It looks good to me; maybe we can move this example to a top-level examples folder.

I'm happy to make this into a server as well since you've got all the definitions here. Relatedly, do you plan to release the functionary Python code as a PyPI package so that it can be used directly without cloning the repository?

Have you all considered adding this model to Ollama? I'd love to get it in more people's hands.

@rgbkrk (Contributor, Author) commented Jan 17, 2024

In order to add this example to an examples folder, I need it to be able to import functionary. I made a separate PR for the packaging aspect: #95

@rgbkrk changed the title from "setup llama_cpp for direct inference on MacOS" to "setup llama_cpp for direct inference (support MacOS)" on Jan 19, 2024
@rgbkrk (Contributor, Author) commented Jan 19, 2024

Verifying this works with the v2.2 model now.

@shreyaskarnik commented

This is awesome, @rgbkrk! It unblocks a lot of macOS users. An addition to Ollama would be amazing!

@shreyaskarnik commented

Following this example, I was able to demonstrate how to use LangChain tools with functionary: https://gist.github.com/shreyaskarnik/2cc099528f14671b096570498330ae54. There may be a better way to handle the actual function execution, and I am open to learning the right way to do so.
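
For reference, one simple way to handle the execution side is a small dispatch table that maps tool-call names to LangChain tools. This is only a sketch, not the gist's actual code; the `get_weather` tool and `run_tool_call` helper are illustrative.

```python
import json

from langchain_core.tools import tool


@tool
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return json.dumps({"location": location, "temperature_f": 72})


# Map tool names to LangChain tools so the model's tool calls can be dispatched.
TOOLS = {t.name: t for t in [get_weather]}


def run_tool_call(call: dict) -> dict:
    """Execute one OpenAI-format tool call and build the corresponding tool message."""
    fn = TOOLS[call["function"]["name"]]
    args = json.loads(call["function"]["arguments"])
    return {
        "role": "tool",
        "tool_call_id": call["id"],
        "content": str(fn.invoke(args)),
    }
```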

@rgbkrk force-pushed the llama_cpp_chatlab branch from f620871 to 736dcd5 on January 22, 2024 at 18:56
@musab-mk merged commit c23a1a9 into MeetKai:main on Jan 24, 2024
3 checks passed
@rgbkrk deleted the llama_cpp_chatlab branch on January 24, 2024 at 06:47