
Self-suggesting chat model language and embedding model depending on machine #162

Open
fdominguezr opened this issue Sep 1, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@fdominguezr

Feature Area

Models' selection

Painpoint

I struggled and nearly dropped Smart2Brain because my PC refused to index, or indexing took too long.
After several searches on the Internet, checking the documentation and FAQ, trial and error, and dumping notes, I found a combination of chat model and embedding model that works "fast and accurately enough" for my machine and notes.
I don't believe or expect most users will endure all this pain; they will simply turn off the plugin and look for another solution, which is a pity at this point.

Describe your idea

The ideal solution should be:

  1. The user opens the tool with a "test vault".
  2. Smart2Brain then runs a series of tests combining chat models and embedding models to find out which combination gives the most accurate and fastest performance.
  3. Because the "test vault" is a known reference, its baseline speeds and answers are known too, so it's easy to measure how close and how fast each tested combination is.
  4. Smart2Brain then determines the optimal configuration for the current machine, and the user can load their own vault(s).
  5. Additionally, Smart2Brain can show a message indicating that the current PC configuration may produce results that are too slow and that a more powerful machine would be needed. Links to FAQs explaining how some people upgrade their PCs with external GPUs (eGPUs) would be useful to give some hope, with a ballpark price of $500-$800 for a "good enough" solution (I investigated that route too).
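The selection logic in the steps above could be sketched as follows. This is a minimal, hypothetical sketch: the model names, the similarity scores, the 0.8 accuracy threshold, and the `ComboResult` shape are all illustrative assumptions, not actual Smart2Brain code or defaults.

```typescript
// Sketch of the auto-benchmark idea: each combo is run against the test
// vault, producing an answer-similarity score (vs. the known baseline
// answers) and an elapsed time; we then pick the fastest combo that is
// "accurate enough".

interface ComboResult {
  chatModel: string;
  embedModel: string;
  similarity: number; // 0..1 similarity to the baseline answers (hypothetical metric)
  seconds: number;    // wall-clock time of the test run
}

// Hypothetical policy: require a minimum accuracy, then prefer speed.
// Returning null means no combo was accurate enough -> the machine is
// probably too weak, which triggers the "too slow" warning in step 5.
function pickBestCombo(results: ComboResult[], minSimilarity = 0.8): ComboResult | null {
  const accurate = results.filter(r => r.similarity >= minSimilarity);
  if (accurate.length === 0) return null;
  return accurate.reduce((best, r) => (r.seconds < best.seconds ? r : best));
}

// Mock results standing in for real test-vault runs (illustrative numbers).
const mock: ComboResult[] = [
  { chatModel: "llama3:8b", embedModel: "nomic-embed-text", similarity: 0.91, seconds: 42 },
  { chatModel: "llama3:8b", embedModel: "mxbai-embed-large", similarity: 0.93, seconds: 75 },
  { chatModel: "phi3:mini", embedModel: "nomic-embed-text", similarity: 0.72, seconds: 20 },
];

const best = pickBestCombo(mock);
console.log(best?.chatModel, best?.embedModel);
```

With the mock data above, the fastest combo meeting the accuracy bar is picked even though a slower combo scores slightly higher, which matches the "fast and accurate enough" goal described in the pain point.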

Alternatives

An intermediate (temporary) solution could be:

  1. The Smart2Brain team asks the community which combinations of chat model and embedding model they are using on their machines, also indicating (anonymously) the main features of those machines: CPU, main RAM, GPU and video RAM (there is software around that can capture this from the computer automatically; I don't know whether it's possible to integrate or suggest it).
  2. With that community information, the Smart2Brain team builds a solution that detects (or asks the user for) the computer's features and proposes the optimal combination(s) of chat model and embedding model.
    (I can help you with this if you want, but I think you're more than capable of doing it.)
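The community-data alternative could boil down to a small spec-to-recommendation lookup, sketched below. Everything here is an assumption for illustration: the spec fields, the RAM/VRAM thresholds, and the model names are placeholders, not real community data.

```typescript
// Sketch of the community-sourced alternative: an anonymised table mapping
// machine specs to known-good model combos, ordered from most to least
// demanding, queried with the user's detected (or self-reported) specs.

interface MachineSpec { ramGb: number; vramGb: number; }
interface Recommendation { chatModel: string; embedModel: string; }

// Hypothetical tiers built from community reports (illustrative values).
const tiers: { minRamGb: number; minVramGb: number; rec: Recommendation }[] = [
  { minRamGb: 32, minVramGb: 12, rec: { chatModel: "llama3:70b", embedModel: "mxbai-embed-large" } },
  { minRamGb: 16, minVramGb: 6,  rec: { chatModel: "llama3:8b",  embedModel: "nomic-embed-text" } },
  { minRamGb: 8,  minVramGb: 0,  rec: { chatModel: "phi3:mini",  embedModel: "nomic-embed-text" } },
];

// First tier the machine satisfies wins; null means the machine is below
// every tier, which would trigger the "consider an eGPU upgrade" message.
function recommend(spec: MachineSpec): Recommendation | null {
  const tier = tiers.find(t => spec.ramGb >= t.minRamGb && spec.vramGb >= t.minVramGb);
  return tier ? tier.rec : null;
}

console.log(recommend({ ramGb: 16, vramGb: 8 }));
```

A 16 GB RAM / 8 GB VRAM machine falls into the middle tier here; a 4 GB machine matches nothing and would get the "machine may be too weak" warning instead of a recommendation.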

Additional Context

I sincerely think Smart2Brain is a very useful tool in many contexts, but usable speed is paramount: think about Google and its quest for webpage load speed, or how YouTube videos evolved into Shorts; attention spans keep getting shorter.
So, helping the user (who shouldn't need to be an IT/AI expert) find their optimal configuration is necessary to improve adoption, usability and long-term success.

@fdominguezr fdominguezr added the enhancement New feature or request label Sep 1, 2024