08 Dec 16:27

BBC-Esq

42c962f

v6.11.0 - bug fixes (Latest)

Added/Removed Embedding Models

Added sentence-t5-xxl
- Massive model specifically geared towards finding sentences as close as possible to the sentence you pose in your query.

Documentation Scraper

Better colors.

Bug Fixes

Fixed a huge bug with chat models that prevented them from working at all.
Fixed all sentence-t5 models using far too much memory.

Misc.

Improved the "chunks only" functionality and memory management in general.
Improved the layout of "chunks" returned when "chunks only" is selected.
Upgraded lots of dependencies.
Adjusted batch sizes for embedding models.

Upgrading from v6.9.x (will not work with prior versions):

To upgrade from a prior version without losing your databases, downloaded models, etc., do the following:

Download the source code for this release.
Take all files ending in .py and copy them, overwriting your pre-existing files. MAKE SURE and keep your current config.yaml.
Copy any files in the Assets folder and replace any ones in your current Assets folder.
Do the same for any files in the CSS folder.
Activate your virtual environment and run the following command to uninstall any & all dependencies.
- pip freeze > requirements_uninstall.txt && pip uninstall -r requirements_uninstall.txt -y && del requirements_uninstall.txt
- run python setup_windows.py
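
The file-copying steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `copy_release_files` is hypothetical, and the Assets and CSS folders are treated as flat (only files directly inside them are copied). It does not perform the dependency uninstall or the `setup_windows.py` step.

```python
import shutil
from pathlib import Path
from typing import List


def copy_release_files(release_dir: Path, install_dir: Path) -> List[Path]:
    """Copy .py files plus Assets/CSS files from a release into an install.

    Hypothetical helper. config.yaml is never touched because only *.py
    files and the contents of Assets and CSS are copied.
    """
    written = []

    # Top-level .py files overwrite the pre-existing ones.
    for src in sorted(release_dir.glob("*.py")):
        dst = install_dir / src.name
        shutil.copy2(src, dst)
        written.append(dst)

    # Files in Assets and CSS replace their counterparts.
    for folder in ("Assets", "CSS"):
        src_folder = release_dir / folder
        if not src_folder.is_dir():
            continue
        dst_folder = install_dir / folder
        dst_folder.mkdir(parents=True, exist_ok=True)
        for src in sorted(src_folder.iterdir()):
            if src.is_file():
                dst = dst_folder / src.name
                shutil.copy2(src, dst)
                written.append(dst)

    return written
```

Because the glob only matches `*.py`, an existing config.yaml in the install folder is left as it is even if the release ships its own copy.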


30 Nov 18:53

BBC-Esq

v6.10.1

a268e6f

v6.10.1 - Ovis & Mississippi

Added/Removed Vision Models

Added Mississippi - 2b
- This is an exciting new vision model that's 95% the quality of the larger models at 3-4x the speed.
- Relies on the InternViT-300M-448px vision tower.
Added Ovis 1.6 Llama3.2 - 3b model
- Another high quality model that's faster than the larger models with arguably the same quality.
- Relies on the siglip-so400m-patch14-384 vision tower.

Documentation Scraper

Now using watchdog for more accurate updates on the number of pages scraped.

Misc.

Started using the great ruff library to fix my code.

Upgrading from v6.9.x (will not work with prior versions):

To upgrade from a prior version without losing your databases, downloaded models, etc., do the following:

Download the source code for this release.
Take all files ending in .py and copy them, overwriting your pre-existing files. MAKE SURE and keep your current config.yaml.
Copy any files in the Assets folder and replace any ones in your current Assets folder.
Do the same for any files in the CSS folder.
Activate your virtual environment and run the following command to uninstall any & all dependencies.
- pip freeze > requirements_uninstall.txt && pip uninstall -r requirements_uninstall.txt -y && del requirements_uninstall.txt
- run python setup_windows.py


25 Nov 21:54

BBC-Esq

v6.10.0

594dfb1

v6.10.0 - So much time wasted!

Upgrading from v6.9.x

To upgrade from a prior version without losing your databases, downloaded models, etc., do the following:

Download the source code for this release.
Take all files ending in .py and copy them, overwriting your pre-existing files. MAKE SURE and keep your current config.yaml.
Copy any files in the Assets folder and replace any ones in your current Assets folder.
Do the same for any files in the CSS folder.
Activate your virtual environment and run the following command to uninstall any & all dependencies.
- pip freeze > requirements_uninstall.txt && pip uninstall -r requirements_uninstall.txt -y && del requirements_uninstall.txt
- run python setup_windows.py

Added/Removed Chat Models

Added Marco-o1 - 7b
- This is a superb model that performs chain-of-thought. It is slower than other models (thinking behind the scenes), but produces extremely accurate results even with long contexts.
Added Qwen 2.5 Coder - 3b
Added Qwen 2.5 Coder - 14b
Added Qwen 2.5 - 32b
Added Qwen 2.5 Coder - 32b
Removed Internlm2_5 - 1.8b
Removed Yi Coder - 9b
Removed Internlm2_5 - 7b
Removed DeepSeek Coder v2 - 16b
- Eclipsed by the better and faster Qwen 2.5 Coder - 14b model.
Removed Internlm2_5 - 20b
- Eclipsed by Qwen 2.5 - 14b and/or Qwen 2.5 - 32b.

Added/Removed Vision Models

Added THUDM glm4v - 9b
Added Molmo-D-0924 - 8b

Scrape Python Library Documentation

Added multiple new libraries and newer versions of existing libraries to scrape.
Improved .html formatting of scraped sites to feed the vector database more relevant information.
Increased the speed and reliability of scraping documentation by using lxml within beautifulsoup4 and other alterations.
Created module_scraper.py to handle scraping logic while gui_tabs_tools_scrape.py still provides the GUI.

Ask Jeeves Improvements

Added more error handling and server connection indicators as well as better subprocess handling.

Other Improvements

Updated dependencies.
Refactored document_processor.py to accommodate newer dependencies and load .html files faster.
Additional file filters to reduce download amounts when downloading models.
Added a button to display vision model bar charts.


27 Oct 15:54

BBC-Esq

v6.9.2

7ee41ff

v6.9.2 - Welcome Kobold!

Patch 6.9.2 Notes

In between major updates I'll simply paste below the major update notes so it's more convenient, and then include specific notes for minor updates.

Added MiniCPM3 - 4b chat model
- Very good at single-factoid retrieval, even from many contexts, but DO NOT use it to retrieve multiple factoids from the contexts because it will ramble.
Robust validation of settings entered.
Use QThread with the metrics bar for smoother GUI operation.
Add page numbers when contexts that originate from a .pdf are returned.
Return relevance scores for all citations, which helps users determine which similarity setting to use.

Patch 6.9.1 Notes

Added Qwen 2.5 - 32b chat model.
Add sparkgraphs for metrics and the ability to right-click on the metrics bar and select a different visualization.

Welcome Kobold edition v6.9.0

Ask Jeeves!

Exciting new "Ask Jeeves" helper who answers questions about how to use the program. Simply click "Jeeves" in the upper left.
"Jeeves" gets his knowledge from a vector database that comes shipped with this release! NO MORE USER GUIDE TAB - just ASK JEEVES!
- IMPORTANT: After running setup_windows.py you must go into the Assets folder, right-click on koboldcpp_nocuda.exe, and check the "Unblock" checkbox first! If it's not there, try starting Jeeves and see if it works. Create a Github Issue if it doesn't work because Ask Jeeves is a new feature.
- IMPORTANT: You may also need to disable or make an exception for any firewall you have. Submit a Github Issue if you encounter any problems.

Scrape Python Library Documentation

In the Tools Tab, simply select a python library, click Scrape, and all the .html files will be downloaded to the Scraped_Documentation folder.
Create a vector database out of all of the .html files for a given library, then use one of the coding specific models to answer questions!

Huggingface Access Token

You can now enter an "access token" and access models that are "gated" on huggingface. Currently, llama 3.2 - 3b and mistral-small - 22b are the only gated models.
Ask Jeeves how to get a huggingface access token.

Other Improvements

The vector models are now downloaded using the snapshot_download functionality from huggingface_hub, which can exclude unnecessary files such as onnx, .bin (when an equivalent .safetensors version is available), and others. This significantly reduces the amount of data that this program downloads and therefore increases speed and usability.
This speedup should pertain to vector, chat, and whisper models, and implementing the snapshot_download for TTS models is planned.
New Compare GPUs button in the Tools Tab, which displays metrics for various GPUs so you can better determine your settings. Charts and graphs for chat/vision models will be added in the near future.
New metrics bar with speedometer-looking widgets.
Removed the User Guide Tab altogether to free up space. You can now simply Ask Jeeves instead.
Lots and lots of refactoring to improve various things...
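
The download filtering described above can be sketched as follows. This is a hedged illustration, not the program's actual code: the helper names `build_ignore_patterns` and `download_model` and the exact patterns are assumptions. `snapshot_download` and `list_repo_files` are real `huggingface_hub` functions, and `ignore_patterns` takes glob-style patterns.

```python
from typing import List, Optional


def build_ignore_patterns(repo_files: List[str]) -> List[str]:
    """Return glob patterns for files that need not be downloaded."""
    patterns = ["*.onnx"]  # ONNX exports (illustrative pattern)
    # Skip .bin weights only when a .safetensors file is present.
    if any(name.endswith(".safetensors") for name in repo_files):
        patterns.append("*.bin")
    return patterns


def download_model(repo_id: str, local_dir: str, token: Optional[str] = None) -> str:
    """Download a model repository while skipping redundant files."""
    # Imported here so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import list_repo_files, snapshot_download

    repo_files = list_repo_files(repo_id, token=token)
    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        ignore_patterns=build_ignore_patterns(repo_files),
        token=token,  # only needed for gated repositories
    )
```

Checking the file listing first means .bin weights are skipped only when a .safetensors file exists, so a repository that ships only .bin files still downloads something usable.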

Added/Removed Chat Models

Added Qwen 2.5 - 1.5b, Llama 3.2 - 3b, Internlm 2.5 - 1.8b, Dolphin-Llama 3.1 - 8b, Mistral-Small - 22b.
Removed Longwriter Llama 3.1 - 8b, Longwriter GLM4 - 9b, Yi - 9b, Solar Pro Preview - 22.1b.

Added/Removed Vision Models

Removed Llava 1.5, Bakllava, Falcon-vlm - 11b, and Phi-3-Vision models as either under-performing or eclipsed by pre-existing models that have additional benefits.

Roadmap

Add Kobold as a backend in addition to LM Studio and Local Models, at which point I'll probably have to rename this github repo.
Add OpenAI backend.
Remove LM Studio Server settings and revise instructions since LM Studio has changed significantly since they were last done.

Full Changelog: v6.8.2...v6.9.0


21 Oct 21:12

BBC-Esq

v6.9.1

f1f0b53

v6.9.1 - Welcome Kobold!

Patch 6.9.1 Notes

In between major updates I'll simply paste below the major update notes so it's more convenient, and then include specific notes for minor updates.

Added Qwen 2.5 - 32b chat model.
Add sparkgraphs for metrics and the ability to right-click on the metrics bar and select a different visualization.

Welcome Kobold edition v6.9.0

Ask Jeeves!

Exciting new "Ask Jeeves" helper who answers questions about how to use the program. Simply click "Jeeves" in the upper left.
"Jeeves" gets his knowledge from a vector database that comes shipped with this release! NO MORE USER GUIDE TAB - just ASK JEEVES!
- IMPORTANT: After running setup_windows.py you must go into the Assets folder, right-click on koboldcpp_nocuda.exe, and check the "Unblock" checkbox first! If it's not there, try starting Jeeves and see if it works. Create a Github Issue if it doesn't work because Ask Jeeves is a new feature.
- IMPORTANT: You may also need to disable or make an exception for any firewall you have. Submit a Github Issue if you encounter any problems.

Scrape Python Library Documentation

In the Tools Tab, simply select a python library, click Scrape, and all the .html files will be downloaded to the Scraped_Documentation folder.
Create a vector database out of all of the .html files for a given library, then use one of the coding specific models to answer questions!

Huggingface Access Token

You can now enter an "access token" and access models that are "gated" on huggingface. Currently, llama 3.2 - 3b and mistral-small - 22b are the only gated models.
Ask Jeeves how to get a huggingface access token.

Other Improvements

The vector models are now downloaded using the snapshot_download functionality from huggingface_hub, which can exclude unnecessary files such as onnx, .bin (when an equivalent .safetensors version is available), and others. This significantly reduces the amount of data that this program downloads and therefore increases speed and usability.
This speedup should pertain to vector, chat, and whisper models, and implementing the snapshot_download for TTS models is planned.
New Compare GPUs button in the Tools Tab, which displays metrics for various GPUs so you can better determine your settings. Charts and graphs for chat/vision models will be added in the near future.
New metrics bar with speedometer-looking widgets.
Removed the User Guide Tab altogether to free up space. You can now simply Ask Jeeves instead.
Lots and lots of refactoring to improve various things...

Added/Removed Chat Models

Added Qwen 2.5 - 1.5b, Llama 3.2 - 3b, Internlm 2.5 - 1.8b, Dolphin-Llama 3.1 - 8b, Mistral-Small - 22b.
Removed Longwriter Llama 3.1 - 8b, Longwriter GLM4 - 9b, Yi - 9b, Solar Pro Preview - 22.1b.

Added/Removed Vision Models

Removed Llava 1.5, Bakllava, Falcon-vlm - 11b, and Phi-3-Vision models as either under-performing or eclipsed by pre-existing models that have additional benefits.

Roadmap

Add Kobold as a backend in addition to LM Studio and Local Models, at which point I'll probably have to rename this github repo.
Add OpenAI backend.
Remove LM Studio Server settings and revise instructions since LM Studio has changed significantly since they were last done.

Full Changelog: v6.8.2...v6.9.0


14 Oct 03:20

BBC-Esq

v6.9.0

a78b4f3

v6.9.0 - Welcome Kobold!!

Welcome Kobold edition

Ask Jeeves!

Exciting new "Ask Jeeves" helper who answers questions about how to use the program. Simply click "Jeeves" in the upper left.
"Jeeves" gets his knowledge from a vector database that comes shipped with this release! NO MORE USER GUIDE TAB - just ASK JEEVES!
- IMPORTANT: After running setup_windows.py you must go into the Assets folder, right-click on koboldcpp_nocuda.exe, and check the "Unblock" checkbox first! If it's not there, try starting Jeeves and see if it works. Create a Github Issue if it doesn't work because Ask Jeeves is a new feature.
- IMPORTANT: You may also need to disable or make an exception for any firewall you have. Submit a Github Issue if you encounter any problems.

Scrape Python Library Documentation

In the Tools Tab, simply select a python library, click Scrape, and all the .html files will be downloaded to the Scraped_Documentation folder.
Create a vector database out of all of the .html files for a given library, then use one of the coding specific models to answer questions!

Huggingface Access Token

You can now enter an "access token" and access models that are "gated" on huggingface. Currently, llama 3.2 - 3b and mistral-small - 22b are the only gated models.
Ask Jeeves how to get a huggingface access token.

Other Improvements

The vector models are now downloaded using the snapshot_download functionality from huggingface_hub, which can exclude unnecessary files such as onnx, .bin (when an equivalent .safetensors version is available), and others. This significantly reduces the amount of data that this program downloads and therefore increases speed and usability.
This speedup should pertain to vector, chat, and whisper models, and implementing the snapshot_download for TTS models is planned.
New Compare GPUs button in the Tools Tab, which displays metrics for various GPUs so you can better determine your settings. Charts and graphs for chat/vision models will be added in the near future.
New metrics bar with speedometer-looking widgets.
Removed the User Guide Tab altogether to free up space. You can now simply Ask Jeeves instead.
Lots and lots of refactoring to improve various things...

Added/Removed Chat Models

Added Qwen 2.5 - 1.5b, Llama 3.2 - 3b, Internlm 2.5 - 1.8b, Dolphin-Llama 3.1 - 8b, Mistral-Small - 22b.
Removed Longwriter Llama 3.1 - 8b, Longwriter GLM4 - 9b, Yi - 9b, Solar Pro Preview - 22.1b.

Added/Removed Vision Models

Removed Llava 1.5, Bakllava, Falcon-vlm - 11b, and Phi-3-Vision models as either under-performing or eclipsed by pre-existing models that have additional benefits.

Roadmap

Add Kobold as a backend in addition to LM Studio and Local Models, at which point I'll probably have to rename this github repo.
Add OpenAI backend.
Remove LM Studio Server settings and revise instructions since LM Studio has changed significantly since they were last done.

Full Changelog: v6.8.2...v6.9.0


14 Sep 20:09

BBC-Esq

v6.8.2

1b48c1b

v6.8.2 - quality focus

Due to the growing number of chat and vector models with larger contexts, this mini-release focused on extensive testing at longer contexts. From this point forward, 4k chat models will only be included if they're exceptional, and 8k++ models if they're high quality and/or offer unique characteristics (e.g. a focus on coding).

Added Models (all 8k++ context):

LongWriter Llama 3.1 - 8b
- Exceptional at long responses where an unusual number of contexts are thrown at it.
Yi - 9b
- Long context Yi 9b, replacing Dolphin-Yi 9b, which was underperforming at long context.
Solar Pro Preview - 22.1b
- Exceptional 4k model with a parent model that's 8k coming out in a few months; replaces Solar 10.7b.

Removed Models:

Danube 3 - 4b - underperforming at long context
Dolphin-Qwen 2 - 1.5b - underperforming at long context
Orca 2 - 7b - superseded
Neural-Chat - 7b - superseded
Dolphin-Llama 3.1 - 8b - superseded
Hermes-3-Llama-3.1 - 8b - superseded
Dolphin-Yi 1.5 - 9b - redundant
Dolphin-Qwen 2 - 7b - superseded
Dolphin-Phi 3 - Medium - too difficult to work with and superseded
Llama 2 - 13b - superseded
Dolphin-Mistral-Nemo - 12b - too difficult to work with and superseded
SOLAR - 10.7b - superseded

See Release 6.8 for full release notes, including how to upgrade old databases

Current Chat Models:

Current Vision Models

Current TTS Models

Does not include Google's TTS, which runs online and therefore uses no GPU or VRAM.


07 Sep 02:07

BBC-Esq

v6.8.1

853f448

v6.8.1 - coding models!

Fix

Added a single missing dependency from the last release.

See Release 6.8.0 for full notes, including how to update databases.


05 Sep 15:57

BBC-Esq

v6.8.0

d4528d6

v6.8.0 - coding models!

Breaking Changes

Within the Manage Databases tab, what's displayed is no longer derived from parsing multiple JSON files. Instead, sqlite3 is used for much faster response times.

As such, databases created prior to this release will not function properly. To migrate old databases rather than creating them anew, use the `create_sqlite3.py` attached to this release as follows:

Run the script and select the folder containing the old JSON files. The folder location is as follows:

Go into the Vector_DB folder.
Each folder within Vector_DB constitutes a database.
Within each folder constituting a database there is a folder named json.
You need to run the create_sqlite3.py script selecting each json folder for each of your database folders.
A new file named metadata.db should be created in each database folder.
It's now safe to remove the json folder altogether.
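
To illustrate the general idea of moving per-file JSON metadata into a single sqlite3 file, here is a minimal sketch. It is NOT the attached `create_sqlite3.py`: the function name, the `document_metadata` table, and its two columns are hypothetical, since the real schema is not described in these notes. Use the attached script for an actual migration.

```python
import json
import sqlite3
from pathlib import Path


def json_folder_to_sqlite(json_dir: Path) -> Path:
    """Load every *.json file in json_dir into metadata.db beside it.

    Hypothetical schema: one row per JSON file, storing the file name
    and the JSON content as text.
    """
    db_path = json_dir.parent / "metadata.db"
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS document_metadata ("
                "file_name TEXT PRIMARY KEY, payload TEXT NOT NULL)"
            )
            for json_file in sorted(json_dir.glob("*.json")):
                # Parse first so malformed files fail before anything is stored.
                record = json.loads(json_file.read_text(encoding="utf-8"))
                conn.execute(
                    "INSERT OR REPLACE INTO document_metadata VALUES (?, ?)",
                    (json_file.name, json.dumps(record)),
                )
    finally:
        conn.close()
    return db_path
```

A single database file lets the GUI answer with one query instead of opening and parsing many small files, which is the kind of latency gain described above.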

Update the backup folder.

The Vector_DB_Backup folder contains a mirror image of the Vector_DB - this is the "backup" folder.
Delete all the contents within the Vector_DB_Backup folder and copy all the contents of the Vector_DB folder into it.
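
The backup refresh described above can also be done with a few lines of Python. This is a sketch assuming both folders sit side by side in the program directory; the function name `refresh_backup` is hypothetical. It removes the backup folder entirely and recreates it, which has the same end result as deleting its contents and copying.

```python
import shutil
from pathlib import Path


def refresh_backup(vector_db: Path, backup: Path) -> None:
    """Make the backup folder an exact copy of the Vector_DB folder."""
    if backup.exists():
        shutil.rmtree(backup)  # discard the stale backup
    shutil.copytree(vector_db, backup)  # copy every database folder across
```

For example, `refresh_backup(Path("Vector_DB"), Path("Vector_DB_Backup"))` run from the program folder would mirror the migrated databases into the backup.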

Start the program and it should run as usual.

These steps are unnecessary if you're creating a vector database for the first time using Release v6.8.0.

New Chat Models for Coding Questions

Deepseek-Coder-V2 - 16b (best)
Yi-Coder - 9b (second best)
CodeQwen1.5 - 7b (third, but still good)

Benchmarks will be forthcoming, but their metrics are comparable to similarly-sized models.

Misc.

Now using a distil-whisper model for the voice recorder for an approximate 2x speedup.
Added buttons to back up all databases at once or restore all backups at once (Tools Tab).


26 Aug 15:01

BBC-Esq

v6.7.1

68c5578

v6.7.1 - patch update

Patch

Fixed Florence models not showing up as an option for vision models when a gpu was detected.

See v6.7.0 for all other release notes:

https://github.com/BBC-Esq/VectorDB-Plugin-for-LM-Studio/releases/tag/V6.7.0


Releases: BBC-Esq/VectorDB-Plugin-for-LM-Studio
