v6.7.0 - LONG CONTEXT no see!
General Updates
- CITATIONS! Responses from a Vector DB search now include hyperlinked citations to the source documents.
- Display of a chat model's max context and how many tokens you've used.
2X Speed Increase
Choose "half" in the database creation settings and the program will automatically select bfloat16 or float16 based on your GPU. This results in a 2x speed increase with extremely low loss in quality.
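The dtype selection described above can be sketched as follows. This is a minimal illustration, not the program's actual code: the function name and the compute-capability cutoff are assumptions (real implementations typically ask the GPU directly, e.g. via `torch.cuda.is_bf16_supported()`).

```python
def pick_half_dtype(compute_capability):
    """Pick a half-precision dtype based on the GPU's compute capability.

    bfloat16 keeps the same exponent range as float32, so it rarely
    overflows; NVIDIA GPUs support it natively from Ampere
    (compute capability 8.0) onward. Older GPUs fall back to float16.
    """
    major, _minor = compute_capability
    return "bfloat16" if major >= 8 else "float16"

# An RTX 3090 (8.6) gets bfloat16; a GTX 1080 (6.1) gets float16.
print(pick_half_dtype((8, 6)))
print(pick_half_dtype((6, 1)))
```

Either dtype halves the memory traffic of float32, which is where the speedup comes from; the quality difference between the two is mainly numerical stability, not accuracy.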
Chat Models
- Removed Internlm2_5 - 1.8b and Qwen 1.5 - 1.6b as underperforming.
- Removed Dolphin-Llama 3 - 8b and Internlm2 - 20b as superseded.
- Added Danube 3 - 4b with 8k context.
- Added Phi 3.5 Mini - 4b with 8k context.
- Added Hermes-3-Llama-3.1 - 8b with 8k context.
- Added Internlm2_5 - 20b with 8k context.
The following models now have 8192 context:
| Model Name | Parameters (billion) | Context Length |
|---|---|---|
| Danube 3 - 4b | 4 | 8192 |
| Dolphin-Qwen 2 - 1.5b | 1.5 | 8192 |
| Phi 3.5 Mini - 4b | 4 | 8192 |
| Internlm2_5 - 7b | 7 | 8192 |
| Dolphin-Llama 3.1 - 8b | 8 | 8192 |
| Hermes-3-Llama-3.1 - 8b | 8 | 8192 |
| Dolphin-Qwen 2 - 7b | 7 | 8192 |
| Dolphin-Mistral-Nemo - 12b | 12 | 8192 |
| Internlm2_5 - 20b | 20 | 8192 |
Text to Speech Models
- Excited to add additional models to choose from when using whisperspeech as the text to speech backend - see the chart below for the various s2a and t2s model combinations, their relative compute times, and real VRAM usage stats.