Releases: BBC-Esq/VectorDB-Plugin-for-LM-Studio
v4.0 - CUDA 12.1+ support!
NOTE:
This release is only for Windows. Linux and macOS users should continue to use v3.5.2 until I can get those versions up and running. Download the ZIP file from Release 3.5.2 and follow the instructions in the readme.md, INCLUDING the prerequisites, which are different from those for this release.
CUDA 12.1+ support finally enables flash attention 2 and other improvements. Those will be implemented in subsequent incremental releases. For this initial release, the following major improvements have been made:
The transcribe tool has had a major improvement due to CUDA 12.1+ being supported, which allowed switching from faster-whisper to the amazing new library (only ~75 stars) located here:
https://github.com/shashikg/WhisperS2T
In summary, this library enables "batch" processing of audio using ctranslate2 version 4.0, which supports CUDA 12.1+. Here is a comparison of transcribing a long audio file under Release 3.5.2 versus this release:
Release 3.5.2:
- large-v2 model, float16 = 10 minutes 1 second

This release:
- large-v2, float16, speed set at 50 = 54 seconds
- medium.en, float16, speed set at 75 = 32 seconds
- small.en, float16, speed set at 100 = 15 seconds!!!
This improvement cannot be overstated; batch processing is a feature that faster-whisper, while great, has been lacking for quite some time.
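For anyone curious what the batched workflow looks like, here is a minimal sketch based on the WhisperS2T project's README at the time of writing; the exact keyword arguments (backend, batch_size, etc.) may differ between versions, so treat this as illustrative rather than as this plugin's actual code.

```python
import whisper_s2t

# Load a Whisper model through the ctranslate2 backend (requires CUDA 12.1+ for best speed).
model = whisper_s2t.load_model(model_identifier="large-v2", backend="CTranslate2")

# Batched transcription: one entry per file in each of these lists.
files = ["long_audio.wav"]          # hypothetical input file
out = model.transcribe_with_vad(
    files,
    lang_codes=["en"],              # language per file
    tasks=["transcribe"],           # "transcribe" or "translate"
    initial_prompts=[None],         # optional prompt per file
    batch_size=32,                  # larger batches = faster on a big GPU
)

# out[0] is the list of segments for the first file; each segment is a dict.
print(out[0][0])
```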
KNOWN ISSUES:
- Bark models will still run but with errors printed to the command prompt. This will be fixed as flash attention 2 is implemented and the option to NOT use FA2 is made available, thus preventing the errors.
- The voice transcriber sometimes takes way longer than in release 3.5.2 and/or prints multiple transcriptions. This is due to issues with the faster-whisper library itself - not ctranslate2 - since the improved transcribe file tool relies on ctranslate2 already and works just fine. If it's not addressed in the future, this may require switching from faster-whisper to something else like WhisperS2T, but for shorter audio faster-whisper is probably just as good and I'd rather keep it if possible.
- The transcribe file tool no longer lets you choose a quantization or compute device (e.g. cuda or cpu). This was a deliberate choice in order to get initial CUDA 12+ support out as soon as possible. It'll be addressed in subsequent releases.
Please contact me if you want to help out or if you want faster releases for Linux and macOS, as I don't own those systems. I plan to restore support for Nvidia and AMD GPUs on Linux, just like before, as well as macOS support.
v3.5.2 - final CUDA 11.8 release
All subsequent releases will only support CUDA 12+ unless popular demand dictates otherwise.
v3.5 - revamp baby!
Fix a HUGE BUG preventing databases created with certain vector models from returning any results... apparently embeddings need to be "normalized" when using similarity search (see the sketch after this list)...
Transcribe file now adds the transcription into the DB, enabling metadata searching and filtering by document type!
Revamped GUI to afford tabs more space, including a Databases tab that will need it to create multiple databases in a subsequent release.
Revamp github instructions and support matrix.
Refactoring
ELIMINATE annoying "qpainter" bug, hopefully for good! Dirty little bug bastard!
Change the location to download bitsandbytes for Windows during the installation process, and make minor improvements to installation procedures.
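For context on the normalization fix above, here is a minimal numpy sketch of why it matters (toy values and helper name are hypothetical, not the plugin's actual code): once embeddings are L2-normalized, a dot-product similarity search is equivalent to cosine similarity, whereas unnormalized vectors let sheer magnitude distort the ranking.

```python
import numpy as np

def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot product == cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

# Toy embeddings: two documents and one query (hypothetical values).
docs = np.array([[0.2, 0.9, 0.1],
                 [4.0, 1.0, 0.5]])
query = np.array([[0.1, 0.8, 0.05]])

# Without normalization, the longer second vector wins even though it
# points in a different direction than the query.
print("raw dot products:       ", (docs @ query.T).ravel())

# With normalization, the ranking reflects the angle (true similarity),
# and the first document correctly comes out on top.
docs_n, query_n = l2_normalize(docs), l2_normalize(query)
print("normalized dot products:", (docs_n @ query_n.T).ravel())
```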
v3.4.5 - robust linux install
Expand the setup_linux.py script. If it doesn't work properly, Linux users can submit an "issue" and in the meantime use release 3.4.4 instead.
v3.4.4 - bug fixes
Refactor several scripts, offload verbose language to constants.py, and hopefully fix, once and for all, copying pdf.py to the appropriate langchain source code directory on Windows/Linux/macOS.
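For readers wondering what "copying pdf.py" involves, here is a rough sketch of the idea behind a replace_pdf.py-style helper (the source path, target sub-path, and function name are illustrative assumptions, not the repo's actual script): find where the langchain package is installed and overwrite its bundled pdf.py loader with the customized copy shipped with this plugin.

```python
import shutil
from pathlib import Path

import langchain  # locate the installed package rather than hard-coding site-packages


def replace_pdf_loader(custom_pdf: Path = Path("pdf.py")) -> None:
    """Copy the customized pdf.py over langchain's bundled PDF loader.

    The exact sub-path inside langchain is an assumption here; it differs
    between langchain versions, so the real script may target another file.
    """
    target = Path(langchain.__file__).parent / "document_loaders" / "pdf.py"
    if not target.exists():
        raise FileNotFoundError(f"Expected langchain loader not found: {target}")
    shutil.copy2(custom_pdf, target)
    print(f"Replaced {target} with {custom_pdf}")


if __name__ == "__main__":
    replace_pdf_loader()
```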
bitsandbytes-windows-0.41.2.post2-py3
bitsandbytes-win-0.41.2.post2-py3 - add missing torch import
v3.4.2 - linux improv
Add additional script to streamline linux installation.
Minor changes, such as revising requirements.txt to remove unnecessary entries that might have been causing problems depending on a user's setup.
Minor refactoring.
v3.4.1 - bugs/refactor
Requirements.txt was installing langchain on Linux when it should only be installed via replace_pdf.py, which ensures that the custom pdf.py replaces the one within the langchain source code (otherwise the pdf loader doesn't work).
Removed the bitsandbytes macOS install unless/until it's supported.
Removed Triton install from requirements.txt and put specific installation commands in github readme.
Refactored several scripts for manageability.
Revised setup.py to use tkinter for Windows users.
v3.4 - RIGHTEOUS SEARCH!
Finally caught a bug that for the last 6 months prevented the "similarity" search setting from working. Make sure to read the updated "tips" and "settings" portions of the user guide.
Refactored multiple scripts.
Specified utf-8 encoding throughout the program to fix bugs (see the brief example after this list).
Consolidated the vision model loaders into a new script named loader_images.py.
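As a quick illustration of the utf-8 point above (a generic example, not the plugin's actual code): Python's open() defaults to the operating system's locale encoding on Windows, so reading files written as UTF-8 can fail or garble text unless the encoding is stated explicitly.

```python
from pathlib import Path

path = Path("example_document.txt")  # hypothetical file for illustration

# Explicit encoding makes behavior identical on Windows, Linux, and macOS.
path.write_text("Résumé naïve café", encoding="utf-8")

with open(path, "r", encoding="utf-8") as f:
    print(f.read())
```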
v3.3.2 - bug fix and refactor
Fix bug when trying to search by document type.
Refactor three scripts into loader_images.py.