diff --git a/src/User_Manual/config.yaml b/src/User_Manual/config.yaml
index e2567823..9a3dba0a 100644
--- a/src/User_Manual/config.yaml
+++ b/src/User_Manual/config.yaml
@@ -25,10 +25,13 @@ AVAILABLE_MODELS:
- jinaai/jina-embedding-t-en-v1
- jinaai/jina-embeddings-v2-base-en
- jinaai/jina-embeddings-v2-small-en
-COMPUTE_DEVICE: cuda
-EMBEDDING_MODEL_NAME: null
-chunk_overlap: 250
-chunk_size: 1500
+COMPUTE_DEVICE: cpu
+EMBEDDING_MODEL_NAME:
+chunk_overlap: 200
+chunk_size: 600
+database:
+ contexts: 15
+ similarity: 0.9
embedding-models:
bge:
query_instruction: 'Represent this sentence for searching relevant passages:'
@@ -38,7 +41,7 @@ embedding-models:
server:
api_key: ''
connection_str: http://localhost:1234/v1
- model_max_tokens: -1
+ model_max_tokens: 512
model_temperature: 0.1
prefix: '[INST]'
suffix: '[/INST]'
@@ -48,3 +51,7 @@ styles:
frame: 'background-color: #161b22;'
input: 'background-color: #2e333b; color: light gray; font: 13pt "Segoe UI Historic";'
text: 'background-color: #092327; color: light gray; font: 12pt "Segoe UI Historic";'
+transcriber:
+ device: cpu
+ model: base.en
+ quant: float32
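
Note: as a convenience, here is a sketch of how the changed keys in config.yaml read once this patch is applied (values copied from the hunks above; unchanged keys omitted):

COMPUTE_DEVICE: cpu
EMBEDDING_MODEL_NAME:
chunk_overlap: 200
chunk_size: 600
database:
  contexts: 15     # chunks forwarded to the LLM per question
  similarity: 0.9  # minimum similarity score, from 0 to 1
server:
  model_max_tokens: 512
transcriber:
  device: cpu
  model: base.en
  quant: float32
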
diff --git a/src/User_Manual/settings.html b/src/User_Manual/settings.html
index 892130b0..a6af5cde 100644
--- a/src/User_Manual/settings.html
+++ b/src/User_Manual/settings.html
@@ -156,6 +156,16 @@
Chunk Overlap
it will automatically include, for example, the last 250 characters of the prior chunk. Feel free to experiment
with this setting as well to get the best results!
+ Database Settings
+ The Similarity setting determines how similar to your question the results from the database must be in
+ order for them to be sent to the LLM as "context." The closer the value is to 1, the more
+ similar a result must be, with a value of exactly 1 meaning a verbatim match to your query. It's generally
+ advised to leave this setting alone unless you notice that you're not getting a sufficient number of contexts.
+
+ The Contexts setting is more fun to play with. Here you can control the number of chunks that will be
+ forwarded to the LLM along with your question, for a response. HOWEVER, make sure to read my instructions above
+ about how to ensure that the LLM does not exceed its maximum context limit; otherwise, it'll give an error.
+
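+ For reference, here's roughly what this section of config.yaml looks like (the comments are just my annotations):
+
+ database:
+   contexts: 15     # number of chunks forwarded to the LLM
+   similarity: 0.9  # minimum similarity score, from 0 to 1
+
+ As rough arithmetic, 15 contexts of 600-character chunks is about 9,000 characters of context on top of your question,
+ so dial Contexts down if your LLM's context window is small.
+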
Break in Case of Emergency
All of the settings are kept in a config.yaml
file. If you accidentally change a setting you don't like or
its deleted or corrupted somehow, inside the "User Guide" folder I put a backup of the original file.
diff --git a/src/User_Manual/tips.html b/src/User_Manual/tips.html
index 71e86c73..4e3a18d4 100644
--- a/src/User_Manual/tips.html
+++ b/src/User_Manual/tips.html
@@ -164,10 +164,15 @@ Ask the Right Questions
"What is the statute of limitations for defamation?" versus "What is the statute of limitations for a defamation
action if the allegedly defamatory statement is in writing as opposed to verbal?" Experiment with how specific you are.
- Don't use multiple questions. For example, the results will be poor if you ask "What is the statute of limitations for a
+
+ My previous advice was to not ask multiple questions, but now that I've added an option to increase the number of
+ "contexts" from the database to the LLM, this advice is less stringent. I now encourage you to ask longer-winded
+ questions and even general descriptions of the types of information you're looking for (not strictly a question, that is). For reference, here
+ are my prior instructions:
+
+ Don't use multiple questions. For example, the results will be poor if you ask "What is the statute of limitations for a
defamation action?" AND "Can the statute of limitations tolled under certain circumstances?" at the same time. Instead,
reformulate your question into something like: "What is the statute of limitations for a defamation and can it be tolled
- under certain circumstances?" Again, just experiment and DO NOT assume that you must use a larger LLM or embedding model.
+ under certain circumstances?" Again, just experiment and DO NOT assume that you must use a larger LLM or embedding model.
Ensure Sufficient Context Length for the LLM
diff --git a/src/User_Manual/whisper_quants.html b/src/User_Manual/whisper_quants.html
index 4ee15e62..55190b63 100644
--- a/src/User_Manual/whisper_quants.html
+++ b/src/User_Manual/whisper_quants.html
@@ -103,6 +103,14 @@ Whisper Quants
+ As of Version 2.5
+
+ ALL transcriber settings have been moved to the GUI so they can be changed easily and on the fly. Therefore, the old
+ instructions about modifying scripts to change them no longer apply. ALSO, there's no need to worry about which quants
+ are available on your CPU/GPU, because the program automatically detects the compatible quants and only displays those!
+ I'm leaving the instructions below unchanged, however, to get this release out. You can still reference them to
+ understand what the different settings represent.
+
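+ For reference, the GUI choices are saved under the transcriber section of config.yaml, which by default looks roughly
+ like this (the comments are just my annotations):
+
+ transcriber:
+   device: cpu     # compute device used for transcription
+   model: base.en  # Whisper model size
+   quant: float32  # quantization format
+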
Changing Model Size and Quantization
The base.en
model in float32
format is used by default. To use a different model size,