diff --git a/src/User_Manual/config.yaml b/src/User_Manual/config.yaml index e2567823..9a3dba0a 100644 --- a/src/User_Manual/config.yaml +++ b/src/User_Manual/config.yaml @@ -25,10 +25,13 @@ AVAILABLE_MODELS: - jinaai/jina-embedding-t-en-v1 - jinaai/jina-embeddings-v2-base-en - jinaai/jina-embeddings-v2-small-en -COMPUTE_DEVICE: cuda -EMBEDDING_MODEL_NAME: null -chunk_overlap: 250 -chunk_size: 1500 +COMPUTE_DEVICE: cpu +EMBEDDING_MODEL_NAME: +chunk_overlap: 200 +chunk_size: 600 +database: + contexts: 15 + similarity: 0.9 embedding-models: bge: query_instruction: 'Represent this sentence for searching relevant passages:' @@ -38,7 +41,7 @@ embedding-models: server: api_key: '' connection_str: http://localhost:1234/v1 - model_max_tokens: -1 + model_max_tokens: 512 model_temperature: 0.1 prefix: '[INST]' suffix: '[/INST]' @@ -48,3 +51,7 @@ styles: frame: 'background-color: #161b22;' input: 'background-color: #2e333b; color: light gray; font: 13pt "Segoe UI Historic";' text: 'background-color: #092327; color: light gray; font: 12pt "Segoe UI Historic";' +transcriber: + device: cpu + model: base.en + quant: float32 diff --git a/src/User_Manual/settings.html b/src/User_Manual/settings.html index 892130b0..a6af5cde 100644 --- a/src/User_Manual/settings.html +++ b/src/User_Manual/settings.html @@ -156,6 +156,16 @@

Chunk Overlap

it will automatically include, for example, the last 250 characters of the prior chunk. Feel free to experiment with this setting as well to get the best results!
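As a rough illustration of how chunk size and overlap interact, here is a simplified character-based sketch. This is not the program's actual splitter (real splitters also respect sentence boundaries); the function name and logic are mine:

```python
def chunk_text(text, chunk_size=600, chunk_overlap=200):
    # Slide a window of chunk_size characters, stepping forward by
    # (chunk_size - chunk_overlap) so consecutive chunks share the overlap.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

# Each chunk begins with the last 200 characters of the chunk before it.
chunks = chunk_text("".join(str(i % 10) for i in range(1000)))
```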

+

Database Settings

+

The Similarity setting determines how similar to your question the results from the database must be in
  order for them to be sent to the LLM as "context." The closer the value is to 1, the more
  similar a result must be, with a value of 1 meaning a verbatim match to your query. It's generally advised to
  leave this setting alone unless you notice that you're not getting a sufficient number of contexts.
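Conceptually, the threshold works like this. This is a minimal sketch assuming cosine similarity over embedding vectors; the function names are illustrative, not the program's actual retrieval code:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: 1.0 means the vectors point in exactly the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def filter_contexts(query_vec, chunk_vecs, similarity=0.9, contexts=15):
    # Keep only chunks that clear the similarity threshold,
    # then return at most `contexts` of them, best match first.
    scored = [(cosine_similarity(query_vec, v), i)
              for i, v in enumerate(chunk_vecs)]
    passing = sorted((s for s in scored if s[0] >= similarity), reverse=True)
    return [i for _, i in passing[:contexts]]
```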

+ +

The Contexts setting is more fun to play with. Here you can control the number of chunks that will be
  forwarded to the LLM along with your question for a response. HOWEVER, make sure to read my instructions above about how
  to ensure that the LLM does not exceed its maximum context limit; otherwise, it'll give an error.
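You can sanity-check the context limit with some quick arithmetic based on the defaults in config.yaml (contexts: 15, chunk_size: 600). The ~4 characters-per-token ratio is a rough heuristic for English text, not an exact figure:

```python
def estimated_context_tokens(contexts=15, chunk_size=600, chars_per_token=4):
    # Total characters forwarded to the LLM, converted to a rough token count.
    return (contexts * chunk_size) // chars_per_token

print(estimated_context_tokens())  # 15 * 600 = 9000 chars, roughly 2250 tokens
```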

+

Break in Case of Emergency

All of the settings are kept in a config.yaml file. If you accidentally change a setting you don't like, or the file is deleted or corrupted somehow, I put a backup of the original file inside the "User Guide" folder.

diff --git a/src/User_Manual/tips.html b/src/User_Manual/tips.html index 71e86c73..4e3a18d4 100644 --- a/src/User_Manual/tips.html +++ b/src/User_Manual/tips.html @@ -164,10 +164,15 @@

Ask the Right Questions

"What is the statute of limitations for defamation?" versus "What is the statute of limitations for a defamation action if the allegedly defamatory statement is in writing as opposed to verbal?" Experiment with how specific you are.

-

Don't use multiple questions. For example, the results will be poor if you ask "What is the statute of limitations for a +

My previous advice was to not ask multiple questions, but now that I've added an option to increase the number of
  "contexts" sent from the database to the LLM, this rule is less stringent. I now encourage you to ask longer-winded questions and even
  general descriptions of the types of information you're looking for (not strictly questions, you see). For reference, here
  are my prior instructions:

+ +

Don't use multiple questions. For example, the results will be poor if you ask "What is the statute of limitations for a defamation action?" AND "Can the statute of limitations be tolled under certain circumstances?" at the same time. Instead, reformulate your question into something like: "What is the statute of limitations for a defamation action and can it be tolled - under certain circumstances?" Again, just experiment and DO NOT assume that you must use a larger LLM or embedding model.

+ under certain circumstances?" Again, just experiment and DO NOT assume that you must use a larger LLM or embedding model.

Ensure Sufficient Context Length for the LLM

diff --git a/src/User_Manual/whisper_quants.html b/src/User_Manual/whisper_quants.html index 4ee15e62..55190b63 100644 --- a/src/User_Manual/whisper_quants.html +++ b/src/User_Manual/whisper_quants.html @@ -103,6 +103,14 @@

Whisper Quants

+

As of Version 2.5

+ +

ALL transcriber settings have been moved to the GUI so they can be changed easily and dynamically. Therefore, the instructions
  pertaining to modifying scripts to change them no longer apply. ALSO, there's no need to worry about which quants are available
  on your CPU/GPU, because the program will automatically detect compatible quants and only display those!
  I'm leaving the instructions below unchanged, however, to get this release out. You can still reference them to
  understand what the different settings represent.
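For reference, the GUI-managed settings correspond to the new transcriber block in config.yaml (copied from the diff above):

```yaml
transcriber:
  device: cpu
  model: base.en
  quant: float32
```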

+

Changing Model Size and Quantization

The base.en model in float32 format is used by default. To use a different model size,