Merge pull request #4961 from oobabooga/dev
Merge dev branch
oobabooga authored Dec 17, 2023
2 parents 443be39 + f1f2c4c commit 7be0983
Showing 30 changed files with 928 additions and 230 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -15,7 +15,7 @@ Its goal is to become the [AUTOMATIC1111/stable-diffusion-webui](https://github.
* Dropdown menu for quickly switching between different models.
* Large number of extensions (built-in and user-contributed), including Coqui TTS for realistic voice outputs, Whisper STT for voice inputs, translation, [multimodal pipelines](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/multimodal), vector databases, Stable Diffusion integration, and a lot more. See [the wiki](https://github.com/oobabooga/text-generation-webui/wiki/07-%E2%80%90-Extensions) and [the extensions directory](https://github.com/oobabooga/text-generation-webui-extensions) for details.
* [Chat with custom characters](https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab#character).
* Precise chat templates for instruction-following models, including Llama-2-chat, Alpaca, Vicuna, Mistral, and many others.
* Precise chat templates for instruction-following models, including Llama-2-chat, Alpaca, Vicuna, Mistral.
* LoRA: train new LoRAs with your own data, load/unload LoRAs on the fly for generation.
* Transformers library integration: load models in 4-bit or 8-bit precision through bitsandbytes, use llama.cpp with transformers samplers (`llamacpp_HF` loader), CPU inference in 32-bit precision using PyTorch.
* OpenAI-compatible API server with Chat and Completions endpoints -- see the [examples](https://github.com/oobabooga/text-generation-webui/wiki/12-%E2%80%90-OpenAI-API#examples).
@@ -274,6 +274,7 @@ List of command-line flags
|`--cfg-cache` | ExLlama_HF: Create an additional cache for CFG negative prompts. Necessary to use CFG with that loader, but not necessary for CFG with base ExLlama. |
|`--no_flash_attn` | Force flash-attention to not be used. |
|`--cache_8bit` | Use 8-bit cache to save VRAM. |
|`--num_experts_per_token NUM_EXPERTS_PER_TOKEN` | Number of experts to use for generation. Applies to MoE models like Mixtral. |
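The new `--num_experts_per_token` flag controls how many experts a mixture-of-experts model such as Mixtral activates per token. As a rough illustration of the top-k routing idea behind it (a sketch only — the function and variable names are hypothetical, and this is not the webui's or Mixtral's actual code):

```python
import math

def route_tokens(gate_logits, num_experts_per_token=2):
    """Pick the top-k experts for one token and renormalize their gate
    weights -- the core idea behind Mixtral-style MoE routing.
    Illustrative sketch only, not text-generation-webui code."""
    # Softmax over the expert gate logits
    exps = [math.exp(x - max(gate_logits)) for x in gate_logits]
    probs = [e / sum(exps) for e in exps]
    # Keep only the k highest-scoring experts
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = top[:num_experts_per_token]
    # Renormalize so the kept experts' weights sum to 1
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}

weights = route_tokens([2.0, 0.5, 1.5, -1.0], num_experts_per_token=2)
print(sorted(weights))  # → [0, 2], the two highest-scoring experts
```

Raising the flag's value trades speed for quality by activating more experts per token; the diff below shows the value being passed through to the ExLlamaV2 config.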

#### AutoGPTQ

@@ -377,7 +378,7 @@ text-generation-webui
└── llama-2-13b-chat.Q4_K_M.gguf
```

* Other models (like 16-bit transformers models and GPTQ models) are made of several files and must be placed in a subfolder. Example:
* The remaining model types (like 16-bit transformers models and GPTQ models) are made of several files and must be placed in a subfolder. Example:

```
text-generation-webui
49 changes: 31 additions & 18 deletions css/html_instruct_style.css
@@ -1,10 +1,18 @@
.chat {
background: var(--block-background-fill);
padding: 24px 19px;
padding-right: 19px !important;
border: 1px solid var(--block-border-color);
border-radius: 8px;
}

.message {
display: grid;
grid-template-columns: 60px 1fr;
padding-bottom: 25px;
font-size: 15px;
font-family: 'Noto Sans', Helvetica, Arial, sans-serif;
line-height: 22px;
line-height: 24px;
}

.username {
@@ -13,11 +21,16 @@

.message-body p, .message-body li {
font-size: 15px !important;
line-height: 22.5px !important;
line-height: 24px !important;
list-style-position: outside;
}

.message-body p, .chat .message-body ul, .chat .message-body ol {
margin-bottom: 23.4375px !important;
margin-bottom: 16px !important;
}

.chat .message-body ul, .chat .message-body ol {
padding-inline-start: 2em;
}

.message-body p:last-child, .chat .message-body ul:last-child, .chat .message-body ol:last-child {
@@ -34,34 +47,34 @@

.gradio-container .chat .assistant-message {
padding: 20px;
border-radius: 20px;
background-color: #0000000f;
margin-top: 9px !important;
margin-bottom: 18px !important;
background: var(--background-fill-secondary);
margin-top: 12px !important;
margin-bottom: 24px !important;
margin-right: 16px;
border-radius: 22px;
border-bottom-left-radius: 0;
border: 1px solid var(--border-color-primary);
}

.gradio-container .chat .user-message {
padding: 20px;
background-color: var(--color-accent-soft);
border-radius: 20px;
margin-bottom: 9px !important;
margin-bottom: 12px !important;
margin-left: 16px;
border-radius: 22px;
border-bottom-right-radius: 0;
border: 1px solid var(--border-color-accent-subdued);
}

.gradio-container .chat .assistant-message:last-child, .gradio-container .chat .user-message:last-child {
margin-bottom: 0 !important;
}

.dark .chat .assistant-message {
background-color: #1f2937;
}

.dark .chat .user-message {
background-color: transparent;
}

code {
background-color: white !important;
background-color: #f3f4f6 !important;
}

.dark code {
background-color: #0e1321 !important;
background-color: #1f2937 !important;
}
2 changes: 1 addition & 1 deletion css/main.css
@@ -332,7 +332,7 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
margin-left: auto;
margin-right: auto;
max-width: 880px;
height: 100%;
min-height: var(--chat-height);
overflow-y: auto;
padding-right: 15px;
display: flex;
7 changes: 0 additions & 7 deletions grammars/japanese.gbnf

This file was deleted.

23 changes: 6 additions & 17 deletions grammars/json.gbnf
@@ -1,25 +1,14 @@
root ::= object
value ::= object | array | string | number | ("true" | "false" | "null") ws

object ::=
"{" ws (
string ":" ws value
("," ws string ":" ws value)*
)? "}" ws
object ::= "{" ws ( string ":" ws value ("," ws string ":" ws value)* )? "}"

value ::= object | array | string | number | ("true" | "false" | "null") ws

array ::=
"[" ws (
value
("," ws value)*
)? "]" ws
array ::= "[" ws ( value ("," ws value)* )? "]" ws

string ::=
"\"" (
[^"\\] |
"\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
)* "\"" ws
string ::= "\"" ( [a-zA-Z0-9] )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws

# Optional space: by convention, applied in this grammar after literal chars when allowed

ws ::= ([ \t\n] ws)?
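The simplified `string` rule above no longer handles escapes or arbitrary characters — it accepts only ASCII alphanumerics between quotes. A regex mirror of the new rule (a quick sanity-check sketch, not part of the commit):

```python
import re

# Regex equivalent of the simplified GBNF rule:
#   string ::= "\"" ( [a-zA-Z0-9] )* "\"" ws
STRING_RE = re.compile(r'"[a-zA-Z0-9]*"[ \t\n]*')

print(bool(STRING_RE.fullmatch('"hello42"')))     # True
print(bool(STRING_RE.fullmatch('"with space"')))  # False: spaces now rejected
print(bool(STRING_RE.fullmatch(r'"esc\n"')))      # False: escapes were dropped
```

Note this is a meaningful behavior change, not just a cleanup: strings containing spaces, punctuation, or escape sequences no longer conform to the grammar.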
34 changes: 0 additions & 34 deletions grammars/json_arr.gbnf

This file was deleted.

14 changes: 14 additions & 0 deletions grammars/json_w_trailing_space.gbnf
@@ -0,0 +1,14 @@
root ::= object

object ::= "{" ws ( string ":" ws value ("," ws string ":" ws value)* )? "}" ws

value ::= object | array | string | number | ("true" | "false" | "null") ws

array ::= "[" ws ( value ("," ws value)* )? "]" ws

string ::= "\"" ( [a-zA-Z0-9] )* "\"" ws

number ::= ("-"? ([0-9] | [1-9] [0-9]*)) ("." [0-9]+)? ([eE] [-+]? [0-9]+)? ws


ws ::= ([ \t\n] ws)?
6 changes: 2 additions & 4 deletions grammars/list.gbnf
@@ -1,4 +1,2 @@
root ::= item+

# Excludes various line break characters
item ::= "- " [^\r\n\x0b\x0c\x85\u2028\u2029]+ "\n"
root ::= "1. " paragraph "\n" ([0-9] [0-9]? ". " paragraph "\n")+
paragraph ::= [a-zA-Z'.,; ]+
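The rewritten `list.gbnf` constrains output to a numbered list with at least two items: an initial `1. ` item followed by one or more further items numbered with up to two digits. An equivalent regex sketch (illustrative, not part of the commit):

```python
import re

# Regex mirror of the new list.gbnf:
#   root ::= "1. " paragraph "\n" ([0-9] [0-9]? ". " paragraph "\n")+
#   paragraph ::= [a-zA-Z'.,; ]+
PARA = r"[a-zA-Z'.,; ]+"
LIST_RE = re.compile(rf"1\. {PARA}\n([0-9][0-9]?\. {PARA}\n)+")

print(bool(LIST_RE.fullmatch("1. apples\n2. pears\n")))  # True
print(bool(LIST_RE.fullmatch("1. apples\n")))            # False: a second item is required
```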
7 changes: 7 additions & 0 deletions grammars/simple_arithmetic.gbnf
@@ -0,0 +1,7 @@
root ::= (expr "=" ws term "\n")+
expr ::= term ([-+*/] term)*
term ::= num | "(" ws expr ")" ws
num ::= [0-9]+ ws
ws ::= [ \t\n]*
# this is a comment
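The new arithmetic grammar accepts lines of the form `expr = term`, checking syntax only. For the paren-free subset, an equivalent regex check (a sketch — the full grammar also permits nested `( expr )` terms, which a regex cannot express):

```python
import re

# Paren-free subset of simple_arithmetic.gbnf:
#   line ::= num ([-+*/] num)* "=" num "\n", with num ::= [0-9]+ and optional spaces
NUM = r"[0-9]+ *"
LINE_RE = re.compile(rf"{NUM}([-+*/] *{NUM})*= *{NUM}\n")

print(bool(LINE_RE.fullmatch("1+2*3 = 7\n")))  # True (syntax only; values are not verified)
print(bool(LINE_RE.fullmatch("1 + = 2\n")))    # False
```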

64 changes: 14 additions & 50 deletions js/main.js
@@ -123,6 +123,8 @@ targetElement.addEventListener("scroll", function() {
// Create a MutationObserver instance
const observer = new MutationObserver(function(mutations) {
mutations.forEach(function(mutation) {
updateChatHeight();

if(!isScrolled) {
targetElement.scrollTop = targetElement.scrollHeight;
}
@@ -153,56 +155,6 @@ const config = {
// Start observing the target element
observer.observe(targetElement, config);

//------------------------------------------------
// Notebook box scrolling
//------------------------------------------------
const notebookElement = document.querySelector("#textbox-notebook textarea");
let notebookScrolled = false;

notebookElement.addEventListener("scroll", function() {
let diff = notebookElement.scrollHeight - notebookElement.clientHeight;
if(Math.abs(notebookElement.scrollTop - diff) <= 10 || diff == 0) {
notebookScrolled = false;
} else {
notebookScrolled = true;
}
});

const notebookObserver = new MutationObserver(function(mutations) {
mutations.forEach(function(mutation) {
if(!notebookScrolled) {
notebookElement.scrollTop = notebookElement.scrollHeight;
}
});
});

notebookObserver.observe(notebookElement.parentNode.parentNode.parentNode, config);

//------------------------------------------------
// Default box scrolling
//------------------------------------------------
const defaultElement = document.querySelector("#textbox-default textarea");
let defaultScrolled = false;

defaultElement.addEventListener("scroll", function() {
let diff = defaultElement.scrollHeight - defaultElement.clientHeight;
if(Math.abs(defaultElement.scrollTop - diff) <= 10 || diff == 0) {
defaultScrolled = false;
} else {
defaultScrolled = true;
}
});

const defaultObserver = new MutationObserver(function(mutations) {
mutations.forEach(function(mutation) {
if(!defaultScrolled) {
defaultElement.scrollTop = defaultElement.scrollHeight;
}
});
});

defaultObserver.observe(defaultElement.parentNode.parentNode.parentNode, config);

//------------------------------------------------
// Add some scrollbars
//------------------------------------------------
@@ -373,3 +325,15 @@ function toggleBigPicture() {
}
}

//------------------------------------------------
// Define the --chat-height global CSS variable to
// the height of the chat parent
//------------------------------------------------
function updateChatHeight() {
const chatContainer = document.getElementById('chat').parentNode.parentNode.parentNode;
const newChatHeight = `${chatContainer.clientHeight}px`;

document.documentElement.style.setProperty('--chat-height', newChatHeight);
}

window.addEventListener('resize', updateChatHeight);
7 changes: 3 additions & 4 deletions modules/chat.py
@@ -210,10 +210,6 @@ def chatbot_wrapper(text, state, regenerate=False, _continue=False, loading_mess
output = copy.deepcopy(history)
output = apply_extensions('history', output)
state = apply_extensions('state', state)
if shared.model_name == 'None' or shared.model is None:
logger.error("No model is loaded! Select one in the Model tab.")
yield output
return

visible_text = None
stopping_strings = get_stopping_strings(state)
@@ -252,6 +248,9 @@ def chatbot_wrapper(text, state, regenerate=False, _continue=False, loading_mess
'internal': output['internal']
}

if shared.model_name == 'None' or shared.model is None:
raise ValueError("No model is loaded! Select one in the Model tab.")

# Generate the prompt
kwargs = {
'_continue': _continue,
1 change: 1 addition & 0 deletions modules/exllamav2.py
@@ -48,6 +48,7 @@ def from_pretrained(self, path_to_model):
config.scale_pos_emb = shared.args.compress_pos_emb
config.scale_alpha_value = shared.args.alpha_value
config.no_flash_attn = shared.args.no_flash_attn
config.num_experts_per_token = int(shared.args.num_experts_per_token)

model = ExLlamaV2(config)

1 change: 1 addition & 0 deletions modules/exllamav2_hf.py
@@ -165,5 +165,6 @@ def from_pretrained(cls, pretrained_model_name_or_path: Optional[Union[str, os.P
config.scale_pos_emb = shared.args.compress_pos_emb
config.scale_alpha_value = shared.args.alpha_value
config.no_flash_attn = shared.args.no_flash_attn
config.num_experts_per_token = int(shared.args.num_experts_per_token)

return Exllamav2HF(config)
33 changes: 0 additions & 33 deletions modules/grammar.py

This file was deleted.
