sync master #8

Merged: 1 commit merged on May 28, 2024
Commits on May 28, 2024

  1. llama : support small Granite models (ggerganov#7481)

    * Add optional MLP bias for Granite models

    Add optional MLP bias for ARCH_LLAMA to support Granite models.
    Partially addresses ggerganov/issues/7116.
    More changes are still needed to properly support Granite
    (an illustrative sketch of the optional-bias idea follows the commit listing).
    
    * llama: honor add_space_prefix from the model configuration

    Propagate the add_space_prefix setting from the HF model configuration
    to the GGUF file and honor it with the gpt2 tokenizer
    (see the second sketch after the commit listing).
    
    Signed-off-by: Giuseppe Scrivano <[email protected]>
    
    * llama: add support for small granite models

    This works only for the small models (3B and 8B).

    The convert-hf-to-gguf.py script uses the vocabulary size of the
    Granite models to detect Granite checkpoints and set the correct
    configuration (see the third sketch after the commit listing).
    
    Signed-off-by: Giuseppe Scrivano <[email protected]>
    
    ---------
    
    Signed-off-by: Giuseppe Scrivano <[email protected]>
    Co-authored-by: Steffen Roecker <[email protected]>
    giuseppe and sroecker authored May 28, 2024
    Commit 5442939
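
The commit's first change is conceptually simple: the LLaMA-family feed-forward block gains optional bias terms, loaded only when the checkpoint provides them. Below is a minimal PyTorch sketch of that idea, not the llama.cpp implementation itself (which is C++); the `mlp_bias` flag name mirrors the HF Granite configuration but is treated as an assumption here.

```python
import torch
import torch.nn as nn


class GraniteStyleMLP(nn.Module):
    """SwiGLU feed-forward block with optional biases.

    Standard LLaMA FFNs omit the bias terms; Granite models include them,
    so the bias tensors are only created (and loaded) when requested.
    """

    def __init__(self, hidden_size: int, intermediate_size: int, mlp_bias: bool = False):
        super().__init__()
        # `mlp_bias` mirrors the optional flag described in the commit;
        # the exact config field name is an assumption in this sketch.
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=mlp_bias)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=mlp_bias)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=mlp_bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: silu(gate(x)) * up(x), projected back down to hidden_size.
        return self.down_proj(nn.functional.silu(self.gate_proj(x)) * self.up_proj(x))
```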
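The add_space_prefix change carries a tokenizer setting from the Hugging Face side into the GGUF metadata so the gpt2 tokenizer can honor it at load time. The converter-side sketch below is hedged: reading the flag from tokenizer_config.json and the metadata key named in the comment are assumptions, not the project's confirmed behavior.

```python
import json
from pathlib import Path
from typing import Optional


def read_add_space_prefix(model_dir: Path) -> Optional[bool]:
    """Return the tokenizer's add_prefix_space flag from the HF model dir, if any.

    In the converter, this value would then be written into the GGUF metadata
    (e.g. under a key like "tokenizer.ggml.add_space_prefix"; the exact key
    name is an assumption here) so the runtime tokenizer can honor it.
    """
    cfg_path = model_dir / "tokenizer_config.json"
    if not cfg_path.is_file():
        return None
    with open(cfg_path, encoding="utf-8") as f:
        cfg = json.load(f)
    value = cfg.get("add_prefix_space")
    return bool(value) if value is not None else None
```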
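Finally, the converter distinguishes Granite checkpoints from plain LLaMA ones by their vocabulary size and then applies Granite-specific settings. A simplified sketch of that detection step follows; the vocabulary size and the override fields are placeholders rather than values taken from convert-hf-to-gguf.py.

```python
# Placeholder vocabulary sizes: the real converter compares against the sizes
# used by the Granite 3B/8B checkpoints, which are not reproduced here.
GRANITE_VOCAB_SIZES = {49152}  # hypothetical value for illustration


def is_granite(hparams: dict) -> bool:
    """Heuristically detect a Granite checkpoint from its hyperparameters."""
    return hparams.get("vocab_size") in GRANITE_VOCAB_SIZES


def apply_granite_overrides(hparams: dict) -> dict:
    """Return hyperparameters with Granite-specific settings applied."""
    overrides = dict(hparams)
    if is_granite(hparams):
        # Illustrative overrides only: the commit enables the optional MLP bias
        # and honors the tokenizer's space-prefix setting for Granite models.
        overrides["mlp_bias"] = True
        overrides["add_space_prefix"] = False
    return overrides
```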