
Error on 13B Model #8

Open

DARDORKE opened this issue Apr 9, 2023 · 4 comments

DARDORKE commented Apr 9, 2023

Hi!

I quantized the 13B model and got a 15.15 GB file, but I get an error when I try to load it with ./main:

main: seed = 1681070012
llama_model_load: loading model from '/content/vigogne/llama.cpp/models/13B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx = 2048
llama_model_load: n_embd = 5120
llama_model_load: n_mult = 256
llama_model_load: n_head = 40
llama_model_load: n_layer = 40
llama_model_load: n_rot = 128
llama_model_load: f16 = 2
llama_model_load: n_ff = 13824
llama_model_load: n_parts = 2
llama_model_load: type = 2
llama_model_load: ggml map size = 15517.64 MB
llama_model_load: ggml ctx size = 101.25 KB
llama_model_load: mem required = 17565.74 MB (+ 1608.00 MB per state)
llama_model_load: loading tensors from '/content/vigogne/llama.cpp/models/13B/ggml-model-q4_0.bin'
llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
llama_init_from_file: failed to load model
main: error: failed to load model '/content/vigogne/llama.cpp/models/13B/ggml-model-q4_0.bin'
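For context, the "wrong size" error means the number of bytes stored for a tensor in the file doesn't match what the loader computes from the hyperparameters printed above. Here is a minimal sketch of that expectation for tok_embeddings.weight (my own illustration, not code from llama.cpp, assuming the old ggml q4_0 layout of a 4-byte scale plus 16 bytes of packed nibbles per 32 weights):

```python
# Minimal sketch (not from llama.cpp): expected byte size of tok_embeddings.weight,
# assuming old ggml q4_0 blocks: 32 weights, 4-byte scale + 16 bytes of 4-bit quants.
n_vocab, n_embd = 32000, 5120                  # values printed by llama_model_load above
block_weights, block_bytes = 32, 4 + 16

n_elements = n_vocab * n_embd                  # 163,840,000 weights
expected_bytes = n_elements // block_weights * block_bytes
print(f"expected q4_0 size: {expected_bytes / 1e6:.1f} MB")   # ~102.4 MB
print(f"f16 size would be:  {n_elements * 2 / 1e6:.1f} MB")   # ~327.7 MB
# If the tensor actually stored in the file has a different size (for example because
# it was written in another format), the loader reports "has wrong size in model file".
```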

I think the problem comes from the tokenizer.model file. Where can I find the files corresponding to the 13B model?

Thank you!

@cmhamiche

Same here, I was searching for the 13B tokenizer.model file to quantize with GPTQ-for-llama.

@bofenghuang (Owner)

Hi @DARDORKE @cmhamiche,

The 7B and 13B models should have the same tokenizer.model file. You could check huggyllama/llama-7b and huggyllama/llama-13b.
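To double-check that they match, here is a minimal sketch (my own example, assuming huggingface_hub is installed) that downloads tokenizer.model from both repos and compares checksums:

```python
# Minimal sketch: verify that the 7B and 13B repos ship the same tokenizer.model
# (assumes `pip install huggingface_hub`; repo IDs are the ones mentioned above).
import hashlib
from huggingface_hub import hf_hub_download

def sha256_of(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

paths = {
    repo: hf_hub_download(repo_id=repo, filename="tokenizer.model")
    for repo in ("huggyllama/llama-7b", "huggyllama/llama-13b")
}
for repo, path in paths.items():
    print(repo, sha256_of(path))
```

If the two hashes are identical, the same file can sit next to both the 7B and 13B folders, as in the layout below.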

I have the following quantized files working on my PC.

models
├── 13B
│   ├── consolidated.00.pth
│   ├── ggml-model-f16.bin
│   ├── ggml-model-q4_0.bin
│   └── params.json
├── 7B
│   ├── consolidated.00.pth
│   ├── ggml-model-f16.bin
│   ├── ggml-model-q4_0.bin
│   └── params.json
└── tokenizer.model

PS: Your 15.15 GB model file is a little strange. According to this section, the quantized 4-bit file of the 13B model should be around 7.x GB.
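As a rough cross-check, a back-of-the-envelope estimate of the full q4_0 file size (my own sketch, same 20-bytes-per-32-weights assumption as above):

```python
# Back-of-the-envelope size of a q4_0 13B file (~13.0e9 parameters assumed)
n_params = 13.0e9
q4_0_bytes_per_weight = 20 / 32        # assumed: 4-byte scale + 16 bytes per 32 weights
print(f"~{n_params * q4_0_bytes_per_weight / 1e9:.1f} GB")   # ~8.1 GB, i.e. ~7.6 GiB
```

A 15.15 GB file is roughly twice that, which points at something going wrong during conversion or quantization.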

@DARDORKE (Author)

Nice, thanks mate!

@cmhamiche

I merged and quantized the 13B model: cmh/vigogne-13b-4bit-32g-triton
Thanks a lot!
