error in trt_llm conversion #32

Andy0422 · 2024-12-03T10:34:17Z

Hi,

I hit an error when converting a llama2-7b W4A8 (per-channel) checkpoint from the deepcompressor format to the TensorRT-LLM format.

[TensorRT-LLM] TensorRT-LLM version: 0.16.0.dev2024111900
0.16.0.dev2024111900
[12/03/2024-09:20:45] [TRT-LLM] [I] Loading weights from lmquant torch checkpoint for QServe W4A8 inference...
[12/03/2024-09:20:49] [TRT-LLM] [I] Processing weights in layer: 0
Traceback (most recent call last):
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 555, in <module>
    main()
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 547, in main
    convert_and_save_hf(args)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 488, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 495, in execute
    f(args, rank)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 472, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/model.py", line 416, in from_hugging_face
    weights = load_weights_from_lmquant(quant_ckpt_path, config)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 2073, in load_weights_from_lmquant
    v = [
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 2074, in <listcomp>
    load(f'{prefix}.self_attn.{comp}_proj.{suffix}')
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 1952, in load
    v = quant_params[key]
KeyError: 'model.layers.0.self_attn.q_proj.weight.zero'
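
To narrow this down, a small diagnostic along these lines could list which of the keys the loader asks for are absent from the quantization-parameter dict. This is only a sketch: the key pattern is copied from the traceback above, but the function name `missing_quant_keys`, the projection names `q`, `k`, `v`, and the toy key `weight.scale` are my assumptions, not something taken from TensorRT-LLM or deepcompressor.

```python
from typing import Iterable, List, Mapping


def missing_quant_keys(
    quant_params: Mapping[str, object],
    num_layers: int,
    comps: Iterable[str] = ("q", "k", "v"),      # assumed projection names
    suffixes: Iterable[str] = ("weight.zero",),  # suffix seen in the KeyError
) -> List[str]:
    """Return the expected attention-projection keys that are not present.

    Hypothetical helper: it only mirrors the key pattern visible in the
    traceback, f'{prefix}.self_attn.{comp}_proj.{suffix}'.
    """
    comps = tuple(comps)
    suffixes = tuple(suffixes)
    missing = []
    for layer in range(num_layers):
        prefix = f"model.layers.{layer}"
        for comp in comps:
            for suffix in suffixes:
                key = f"{prefix}.self_attn.{comp}_proj.{suffix}"
                if key not in quant_params:
                    missing.append(key)
    return missing


if __name__ == "__main__":
    # Toy stand-in for the loaded checkpoint dict; the key name is made up.
    toy = {"model.layers.0.self_attn.q_proj.weight.scale": 1.0}
    for key in missing_quant_keys(toy, num_layers=1):
        print("missing:", key)
```

In practice `quant_params` would be whatever dict the quantization checkpoint file deserializes to. Printing a few of its actual keys next to the missing ones should show whether the zero points were saved under a different name or not saved at all.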
