Converting a llama2-7b W4A8 (per-channel) checkpoint from the deepcompressor format to the TensorRT-LLM format fails with the following error.
[TensorRT-LLM] TensorRT-LLM version: 0.16.0.dev2024111900
0.16.0.dev2024111900
[12/03/2024-09:20:45] [TRT-LLM] [I] Loading weights from lmquant torch checkpoint for QServe W4A8 inference...
[12/03/2024-09:20:49] [TRT-LLM] [I] Processing weights in layer: 0
Traceback (most recent call last):
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 555, in <module>
    main()
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 547, in main
    convert_and_save_hf(args)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 488, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 495, in execute
    f(args, rank)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 472, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/model.py", line 416, in from_hugging_face
    weights = load_weights_from_lmquant(quant_ckpt_path, config)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 2073, in load_weights_from_lmquant
    v = [
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 2074, in <listcomp>
    load(f'{prefix}.self_attn.{comp}_proj.{suffix}')
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 1952, in load
    v = quant_params[key]
KeyError: 'model.layers.0.self_attn.q_proj.weight.zero'
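The traceback shows that the loader looks up keys of the form `{prefix}.self_attn.{comp}_proj.{suffix}` in the quantization-parameter dictionary, and that `model.layers.0.self_attn.q_proj.weight.zero` is absent. To narrow this down, it may help to list which expected keys the checkpoint actually lacks. Below is a minimal sketch: the helper name `missing_attn_keys`, the projection list `("q", "k", "v")`, and the default suffix list are assumptions made for illustration and are not taken from the TensorRT-LLM source; only the key pattern itself comes from the traceback.

```python
from typing import Iterable, List, Mapping


def missing_attn_keys(
    quant_params: Mapping[str, object],
    num_layers: int,
    comps: Iterable[str] = ("q", "k", "v"),        # assumed projection names
    suffixes: Iterable[str] = ("weight.zero",),    # suffix seen in the KeyError
) -> List[str]:
    """Return the expected attention keys that are absent from quant_params.

    Hypothetical diagnostic helper; the key pattern mirrors the f-string
    shown in the traceback, everything else is an assumption.
    """
    comps = tuple(comps)
    suffixes = tuple(suffixes)
    missing = []
    for layer in range(num_layers):
        prefix = f"model.layers.{layer}"
        for comp in comps:
            for suffix in suffixes:
                key = f"{prefix}.self_attn.{comp}_proj.{suffix}"
                if key not in quant_params:
                    missing.append(key)
    return missing


if __name__ == "__main__":
    # Toy dictionary standing in for the loaded checkpoint (made-up content).
    toy_params = {
        "model.layers.0.self_attn.q_proj.weight.scale": None,
        "model.layers.0.self_attn.k_proj.weight.zero": None,
    }
    for key in missing_attn_keys(toy_params, num_layers=1):
        print("missing:", key)
```

In practice the dictionary would be whatever the quantized checkpoint contains, for example the result of `torch.load(path, map_location="cpu")` on the quantization-parameter file, assuming that file is a plain dictionary of tensors. Printing a few of its keys alongside the missing ones shows whether the checkpoint uses a different naming scheme or omits zero points altogether.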