error in trt_llm conversion #32

Open
Andy0422 opened this issue Dec 3, 2024 · 0 comments

Hi,

I hit an error when converting a llama2-7b checkpoint (W4A8, per-channel) from the deepcompressor format to the TensorRT-LLM format.

[TensorRT-LLM] TensorRT-LLM version: 0.16.0.dev2024111900
0.16.0.dev2024111900
[12/03/2024-09:20:45] [TRT-LLM] [I] Loading weights from lmquant torch checkpoint for QServe W4A8 inference...
[12/03/2024-09:20:49] [TRT-LLM] [I] Processing weights in layer: 0
Traceback (most recent call last):
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 555, in <module>
    main()
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 547, in main
    convert_and_save_hf(args)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 488, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 495, in execute
    f(args, rank)
  File "/home/wei.zhao/work/TensorRT-LLM-241125/examples/llama/convert_checkpoint.py", line 472, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/model.py", line 416, in from_hugging_face
    weights = load_weights_from_lmquant(quant_ckpt_path, config)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 2073, in load_weights_from_lmquant
    v = [
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 2074, in <listcomp>
    load(f'{prefix}.self_attn.{comp}_proj.{suffix}')
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/convert.py", line 1952, in load
    v = quant_params[key]
KeyError: 'model.layers.0.self_attn.q_proj.weight.zero'
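
The KeyError means the converter asked the quantization-parameter dict for a `weight.zero` entry that is not there. A rough way to check what the checkpoint holds instead might look like the sketch below. The key pattern is copied from the traceback. The helper names (`missing_keys`, `suffixes_for`), the `("q", "k", "v")` projection list, and the toy dict are assumptions made for illustration and are not part of TensorRT-LLM or deepcompressor.

```python
from typing import List, Mapping, Sequence


def missing_keys(
    quant_params: Mapping[str, object],
    num_layers: int,
    comps: Sequence[str] = ("q", "k", "v"),      # assumed projection names
    suffixes: Sequence[str] = ("weight.zero",),  # suffix taken from the KeyError
) -> List[str]:
    """List the expected '<prefix>.self_attn.<comp>_proj.<suffix>' keys that are absent."""
    missing = []
    for layer in range(num_layers):
        prefix = f"model.layers.{layer}"
        for comp in comps:
            for suffix in suffixes:
                key = f"{prefix}.self_attn.{comp}_proj.{suffix}"
                if key not in quant_params:
                    missing.append(key)
    return missing


def suffixes_for(quant_params: Mapping[str, object], module: str) -> List[str]:
    """Return the suffixes that are actually stored for one module name."""
    head = module + "."
    return sorted(k[len(head):] for k in quant_params if k.startswith(head))


# Toy stand-in for the real checkpoint dict; the stored suffix is a placeholder,
# not a name that deepcompressor is known to write.
toy_params = {"model.layers.0.self_attn.q_proj.weight.placeholder": None}

print(missing_keys(toy_params, num_layers=1, comps=("q",)))
print(suffixes_for(toy_params, "model.layers.0.self_attn.q_proj"))
```

With a real checkpoint, the dict would presumably come from something like `torch.load(path, map_location="cpu")`, assuming the quantization parameters are stored as a plain torch-serialized dict. Comparing the suffixes that are present against the ones the converter requests should show whether the two tools disagree on key naming or whether the zero points were never exported.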
