Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM #2146
SGTM !
@SunMarc We found this bug while submitting a test 1B quantized model to the HF Open LLM Leaderboard
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/add model: https://huggingface.co/ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1
We are trying to have it tested for our vortex high-recovery GPTQ models, but I don't believe the existing runner will work with GPTQ models even if this PR is merged, since it is most likely lacking the autogptq (and future gptqmodel) packages.
The tests are performed on an H100 GPU and normally, if nothing has changed yet, it should install autogptq. In the previous leaderboard, lots of GPTQ models were evaluated. cc @alozowski do you know what is happening with this model?
An HF Open LLM Leaderboard maintainer here! Sorry for my late reply. Indeed, our evaluation queue got stuck, but we fixed it this morning. I can confirm that we have a request file for Also, feel free to open a discussion about this model in our Community section so we can discuss the model evaluation there.
LGTM, thanks for the fix! Will wait for GPTQ tests to pass.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Failures are irrelevant to the PR.
@alozowski Thanks for the update. Can you confirm the failure is caused by the bug that this PR fixed? If the error is unrelated to this bug fix, I will move the discussion to the leaderboard community board.
@alozowski We need an update on this. Please respond:
There are 0 models on the leaderboard that I can see that are We are willing to fix this and everything related to GPTQ testing for the leaderboard, if related to HF code, if we can get some debug feedback.
Hi @Qubitium!
What does this PR do?
This PR fixes the issue encountered when using AutoModel to load the GPTQ model, which caused this error:
The reason for this error is that models loaded by AutoModel have different block prefixes than models loaded by AutoModelForCausalLM. For example, in the Llama model, the modules after loading with AutoModel are 'layers.0.self_attn.q_proj', 'layers.0.self_attn.k_proj', 'layers.0.self_attn.v_proj', etc. With AutoModelForCausalLM, the modules after loading are 'model.layers.0.self_attn.q_proj', 'model.layers.0.self_attn.k_proj', 'model.layers.0.self_attn.v_proj', etc. They have different prefixes, but they correspond to the same modules.
Who can review?
@Qubitium @SunMarc
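
For readers who want to see the prefix mismatch from the description concretely, here is a minimal sketch in plain Python. It uses only the Llama module names quoted in the description. The helper `strip_base_prefix` and the assumption that the extra prefix is exactly `model.` are illustrative and hypothetical; this is not the code changed by this PR, only one possible way to compare the two naming schemes.

```python
# Module names as quoted in the PR description (Llama example).
auto_model_names = [
    "layers.0.self_attn.q_proj",
    "layers.0.self_attn.k_proj",
    "layers.0.self_attn.v_proj",
]
causal_lm_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
]


def strip_base_prefix(name: str, prefix: str = "model") -> str:
    """Hypothetical helper: drop a leading '<prefix>.' if it is present.

    Names without the prefix are returned unchanged, so the same helper
    can be applied to module names from either loading path.
    """
    head = prefix + "."
    return name[len(head):] if name.startswith(head) else name


# After normalization, both loading paths refer to the same module names.
normalized = [strip_base_prefix(n) for n in causal_lm_names]
print(normalized == auto_model_names)  # True
```

A lookup keyed on the full `model.layers...` name would miss the `layers...` entries produced by AutoModel, which is consistent with the mismatch described above; normalizing (or otherwise reconciling) the prefix before matching avoids that.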