Block-pruned qwen2 model does not work with torch-pruning #436
Comments
Hi @lifelongeeek, thanks for the information. There is indeed a bug in the Config: wrong values were assigned to it. But I saw some new issues with offloading; I am not sure whether this is triggered by limited GPU memory. I will check this later.
With the commit you mentioned, I can still see the error messages. Here is the before/after pruning:
I found that a Qwen-2 pruning example was recently added to torch-pruning: https://github.com/VainF/Torch-Pruning/tree/master/examples/LLMs#rocket-qwenqwen2-7b
Thanks for updating!
I tried this script to block-prune (4 blocks) the qwen-2 architecture. However, when I try to prune this model, reloading it (via AutoModelForCausalLM.from_pretrained()) fails because the saved config is inconsistent with the pruned weights (hidden_size: 3152 and num_heads: 28). Could you suggest how to properly prune and reload a block-pruned qwen-2 model?
FYI, the following is my environment: