Quantized custom flux model was still bfloat16 #27
It did not work, but maybe the output is already correct and I just need to convert it? I tried reproducing the flux.1-dev results and I get very similar error sizes:
I have the same question: how can a quantized checkpoint be converted into a safetensors-format model that can be loaded in Nunchaku? I hope @lmxyy can provide some assistance.
As for your question: DeepCompressor dumps the floating-point dequantized weights in the checkpoint. We are currently working on a script to convert DeepCompressor checkpoints to the Nunchaku format. We'll keep this issue updated and notify you when the conversion script is released. Let us know if you have any specific requirements or suggestions!
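For context, a format-only repack is easy but does not produce a Nunchaku-loadable model: since DeepCompressor dumps the floating-point dequantized weights, the tensors stay bfloat16 and are not in the packed low-bit layout Nunchaku expects, so the official conversion script is still needed. A minimal sketch, assuming the checkpoint is a flat state dict saved as model.pt (file names are illustrative):

```python
# Sketch only: repack a dumped .pt checkpoint into safetensors.
# This changes the container format, not the precision -- the dumped
# weights are floating-point dequantized tensors, so they stay bfloat16.
import torch
from safetensors.torch import save_file

state_dict = torch.load("model.pt", map_location="cpu")  # illustrative path
# safetensors expects a flat dict of (non-shared) tensors
tensors = {k: v.contiguous() for k, v in state_dict.items() if torch.is_tensor(v)}
save_file(tensors, "model.safetensors")
```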
mark
Hi, thanks for sharing your very efficient quantization method!
I was trying it out on a custom flux model and was surprised to see that the saved model was the same size as the original bfloat16 checkpoint. I suspect the quantization errors might have been large, so it decided to keep bfloat16 rather than quantize.
When I looked in model.pt, everything was bfloat16, and the wgts.pt file showed this:
These are some logs from running quantization:
I'm trying again tonight, but I suspect I will see the same issue this time.
Do you have any suggestions?
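For reference, this is the kind of dtype check I mean (a generic PyTorch sketch; it assumes model.pt is a flat state dict of tensors):

```python
# Sketch: report dtype counts and total tensor bytes of a dumped checkpoint.
import collections
import torch

state_dict = torch.load("model.pt", map_location="cpu")
counts = collections.Counter()
total_bytes = 0
for name, tensor in state_dict.items():
    if torch.is_tensor(tensor):
        counts[tensor.dtype] += 1
        total_bytes += tensor.numel() * tensor.element_size()

print("dtype counts:", dict(counts))
print(f"total tensor size: {total_bytes / 2**30:.2f} GiB")
# All-bfloat16 entries at roughly the original model size means the
# checkpoint was not stored in a quantized format.
```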