Why doesn't my QAT convert work? The model is still float32 #1290
Comments
This is because we are producing a model that's going to be lowered to ExecuTorch for speedup, I think. Here is the doc for QAT: https://pytorch.org/blog/quantization-aware-training/
Do I need to use torch.quantization.convert() to quantize my model to int8?
Hi @Xxxgrey, I see you printed the parameter dtypes twice:
(Note that this is in int8, not int4, because torch.int4 wasn't natively supported yet when this flow was built. We will update this in the future.)
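For reference, here is a minimal sketch of the prepare → train → convert flow described in the linked blog post. The quantizer name and the `torchao.quantization.prototype.qat` import path are taken from that post; newer torchao releases may have moved it out of `prototype`:

```python
import torch
from torchao.quantization.prototype.qat import Int8DynActInt4WeightQATQuantizer

# Toy model standing in for the real network.
model = torch.nn.Sequential(torch.nn.Linear(256, 256))

# prepare() swaps linear layers for fake-quantized versions;
# the weights stay float32 during training on purpose.
qat_quantizer = Int8DynActInt4WeightQATQuantizer()
model = qat_quantizer.prepare(model)

# ... fine-tune the prepared model as usual ...

# convert() swaps the fake-quantized linears for actually
# quantized ones; only after this step should dtypes change.
model = qat_quantizer.convert(model)
```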
I got it. So is there any QAT method that quantizes more than just the linear layers? These are all I can see, and it seems like they only quantize the linear layers.
Yeah, today we only have support for linear and embedding layers.

If you need other bit-widths or quantization schemes, you can also use the generic …
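As a sketch of how the two supported quantizers might compose, something like the following. This is my assumption, not confirmed in the thread: the embedding quantizer class name `Int4WeightOnlyEmbeddingQATQuantizer` and the `torchao.quantization.qat` path are based on recent torchao and may differ per release:

```python
import torch
from torchao.quantization.qat import (
    Int8DynActInt4WeightQATQuantizer,
    Int4WeightOnlyEmbeddingQATQuantizer,
)

# Toy model containing both supported layer types.
model = torch.nn.Sequential(
    torch.nn.Embedding(1000, 256),
    torch.nn.Linear(256, 256),
)

linear_quantizer = Int8DynActInt4WeightQATQuantizer()
embedding_quantizer = Int4WeightOnlyEmbeddingQATQuantizer()

# Each quantizer only touches its own layer type, so both
# prepare() calls can be applied to the same model.
model = linear_quantizer.prepare(model)
model = embedding_quantizer.prepare(model)

# ... fine-tune ...

model = linear_quantizer.convert(model)
model = embedding_quantizer.convert(model)
```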
I tried the original QAT code, but the result shows it doesn't convert to int8 or int4.
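A quick way to check whether convert() actually took effect is to print the tensor dtypes, as in the dtype printout mentioned earlier in the thread. A minimal sketch, where `converted_model` is a placeholder for whatever convert() returned:

```python
# Inspect every tensor in the converted model (state_dict covers
# both parameters and buffers, in case quantized weights are
# registered as buffers).
for name, tensor in converted_model.state_dict().items():
    print(name, tensor.dtype)

# Expected after a successful convert: linear weights report
# torch.int8 (4-bit values packed in an int8 container, per the
# note above). If everything still prints torch.float32, the
# fake-quantized modules were never swapped out, i.e. convert()
# did not run on this model.
```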