Question about linear quantization #14
@87Candy I have not encountered this error.
There are two methods: one is K-means quantization, the other is linear quantization. May I ask, when you ran the linear quantization, what changes did you make to the project files? A sketch contrasting the two methods follows below.
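For readers unfamiliar with the distinction, here is a minimal sketch (not HAQ's code) of the two weight-quantization schemes applied to a single tensor, assuming PyTorch and scikit-learn are available: linear quantization snaps weights to a uniform grid, while K-means learns a non-uniform codebook of 2^n centroids.

```python
import torch
from sklearn.cluster import KMeans

w = torch.randn(64, 64)  # stand-in fp32 weight tensor

def linear_quantize(w, n_bits=4):
    # uniform grid: the scale maps the largest magnitude to the
    # largest representable signed integer
    qmax = 2 ** (n_bits - 1) - 1
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale  # dequantized values lie on a uniform grid

def kmeans_quantize(w, n_bits=4):
    # non-uniform codebook: 2^n_bits centroids fitted to the weights
    flat = w.reshape(-1, 1).numpy()
    km = KMeans(n_clusters=2 ** n_bits, n_init=10).fit(flat)
    codebook = torch.from_numpy(km.cluster_centers_).flatten()
    labels = torch.from_numpy(km.labels_.astype('int64'))
    return codebook[labels].reshape(w.shape).float()

print('linear  err:', (w - linear_quantize(w)).abs().mean().item())
print('k-means err:', (w - kmeans_quantize(w)).abs().mean().item())
```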
@87Candy The error may be caused by
I may run into some other questions; could I get in touch with you again?
How to convert mobilenet v2 into qmobilenetv2?
@alan303138 See haq/lib/env/linear_quantize_env.py (Line 115 in 8228d12).
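For illustration, here is a rough sketch of such a conversion, assuming QConv2d/QLinear mirror the constructors of nn.Conv2d/nn.Linear and additionally take w_bit/a_bit (with -1 meaning "not yet quantized"). The constructor arguments are my assumption; check lib/utils/quantize_utils.py for the exact signatures.

```python
import torch.nn as nn
from lib.utils.quantize_utils import QConv2d, QLinear

def make_quantizable(module):
    # recursively swap every plain layer for its quantizable counterpart,
    # copying the pretrained weights over; w_bit/a_bit stay at -1 until
    # the agent (or you) assigns per-layer bitwidths
    for name, m in module.named_children():
        if isinstance(m, nn.Conv2d):
            q = QConv2d(m.in_channels, m.out_channels, m.kernel_size,
                        stride=m.stride, padding=m.padding, groups=m.groups,
                        bias=m.bias is not None, w_bit=-1, a_bit=-1)
            q.weight.data.copy_(m.weight.data)
            if m.bias is not None:
                q.bias.data.copy_(m.bias.data)
            setattr(module, name, q)
        elif isinstance(m, nn.Linear):
            q = QLinear(m.in_features, m.out_features,
                        bias=m.bias is not None, w_bit=-1, a_bit=-1)
            q.weight.data.copy_(m.weight.data)
            if m.bias is not None:
                q.bias.data.copy_(m.bias.data)
            setattr(module, name, q)
        else:
            make_quantizable(m)  # recurse into nested blocks
    return module
```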
Thank you for your reply. I also used the QConv2d and QLinear they provide to build a pretrained qmobilenetv2, but I am not sure this is correct: isn't it usually necessary to train in fp32 and then convert to a quantized model (as in quantization-aware training)? See haq/lib/utils/quantize_utils.py (Line 454 in 8228d12).
# If my current model is mobilenetv2, it will not be calibrated, because the layers inside are plain nn.Conv2d and nn.Linear (haq/lib/utils/quantize_utils.py, Line 455 in 8228d12).
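A quick way to see the point of that comment, assuming the calibration loop dispatches on layer type via isinstance (my reading of the referenced lines), using torchvision's plain MobileNetV2 as a stand-in for the repo's fp32 model:

```python
import torch.nn as nn
from torchvision.models import mobilenet_v2  # plain fp32 stand-in model
from lib.utils.quantize_utils import QConv2d, QLinear

model = mobilenet_v2()
n_quant = sum(isinstance(m, (QConv2d, QLinear)) for m in model.modules())
n_plain = sum(isinstance(m, (nn.Conv2d, nn.Linear)) for m in model.modules())
print('quantizable layers:', n_quant)  # 0 here, so calibration touches nothing
print('plain conv/linear :', n_plain)  # these layers are silently skipped
```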
@frankinwi
mobilenetv2: Line 169 in 8228d12
qmobilenetv2: Line 183 in 8228d12
Putting it all together: if we do not use half precision (fp16; see the --half flag) and do not specify w_bit and a_bit for each QConv2d and QLinear layer, qmobilenetv2 will not actually be quantized. According to run_pretrain.sh and pretrain.py, the pre-trained file mobiletv2-150.pth.tar appears to have been trained in fp16, so it may be unsuitable for linear quantization. You can load mobiletv2-150.pth.tar and insert some prints before Line 192 in 8228d12 to confirm.
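A minimal version of the suggested check, assuming the checkpoint follows the usual {'state_dict': ...} layout produced by the pretrain script (adjust the key if your file differs):

```python
import torch

# load on CPU and inspect the dtype of every saved tensor
ckpt = torch.load('mobiletv2-150.pth.tar', map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt) if isinstance(ckpt, dict) else ckpt
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape), tensor.dtype)
# if most entries report torch.float16, the checkpoint was trained with
# --half and likely needs casting back to fp32 before linear quantization
```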
I figured out the procedure of linear quantization and reproduced the experiments. It seems the final accuracy of the quantized model depends mostly on the fine-tuning. Another question: why does the bit-reduction process start from the last layer, as the _final_action_wall function shows?
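For reference, here is a simplified paraphrase (not the repo's code) of what an action wall like _final_action_wall appears to do: after the agent proposes a bitwidth per layer, it walks from the last layer backwards and shaves bits until the total cost fits the budget. One plausible reason for starting at the end is that the later layers of MobileNetV2 hold most of the parameters, so reducing their precision first recovers the most resources per step, though the paper and code are the authoritative sources on the authors' intent.

```python
def final_action_wall(bits, layer_costs, budget, min_bit=2):
    """bits[i]: proposed bitwidth for layer i; layer_costs[i]: cost per bit."""
    def total_cost():
        return sum(b * c for b, c in zip(bits, layer_costs))
    while total_cost() > budget:
        reduced = False
        for i in reversed(range(len(bits))):  # start from the last layer
            if bits[i] > min_bit:
                bits[i] -= 1
                reduced = True
                if total_cost() <= budget:
                    break
        if not reduced:
            break  # every layer is already at min_bit; budget is infeasible
    return bits

print(final_action_wall([8, 8, 8], [1.0, 1.0, 1.0], budget=18.0))  # -> [6, 6, 6]
```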