Regarding paper and codes #7
Comments
Hi, I also have the second question.
I find that the code uses k-means quantization, while the paper says it finds the optimal clip value to minimize the KL divergence between the non-quantized and quantized weights/activations. That describes linear quantization, which is different from what the code implements.
This confuses me as well. The paper uses linear quantization, but the code provides k-means quantization (similar to "Deep Compression"). After k-means quantization, we cannot guarantee that the weights can be represented as fixed-point numbers.
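For reference, here is a minimal sketch of the linear quantization scheme the paper describes, assuming symmetric uniform quantization and a simple grid search over clip candidates. The function names (`linear_quantize`, `find_clip_kl`), the candidate range, and the histogram settings are illustrative assumptions, not code from this repo.

```python
# Sketch: choose a clip value c that minimizes the KL divergence between the
# distribution of the original tensor and the distribution after clipping and
# uniform (linear) quantization. Not the repo's actual implementation.
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q)

def linear_quantize(x, bits, clip):
    """Clip to [-clip, clip] and quantize uniformly to 2^bits symmetric levels."""
    scale = clip / (2 ** (bits - 1) - 1)
    xq = np.clip(x, -clip, clip)
    return np.round(xq / scale) * scale

def find_clip_kl(x, bits, n_candidates=100, n_bins=2048):
    """Grid-search clip candidates; keep the one whose quantized histogram
    has the smallest KL divergence from the original histogram."""
    max_abs = np.abs(x).max()
    edges = np.linspace(-max_abs, max_abs, n_bins + 1)
    p, _ = np.histogram(x, bins=edges)
    p = p + 1e-10  # avoid zero bins
    best_clip, best_kl = max_abs, np.inf
    for clip in np.linspace(0.5 * max_abs, max_abs, n_candidates):
        q, _ = np.histogram(linear_quantize(x, bits, clip), bins=edges)
        q = q + 1e-10
        kl = entropy(p, q)
        if kl < best_kl:
            best_kl, best_clip = kl, clip
    return best_clip

# Example usage:
# w = np.random.randn(4096)
# wq = linear_quantize(w, 4, find_clip_kl(w, 4))
```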
It's quite unfortunate that the main novelty claimed by the paper, i.e., the use of direct hardware feedback, is missing from this repo. In fact, even the paper fails to provide a clear explanation of that claim.
We have updated the linear quantization as well as the hardware resource-constrained part in this repo. Please let us know if you have any questions. |
Can you please point to the part where the direct HW feedback is used? Thanks. Without that, the repo is still quite limited in significance. |
Thanks for your feedback! You can find the related code at haq/lib/env/linear_quantize_env.py, line 306, in commit 7141586.
After diving deep into the code and the paper, I have two questions.
I've read in the paper that "If the current policy exceeds our resource budget (on latency, energy or model size), we will sequentially decrease the bitwidth of each layer until the constraint is finally satisfied." Which part of the code corresponds to this step of decreasing the bitwidth when the current policy exceeds the budget? (See the sketch after these questions.)
Why don't you use k-means quantization for the latency/energy-constrained experiments? Will you release the code for linear quantization?
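As context for question 1, here is a minimal sketch of what that constraint-satisfaction step could look like, assuming a list of per-layer bitwidths and a cost estimator for latency, energy, or model size. The names `enforce_budget`, `estimate_cost`, and `MIN_BIT` are hypothetical and not the repo's actual code.

```python
# Sketch of the step quoted from the paper: if the agent's proposed bitwidths
# exceed the resource budget, sweep over the layers and lower bitwidths one
# step at a time until the estimated cost fits the budget.
MIN_BIT = 2  # assumed lower bound on any layer's bitwidth

def enforce_budget(bitwidths, estimate_cost, budget):
    """Sequentially decrease per-layer bitwidths until estimate_cost <= budget."""
    bitwidths = list(bitwidths)
    while estimate_cost(bitwidths) > budget:
        changed = False
        for i in range(len(bitwidths)):      # sweep layers in order
            if bitwidths[i] > MIN_BIT:
                bitwidths[i] -= 1            # drop one bit for this layer
                changed = True
                if estimate_cost(bitwidths) <= budget:
                    return bitwidths
        if not changed:                      # every layer already at MIN_BIT
            break                            # budget cannot be met
    return bitwidths
```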