AssertionError of act_observer when using SmoothQuant for Llama-13b #2033

kyang-06 · 2024-10-16T07:48:52Z

When I tried SmoothQuant with the following sample code snippet:

from neural_compressor.torch.quantization import SmoothQuantConfig, convert, prepare
def run_fn(model):
    model(example_inputs)
quant_config = SmoothQuantConfig(alpha=0.5)
prepared_model = prepare(fp32_model, quant_config=quant_config, example_inputs=example_inputs)
run_fn(prepared_model)
q_model = convert(prepared_model)

I got the following error:

AssertionError                            Traceback (most recent call last)
Cell In[7], line 11
      9 quant_config = SmoothQuantConfig(alpha=0.5)
     10 print(quant_config)
---> 11 prepared_model = prepare(model, quant_config=quant_config, example_inputs=example_prompts)
     12 run_fn(prepared_model)
     13 q_model = convert(prepared_model)
...
...
File ~/anaconda3/envs/intel-arc-py39/lib/python3.9/site-packages/intel_extension_for_pytorch/quantization/_smooth_quant.py:85, in SmoothQuantActivationObserver.__init__(self, act_observer, act_ic_observer, smooth_quant_enabled, dtype, qscheme, reduce_range, quant_min, quant_max, alpha, factory_kwargs, eps)
     75     self.act_obs = HistogramObserver(
     76         dtype=dtype,
     77         qscheme=qscheme,
   (...)
     82         eps=eps,
     83     )
     84 else:
---> 85     assert isinstance(act_observer, UniformQuantizationObserverBase), 'act_observer:' + str(act_observer)
     86     self.act_obs = act_observer
     87 # if smooth_quant_enabled is false, this observer acts as
     88 # a normal per-tensor observer

AssertionError: act_observer:<class 'torch.ao.quantization.observer.MinMaxObserver'>
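
The message prints `<class '...MinMaxObserver'>` rather than an observer instance, which suggests the assertion received the observer class itself and not an instantiated object. Below is a minimal sketch of why `isinstance` fails in that case. The two classes are stand-ins with the same names, not the real torch or intel_extension_for_pytorch types, and `check` is a hypothetical helper that only mirrors the shape of the assertion in the traceback.

```python
# Stand-in classes (hypothetical): they only mimic the inheritance
# relationship, they are not the real torch / IPEX observer types.
class UniformQuantizationObserverBase:
    pass


class MinMaxObserver(UniformQuantizationObserverBase):
    pass


def check(act_observer):
    # Same shape as the assertion at _smooth_quant.py line 85.
    return isinstance(act_observer, UniformQuantizationObserverBase)


passed_class = check(MinMaxObserver)       # the class object itself
passed_instance = check(MinMaxObserver())  # an instantiated observer

print(passed_class, passed_instance)
```

In this sketch the check fails for the class and passes for the instance, so the assertion would be hit whenever an uninstantiated observer class reaches it. Whether that is what happens inside `prepare` here is an assumption I have not verified.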

My environment:

torch                            2.1.0a0+cxx11.abi
neural_compressor                3.0.2
neural_compressor_3x_pt          2.6
intel-extension-for-pytorch      2.1.10+xpu
intel-extension-for-transformers 1.2.1
