This document provides step-by-step instructions for reproducing Stable Diffusion tuning results with Intel® Neural Compressor.
The script run_diffusion.py
is based on huggingface/diffusers and applies a post-training static quantization approach from Intel® Neural Compressor.
pip install -r requirements.txt
Note: make sure to install a validated PyTorch version.
The FID metric is used to evaluate the model in this example, so download the training dataset and copy one image into a directory (e.g., "base_images") to serve as the ground-truth reference.
Note: in this example we used the picture Ground_Truth_Image.
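For context, FID compares Inception-v3 feature statistics of two sets of images. Below is a minimal sketch of a standard FID computation with torchmetrics; the folder names are hypothetical, torchmetrics is not necessarily what run_diffusion.py uses, and the standard formulation needs at least two images per set, so the script's handling of a single base image may differ.

import torch
from pathlib import Path
from PIL import Image
from torchvision import transforms
from torchmetrics.image.fid import FrechetInceptionDistance

# Resize so images of different sizes can be stacked into one batch;
# the metric expects uint8 tensors of shape (N, 3, H, W) by default.
to_tensor = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.ToTensor(),
])

def folder_batch(folder):
    images = [to_tensor(Image.open(p).convert("RGB")) for p in sorted(Path(folder).glob("*.png"))]
    return (torch.stack(images) * 255).to(torch.uint8)

fid = FrechetInceptionDistance(feature=2048)
fid.update(folder_batch("base_images"), real=True)        # ground-truth set
fid.update(folder_batch("generated_images"), real=False)  # hypothetical output folder
print(f"FID: {fid.compute():.1f}")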
python run_diffusion.py \
--model_name_or_path lambdalabs/sd-pokemon-diffusers \
--tune \
--quantization_approach PostTrainingStatic \
--perf_tol 0.02 \
--output_dir /tmp/diffusion_output \
--base_images base_images \
--input_text "a drawing of a gray and black dragon" \
--calib_text "a drawing of a green pokemon with red eyes"
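For orientation, the sketch below shows the general shape of the post-training static quantization flow that --tune drives, assuming the Intel® Neural Compressor 2.x API; the pipeline setup, calibration dataloader, and eval_func are placeholders, not the exact code in run_diffusion.py.

from diffusers import StableDiffusionPipeline
from neural_compressor import PostTrainingQuantConfig, quantization
from neural_compressor.config import AccuracyCriterion

pipe = StableDiffusionPipeline.from_pretrained("lambdalabs/sd-pokemon-diffusers")

def eval_func(unet):
    # Placeholder: swap `unet` into the pipeline, generate an image from
    # --input_text, and return a higher-is-better score (e.g. negative FID
    # against the image in base_images).
    ...

conf = PostTrainingQuantConfig(
    approach="static",  # --quantization_approach PostTrainingStatic
    accuracy_criterion=AccuracyCriterion(tolerable_loss=0.02),  # --perf_tol 0.02
)
q_unet = quantization.fit(
    pipe.unet,                          # only the UNet is quantized
    conf,
    calib_dataloader=calib_dataloader,  # placeholder: batches built from --calib_text
    eval_func=eval_func,
)
q_unet.save("/tmp/diffusion_output")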
The ground truth image:
The image generated by the original model (FID with ground truth: 333):
The image generated by the quantized UNet (FID with ground truth: 246):
Benchmark the original model:
python run_diffusion.py \
--model_name_or_path lambdalabs/sd-pokemon-diffusers \
--output_dir /tmp/diffusion_output \
--base_images base_images \
--benchmark
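For a quick sanity check outside the script, end-to-end FP32 latency can also be measured directly with diffusers; this illustrative timing is an assumption and not the exact metric --benchmark reports.

import time
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("lambdalabs/sd-pokemon-diffusers")

start = time.time()
# Prompt taken from the --input_text example above.
image = pipe("a drawing of a gray and black dragon").images[0]
print(f"End-to-end latency: {time.time() - start:.1f} s")
image.save("fp32_result.png")  # hypothetical output file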
Benchmark the model with the quantized UNet:
python run_diffusion.py \
--model_name_or_path lambdalabs/sd-pokemon-diffusers \
--output_dir /tmp/diffusion_output \
--base_images base_images \
--benchmark \
--int8
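With --int8, the script benchmarks the model with the quantized UNet restored from --output_dir. Below is a minimal sketch of that reload step, assuming the Intel® Neural Compressor PyTorch load utility (run_diffusion.py may wire this up differently):

from diffusers import StableDiffusionPipeline
from neural_compressor.utils.pytorch import load

pipe = StableDiffusionPipeline.from_pretrained("lambdalabs/sd-pokemon-diffusers")
# Rebuild the int8 UNet on top of the FP32 definition from the checkpoint
# saved by tuning under --output_dir.
pipe.unet = load("/tmp/diffusion_output", pipe.unet)
image = pipe("a drawing of a gray and black dragon").images[0]
image.save("int8_result.png")  # hypothetical output file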
Note: inference performance is improved with Intel® DL Boost (VNNI) on Intel® Xeon® hardware. Please refer to the Performance Tuning Guide for more optimizations.