This example loads a question answering model fine-tuned on SQuAD and confirms its accuracy and speed on the SQuAD task.
pip install neural-compressor
pip install -r requirements.txt
Note: check the list of validated ONNX Runtime versions before installing.
Supported model identifiers from huggingface.co:
Model Identifier |
---|
mrm8488/spanbert-finetuned-squadv1 |
salti/bert-base-multilingual-cased-finetuned-squad |
distilbert-base-uncased-distilled-squad |
bert-large-uncased-whole-word-masking-finetuned-squad |
deepset/roberta-large-squad2 |
python prepare_model.py --input_model=mrm8488/spanbert-finetuned-squadv1 --output_model=spanbert-finetuned-squadv1.onnx # or other supported model identifier
Download the SQuAD dataset (the dev set is used for evaluation) from the official SQuAD page.
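The SQuAD file is nested JSON (articles → paragraphs → question/answer pairs). A minimal sketch of walking it into (question, context, answer) triples; the filename and helper name are illustrative, not part of this example's scripts:

```python
import json

def iter_squad(path):
    """Yield (question, context, first answer text) triples from a SQuAD v1.1 JSON file."""
    with open(path) as f:
        data = json.load(f)["data"]
    for article in data:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                # each qa may carry several reference answers; take the first if present
                answer = qa["answers"][0]["text"] if qa["answers"] else ""
                yield qa["question"], context, answer
```

The evaluation scripts consume the same structure, so this is a quick way to sanity-check a downloaded file.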
Dynamic quantization:
bash run_quant.sh --input_model=/path/to/model \ # model path as *.onnx
--output_model=/path/to/model_tune
bash run_benchmark.sh --input_model=/path/to/model \ # model path as *.onnx
--batch_size=batch_size \
--mode=performance # or accuracy
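In performance mode the benchmark reports latency and throughput; the measurement boils down to a warmed-up timing loop like the one below (a generic sketch, not the script's internals):

```python
import time

def benchmark(run_once, warmup=3, iters=10):
    """Return (average latency in seconds, throughput in runs/sec) for a callable."""
    for _ in range(warmup):              # warm caches and lazy initialization before timing
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    latency = (time.perf_counter() - start) / iters
    return latency, 1.0 / latency
```

For a real session, `run_once` would wrap an ONNX Runtime `session.run(...)` call on a fixed batch, which is why `--batch_size` matters: throughput scales with the number of samples per run.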