The benchmarking feature of Neural Compressor measures model performance under the specified settings. Users can obtain the performance of the float32 model and the optimized low-precision model under the same scenario.
| Category | Environment |
|---|---|
| Operating System | linux, windows |
| Architecture | x86_64, aarch64, gpu |
Benchmark provides the capability to automatically run with multiple instances through the `cores_per_instance` and `num_of_instance` config options (CPU only). Please make sure that `cores_per_instance * num_of_instance` does not exceed the number of physical CPU cores (see the sketch below).
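As a sanity check, the product of the two settings can be validated against the machine's physical core count before constructing the config. This is a minimal sketch, not part of the Neural Compressor API, and it assumes the third-party `psutil` package is available for querying physical cores.

```python
import psutil

from neural_compressor.config import BenchmarkConfig

cores_per_instance = 4
num_of_instance = 7

# Physical cores only (hyper-threads excluded).
physical_cores = psutil.cpu_count(logical=False)
assert cores_per_instance * num_of_instance <= physical_cores, (
    "cores_per_instance * num_of_instance must not exceed the physical core count"
)

conf = BenchmarkConfig(
    warmup=10,
    iteration=100,
    cores_per_instance=cores_per_instance,
    num_of_instance=num_of_instance,
)
```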
`benchmark.fit` accepts either `b_dataloader` or `b_func` as input. `b_func` is a customized benchmark function; if the user passes `b_dataloader`, then `b_func` is not required.
```python
from neural_compressor.config import BenchmarkConfig
from neural_compressor.benchmark import fit

conf = BenchmarkConfig(warmup=10, iteration=100, cores_per_instance=4, num_of_instance=7)
fit(model="./int8.pb", conf=conf, b_dataloader=eval_dataloader)
```
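Because the config fully describes the measurement scenario, the same `conf` and dataloader can be reused to benchmark the float32 baseline for comparison; the `./fp32.pb` path below is a hypothetical placeholder for the user's own FP32 model.

```python
# Benchmark the float32 model under the identical scenario for comparison.
fit(model="./fp32.pb", conf=conf, b_dataloader=eval_dataloader)
```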
Refer to the Benchmark example.