
Build Nvidia Docker Container (from 3.1 Inference round)

cm docker script --tags=build,nvidia,inference,server

Run this benchmark via CM

Do a test run to detect and record the system performance

cmr "run-mlperf inference _find-performance _all-scenarios" \
--model=gptj-99 --implementation=nvidia-original --device=cuda --backend=tensorrt \
--category=edge --division=open --quiet

Use --division=closed to run all scenarios for the closed division.
Use --category=datacenter to run the datacenter scenarios.
Use --model=gptj-99.9 to run the high-accuracy model.
Use --rerun to force a rerun even when result files from a previous run exist.
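The options above can be combined in one invocation. The following is a minimal sketch, not an official recipe: it only composes and prints the command so it can be reviewed first, and it uses only flags that appear in this README. The shell variable names (DIVISION, CATEGORY, MODEL, CMD) are illustrative and not part of CM.

```shell
# Sketch: compose the test-run command with the optional flags listed above.
# Variable names are illustrative; the flag values are the alternatives named in this README.
DIVISION="closed"        # default in the example above: open
CATEGORY="datacenter"    # default in the example above: edge
MODEL="gptj-99.9"        # default in the example above: gptj-99

CMD="cmr \"run-mlperf inference _find-performance _all-scenarios\" \
--model=${MODEL} --implementation=nvidia-original --device=cuda --backend=tensorrt \
--category=${CATEGORY} --division=${DIVISION} --quiet --rerun"

# Print the composed command for review instead of running it.
echo "$CMD"
```

Once the printed command looks right, it can be pasted into the shell or run with eval "$CMD".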

Do full accuracy and performance runs for all the scenarios

cmr "run-mlperf inference _submission _all-scenarios" --model=gptj-99 \
--device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid \
--category=edge --division=open --quiet --skip_submission_generation=yes

Use --power=yes to measure power. It is ignored for accuracy and compliance runs.
Use --division=closed to run all scenarios for the closed division. There are no compliance runs for gptj.
Use --offline_target_qps, --server_target_qps, and --singlestream_target_latency to override the performance numbers determined by the test run.
Use --model=gptj-99.9 to run the high-accuracy model.
Use --rerun to force a rerun even when result files from a previous run exist.
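As an illustration of the override flags above, the sketch below appends explicit targets to the full-run command and prints the result for review. The numeric values are placeholders invented for this example, not measured results or recommended settings, and the variable names (OFFLINE_QPS, SINGLESTREAM_LATENCY, CMD) are illustrative only.

```shell
# Placeholder targets for illustration only; replace them with numbers
# suited to your own system (for example, from the earlier test run).
OFFLINE_QPS=100
SINGLESTREAM_LATENCY=500

CMD="cmr \"run-mlperf inference _submission _all-scenarios\" --model=gptj-99 \
--device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid \
--category=edge --division=open --quiet --skip_submission_generation=yes \
--offline_target_qps=${OFFLINE_QPS} --singlestream_target_latency=${SINGLESTREAM_LATENCY}"

# Print the composed command for review instead of running it.
echo "$CMD"
```

Printing first makes it easy to confirm that the overrides are the intended ones before starting a long run.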

Generate and upload MLPerf submission

Follow this guide to generate the submission tree and upload your results.

Questions? Suggestions?

Check the MLCommons Task Force on Automation and Reproducibility and get in touch via the public Discord server.

Acknowledgments

CM automation for Nvidia's MLPerf inference implementation was developed by Arjun Suresh and Grigori Fursin.
Nvidia's MLPerf inference implementation was developed by Zhihan Jiang, Ethan Cheng, Yiheng Zhang and Jinho Suh.
