TransferBench

TransferBench is a simple utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs).

Requirements

To build TransferBench using the Makefile:

$ make

To build TransferBench using CMake:

$ mkdir build
$ cd build
$ CXX=/opt/rocm/bin/hipcc cmake ..
$ make

If ROCm is installed in a folder other than /opt/rocm/, set ROCM_PATH appropriately.
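
As a rough sketch of the note above, the snippet below sets ROCM_PATH before invoking the Makefile build. The prefix `/opt/rocm-x.y.z` is a hypothetical placeholder rather than a real default, and the check for `bin/hipcc` under that prefix is an assumption based on the CMake command shown earlier.

```shell
# Hypothetical install prefix (placeholder); replace with your actual ROCm location.
ROCM_PATH="${ROCM_PATH:-/opt/rocm-x.y.z}"
export ROCM_PATH

if [ -x "$ROCM_PATH/bin/hipcc" ]; then
  # Makefile build, with the ROCm location taken from the environment
  make
else
  # Assumption: hipcc lives under bin/ of the ROCm prefix
  echo "hipcc not found under $ROCM_PATH; check ROCM_PATH" >&2
fi
```

For the CMake build, the analogous change would presumably be to point CXX at the hipcc under the same prefix.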

TransferBench may also be built to run on NVIDIA platforms via HIP, but this requires a HIP-compatible CUDA version to be installed (e.g. CUDA 11.5).

To build:

$ CUDA_PATH=<path_to_CUDA> HIP_PLATFORM=nvidia make

Running TransferBench with no arguments will display usage instructions and detected topology information.

There are several preset configurations that can be used instead of a configuration file, including:
- p2p - Peer to peer benchmark test
- sweep - Sweep across possible sets of Transfers
- rsweep - Random sweep across possible sets of Transfers
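
A minimal sketch of running one of the presets listed above, assuming the built binary is named `TransferBench` and sits in the current directory (both are assumptions). The preset name is passed in place of a configuration file; any further arguments a preset may accept are not shown here.

```shell
# Assumption: the built binary is ./TransferBench in the current directory.
TB=./TransferBench
PRESET=p2p   # one of the presets listed above: p2p, sweep, rsweep

if [ -x "$TB" ]; then
  # Pass the preset name where a configuration file would otherwise go
  "$TB" "$PRESET"
else
  echo "$TB not found; build TransferBench first" >&2
fi
```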

When the same GPU executor is used in multiple simultaneous Transfers, the Transfers may be serialized because of the limit on the number of available hardware queues.
- The maximum number of hardware queues can be adjusted via GPU_MAX_HW_QUEUES.
- Alternatively, running in single stream mode (USE_SINGLE_STREAM=1) may avoid this issue by launching all Transfers on a single stream instead of individual streams.
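
The two workarounds above can be sketched as environment variables set on the command line. This is only an illustration: the binary location, the configuration file name `my_transfers.cfg`, and the queue count of 8 are all hypothetical and should be replaced with values that suit your system.

```shell
# Assumptions: ./TransferBench is the built binary; the config file name is hypothetical.
TB=./TransferBench
CFG=my_transfers.cfg

if [ -x "$TB" ] && [ -f "$CFG" ]; then
  # Option 1: adjust the hardware queue limit (8 is an illustrative value only)
  GPU_MAX_HW_QUEUES=8 "$TB" "$CFG"

  # Option 2: launch all Transfers on a single stream
  USE_SINGLE_STREAM=1 "$TB" "$CFG"
else
  echo "binary or configuration file not found; nothing was run" >&2
fi
```

Comparing the results of the two runs against a default run is one way to see whether serialization was affecting a given set of Transfers.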
