This sample application performs general matrix multiplication (GEMM) on an OpenCL(TM) CPU or GPU device, so it can be used as a target for OpenCL(TM) profiling and tracing tools.
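The application reports an accuracy figure for each multiplication it performs. As a rough illustration of what such a check can look like, here is a minimal Python sketch of a naive GEMM followed by a mean relative error computation. Everything in it is an assumption made for illustration (the function names `gemm` and `mean_relative_error`, the constant fill values, the tiny matrix size); it is not taken from the sample's source and does not use OpenCL.

```python
# Hypothetical illustration: names, fill values and sizes are assumptions,
# not taken from the cl_gemm source.

def gemm(a, b, n):
    """Naive C = A * B for n x n matrices stored as flat row-major lists."""
    c = [0.0] * (n * n)
    for i in range(n):
        for k in range(n):
            a_ik = a[i * n + k]
            for j in range(n):
                c[i * n + j] += a_ik * b[k * n + j]
    return c


def mean_relative_error(values, expected):
    """Average of |v - expected| / |expected| over all elements."""
    return sum(abs(v - expected) / abs(expected) for v in values) / len(values)


n = 8
a_value, b_value = 0.5, 0.25          # chosen so the arithmetic is exact
a = [a_value] * (n * n)
b = [b_value] * (n * n)

c = gemm(a, b, n)

# With constant inputs every element of C equals a_value * b_value * n.
expected = a_value * b_value * n
error = mean_relative_error(c, expected)
print("mean relative error:", error)
```

With constant-filled inputs the expected result is known in closed form, so no second reference multiplication is needed to validate the output.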
Sample output:
OpenCL Matrix Multiplication (matrix size: 1024 x 1024, repeats 4 times)
Target device: Intel(R) Gen9 HD Graphics NEO
Matrix multiplication time: 0.18465 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.1293 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.103855 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.0909481 sec
Results are CORRECT with accuracy: 4.90573e-06
Total execution time: 0.739879 sec
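When the application is used as a profiling target, it can be handy to pull the timings out of this output programmatically. The following Python sketch parses the lines shown above; the helper name `parse_times` and the regular expressions are assumptions based only on this sample output, so they may need adjusting if the format differs.

```python
import re

# The output shown above, copied verbatim.
SAMPLE_OUTPUT = """\
OpenCL Matrix Multiplication (matrix size: 1024 x 1024, repeats 4 times)
Target device: Intel(R) Gen9 HD Graphics NEO
Matrix multiplication time: 0.18465 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.1293 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.103855 sec
Results are CORRECT with accuracy: 4.90573e-06
Matrix multiplication time: 0.0909481 sec
Results are CORRECT with accuracy: 4.90573e-06
Total execution time: 0.739879 sec
"""


def parse_times(text):
    """Return (per-iteration times, total time or None), all in seconds."""
    iteration_times = [
        float(value)
        for value in re.findall(
            r"^Matrix multiplication time: (\S+) sec$", text, flags=re.MULTILINE
        )
    ]
    match = re.search(r"^Total execution time: (\S+) sec$", text, flags=re.MULTILINE)
    total_time = float(match.group(1)) if match else None
    return iteration_times, total_time


iteration_times, total_time = parse_times(SAMPLE_OUTPUT)
print("iterations:", iteration_times)
print("sum of iterations:", sum(iteration_times))
print("total:", total_time)
```

For the run above, the four multiplication times add up to about 0.51 sec, while the reported total is about 0.74 sec. The output does not break down the remaining time.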
Supported OS:
- Linux
- Windows

Prerequisites:
- CMake (version 3.12 and above)
- Git (version 1.8 and above)
- Python (version 2.7 and above)
- OpenCL(TM) ICD Loader
- Intel(R) Graphics Compute Runtime for oneAPI Level Zero and OpenCL(TM) Driver
On Linux, run the following commands to build the sample:
cd <pti>/samples/cl_gemm
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
Use this command line to run the application:
./cl_gemm [cpu|gpu] [matrix_size] [repeat_count]
On Windows, open a Microsoft* Visual Studio x64 command prompt and run the following commands to build the sample:
cd <pti>\samples\cl_gemm
mkdir build
cd build
cmake -G "NMake Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_LIBRARY_PATH=<opencl_icd_lib_path> ..
nmake
Use this command line to run the application:
cl_gemm.exe [cpu|gpu] [matrix_size] [repeat_count]
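To launch the application from a script, for example to sweep over matrix sizes, a small wrapper can assemble the command line. The Python sketch below is hypothetical: the helper names `build_command` and `run` are made up for illustration, and it assumes the three arguments are positional and optional in the order shown in the usage line, so a later argument can only be given when the earlier ones are present.

```python
import subprocess
import sys


def build_command(device=None, matrix_size=None, repeat_count=None, binary=None):
    """Build the argument list for cl_gemm (hypothetical helper).

    Assumes the arguments are positional, so a later argument requires
    all earlier ones to be present.
    """
    if binary is None:
        # Binary names as used in the run commands for each platform.
        binary = "cl_gemm.exe" if sys.platform.startswith("win") else "./cl_gemm"

    if device is not None and device not in ("cpu", "gpu"):
        raise ValueError("device must be 'cpu' or 'gpu'")

    command = [binary]
    seen_missing = False
    for value in (device, matrix_size, repeat_count):
        if value is None:
            seen_missing = True
        elif seen_missing:
            raise ValueError("an argument was given after an omitted one")
        else:
            command.append(str(value))
    return command


def run(command):
    """Run the built command and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Only builds and prints the command; call run(...) to execute it.
print(build_command("gpu", 1024, 4, binary="./cl_gemm"))
```

Calling `run(build_command("gpu", 1024, 4))` from the build directory would then execute the application with a 1024 x 1024 matrix and 4 repeats on the GPU device.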