Build applications written in NVIDIA® CUDA™ code for OpenCL™ 1.2 devices.
- leave applications in NVIDIA® CUDA™
- compile into OpenCL 1.2
- run on any OpenCL 1.2 GPU
- Write an NVIDIA® CUDA™ source code file, or find an existing one
- Let's use cuda_sample.cu
- Compile, using cocl:
$ cocl cuda_sample.cu
...
... (bunch of compily stuff) ...
...
./cuda_sample.cu compiled into ./cuda_sample
Run:
$ ./cuda_sample
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics 5500 BroadWell U-Processor GT2
hostFloats[2] 123
hostFloats[2] 222
hostFloats[2] 444
- compiler for host-side code, including memory allocation, copy, streams, kernel launches
- compiler for device-side code, handling templated C++ code, converting it into bog-standard OpenCL 1.2 code
- cuBLAS API implementations for GEMM, GEMV, SCAL, SAXPY (using Cedric Nugteren's CLBlast)
- cuDNN API implementations for: convolutions (using the im2col algorithm over Cedric Nugteren's CLBlast), pooling, ReLU, tanh, and sigmoid
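The im2col approach named in the list above turns a convolution into a single matrix multiplication, which is why a GEMM library such as CLBlast can do the heavy lifting. The following is a minimal CPU-side sketch in plain C++, not Coriander's actual implementation. The function names `im2col` and `conv2d_via_gemm`, the single-channel layout, stride 1, and no padding are all assumptions made for illustration, and a naive loop stands in for the GEMM call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Coriander): unrolls each KxK patch of a
// single-channel HxW image into one column of a (K*K) x (outH*outW) matrix.
// Assumes stride 1 and no padding.
std::vector<float> im2col(const std::vector<float>& image, int H, int W, int K) {
    const int outH = H - K + 1;
    const int outW = W - K + 1;
    std::vector<float> cols(static_cast<std::size_t>(K) * K * outH * outW);
    for (int ky = 0; ky < K; ++ky) {
        for (int kx = 0; kx < K; ++kx) {
            const int row = ky * K + kx;  // one row per filter element
            for (int oy = 0; oy < outH; ++oy) {
                for (int ox = 0; ox < outW; ++ox) {
                    const int col = oy * outW + ox;  // one column per output pixel
                    cols[static_cast<std::size_t>(row) * outH * outW + col] =
                        image[(oy + ky) * W + (ox + kx)];
                }
            }
        }
    }
    return cols;
}

// Hypothetical helper: convolution (cross-correlation, as in most deep
// learning libraries) expressed as a 1 x (K*K) by (K*K) x N matrix product.
// The loops below are a naive stand-in for a real GEMM call.
std::vector<float> conv2d_via_gemm(const std::vector<float>& image, int H, int W,
                                   const std::vector<float>& filter, int K) {
    const int N = (H - K + 1) * (W - K + 1);
    const std::vector<float> cols = im2col(image, H, W, K);
    std::vector<float> out(N, 0.0f);
    for (int r = 0; r < K * K; ++r) {
        for (int n = 0; n < N; ++n) {
            out[n] += filter[r] * cols[static_cast<std::size_t>(r) * N + n];
        }
    }
    return out;
}
```

With several filters, the filter vector becomes a matrix with one row per filter, and the product is a full GEMM rather than a vector-matrix product. The sketch trades extra memory for the unrolled patches against the ability to reuse a tuned BLAS routine.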
Kernel compilation proceeds in two steps:
Slides on the IWOCL website, here
Coriander development is carried out using the following platforms:
- Ubuntu 16.04, with:
  - NVIDIA K80 GPU and/or NVIDIA K520 GPU (via AWS)
- MacBook Pro 4th generation (thank you ASAPP :-) ), with:
  - Intel HD Graphics 530
  - Radeon Pro 450
  - Sierra OS
Other systems should ideally work too. You will need at least one OpenCL-enabled GPU, with appropriate OpenCL drivers installed for it. Both Linux and Mac systems stand a reasonable chance of working.
For installation, please see installation
- use cocl_add_executable and cocl_add_library
- see cmake usage
See testing
See assumptions
Coriander uses the following libraries:
- clang/llvm: c/c++ parser/compiler; many contributors
- CLBlast: Cedric Nugteren's excellent BLAS for OpenCL
- thrust: parallel GPU library, from NVIDIA®
- yaml-cpp: yaml for c++, by Jesse Beder
- EasyCL: wrapper for OpenCL 1.2 boilerplate
- argparsecpp: command-line parser for c++
- gtest: unit tests for c++, from Google
- Eigen-CL: Minimally-tweaked fork of Eigen, for OpenCL 1.2
- tf-coriander: Tensorflow for OpenCL-1.2
Please cite: CUDA-on-CL: a compiler and runtime for running NVIDIA® CUDA™ C++11 applications on OpenCL™ 1.2 Devices
- June 4:
  - added cmake macros cocl_add_executable and cocl_add_library
  - these replace the previous add_cocl_executable, and have the advantage that they are standard targets that you can use with target_link_libraries and so on; see cmake usage
- May 31:
  - added a developer debugging option COCL_DUMP_CONFIG, to allow easy inspection of buffers returned by kernel calls; see options
- Older news