[SYCL][CUDA] performance issue with a SYCL program #16696

Open
jinz2014 opened this issue Jan 20, 2025 · 0 comments
Labels
cuda CUDA back-end performance Performance related issues

Comments

@jinz2014
Contributor

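Hello,
There seems to be a performance gap between the CUDA and SYCL versions of the scatter program on an NVIDIA A100 GPU: in the runs below, the SYCL kernels are roughly 9x to 25x slower than their CUDA counterparts.
I tried SYCLomatic, but the translation was not successful.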

https://github.com/zjin-lcf/HeCBench/tree/master/src/scatter-cuda

CUDA (12.5)

./main 10000000 100
INT32 scatter (mul, div, sum, min, max)
Average execution time of kernel: 609.347046 (us)
Average execution time of kernel: 513.615234 (us)
Average execution time of kernel: 224.066589 (us)
Average execution time of kernel: 224.341263 (us)
Average execution time of kernel: 224.259125 (us)

https://github.com/zjin-lcf/HeCBench/tree/master/src/scatter-sycl

SYCL (icpx 2025.0.0)
./main 10000000 100
INT32 scatter (mul, div, sum, min, max)
Average execution time of kernel: 5594.654785 (us)
Average execution time of kernel: 5526.372559 (us)
Average execution time of kernel: 5501.559570 (us)
Average execution time of kernel: 5502.131348 (us)
Average execution time of kernel: 5501.163086 (us)
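
To make the gap easier to read, here is a minimal C++ sketch that computes the per-operation slowdown from the timings above. It assumes the five reported lines follow the order given in the header (mul, div, sum, min, max); the names `cuda_us`, `sycl_us`, `slowdown`, and `print_slowdowns` are hypothetical and not part of HeCBench.

```cpp
#include <array>
#include <cstddef>
#include <cstdio>

// Average kernel times in microseconds, copied from the two runs above.
// Assumption: the lines appear in the header's order (mul, div, sum, min, max).
constexpr std::array<double, 5> cuda_us = {
    609.347046, 513.615234, 224.066589, 224.341263, 224.259125};
constexpr std::array<double, 5> sycl_us = {
    5594.654785, 5526.372559, 5501.559570, 5502.131348, 5501.163086};
constexpr std::array<const char*, 5> op_names = {
    "mul", "div", "sum", "min", "max"};

// Hypothetical helper: how many times slower the SYCL kernel is for operation i.
constexpr double slowdown(std::size_t i) { return sycl_us[i] / cuda_us[i]; }

// Hypothetical helper: print one line per operation with both times and the ratio.
inline void print_slowdowns() {
  for (std::size_t i = 0; i < cuda_us.size(); ++i) {
    std::printf("%s: CUDA %.1f us, SYCL %.1f us, slowdown %.1fx\n",
                op_names[i], cuda_us[i], sycl_us[i], slowdown(i));
  }
}
```

Calling `print_slowdowns()` from `main` lists the ratio for each operation. The SYCL times are nearly constant at about 5.5 ms across all five operations, while the CUDA times vary by operation, so the ratio is smallest for mul and div and largest for sum, min, and max.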

@AlexeySachkov AlexeySachkov added performance Performance related issues cuda CUDA back-end labels Jan 28, 2025