Harnessing the power of the GPU (CUDA) to accelerate the Jacobi algorithm for solving systems of linear equations, and benchmarking the speedup at different parallel levels. The program is written in C and compiled with the NVIDIA CUDA compiler (nvcc).
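Each Jacobi sweep updates every unknown independently from the previous iterate, x_i(new) = (b_i − Σ_{j≠i} A[i][j]·x_j(old)) / A[i][i], which maps naturally onto one CUDA thread per row. Below is a minimal sketch of such a kernel; the names and memory layout are illustrative assumptions, not the repository's actual code.

```c
// Minimal sketch of one Jacobi sweep: each thread updates one unknown.
// A is stored row-major as an n*n array; names are illustrative only.
__global__ void jacobi_sweep(const double *A, const double *b,
                             const double *x_old, double *x_new, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        double sigma = 0.0;
        for (int j = 0; j < n; ++j) {
            if (j != i)
                sigma += A[i * n + j] * x_old[j];   // off-diagonal contributions
        }
        x_new[i] = (b[i] - sigma) / A[i * n + i];   // divide by the diagonal entry
    }
}
```

The host would launch this kernel repeatedly, swapping the old and new solution vectors, until the residual drops below a tolerance or a fixed number of sweeps is reached.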
Model | Architecture | SM Count | Shading Units | Core Clock | Memory Clock |
---|---|---|---|---|---|
TU106-200A-KA-A1 | Turing (12 nm) | 30 | 1920 | 1365 MHz | 1750 MHz (14000 MHz effective) |
Threads | Dataset Sizes |
---|---|
1, 2, 4, 8, 16, 32 | 16, 160, 320, 480, 1024, 1600 |
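As a hedged sketch of how one (threads, dataset size) pair from the tables might translate into a launch configuration, the driver below builds on the jacobi_sweep sketch above; the repository's actual benchmark code may map the parallel level differently (for example, as total threads rather than threads per block).

```c
// Hypothetical benchmark driver for one (threads, n) pair from the tables
// above. Returns the device buffer holding the latest iterate.
double *run_benchmark(const double *d_A, const double *d_b,
                      double *d_x_old, double *d_x_new,
                      int n, int threads, int sweeps)
{
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover all n unknowns
    for (int s = 0; s < sweeps; ++s) {
        jacobi_sweep<<<blocks, threads>>>(d_A, d_b, d_x_old, d_x_new, n);
        // Swap the old and new solution vectors for the next sweep.
        double *tmp = d_x_old; d_x_old = d_x_new; d_x_new = tmp;
    }
    cudaDeviceSynchronize();                    // ensure timing covers every sweep
    return d_x_old;                             // holds the most recent values
}
```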
- Because the GPU throttles its cores down to 420 MHz while idle, the measured run times were much longer than they would be at full clocks, but the ratios between the different parallel levels were preserved.
- Because the GPU was also being used as the display adapter by the Windows operating system, Windows kept interrupting the program whenever a computation took too long. This is the ‘watchdog’ timer, also known as Windows TDR (Timeout Detection and Recovery): a timeout Windows uses to reset any long-running workload on the display hardware. As a result, I could not record timings for the large datasets (1600, 2080, 3200). This is why Windows is not an ideal platform for CUDA programming.
- Because of the large dataset sizes, the program did not benefit from shared memory across threads; shared memory is limited to 48 KB per block. I also tested a partial shared-memory variant (a sketch follows below), but it gave no benefit at all: the computation mixed shared and global accesses, and since the cores still have to wait for the global-memory reads at the memory clock speed, it is no faster than using global memory alone. It follows that the whole working set of an operation must fit in shared memory to get any speedup.
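For illustration, here is a hedged sketch of the kind of partial shared-memory variant described above: tiles of x_old are staged in shared memory, but every A[i][j] is still fetched from global memory, so the inner loop remains bound by global-memory latency. The kernel name and tile size are assumptions, not the repository's actual code.

```c
// Illustrative "partial shared memory" variant: x_old is tiled into shared
// memory, but A is still read from global memory on every access.
#define TILE 256

__global__ void jacobi_sweep_tiled(const double *A, const double *b,
                                   const double *x_old, double *x_new, int n)
{
    __shared__ double x_tile[TILE];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    double sigma = 0.0;

    for (int t = 0; t < n; t += TILE) {
        // Cooperatively load one tile of x_old into shared memory.
        for (int k = threadIdx.x; k < TILE && t + k < n; k += blockDim.x)
            x_tile[k] = x_old[t + k];
        __syncthreads();

        if (i < n) {
            int tile_end = (t + TILE < n) ? TILE : n - t;
            for (int k = 0; k < tile_end; ++k) {
                int j = t + k;
                if (j != i)
                    sigma += A[i * n + j] * x_tile[k];  // A read is still global
            }
        }
        __syncthreads();
    }

    if (i < n)
        x_new[i] = (b[i] - sigma) / A[i * n + i];
}
```

Only the reuse of x_old is served from shared memory here; the dominant traffic (the n reads of A per row) still goes to global memory, which is consistent with the observation that the partial variant gave no measurable speedup.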