An introduction to CUDA programming by way of a Boids Flocking simulation

czxcjx/Project1-CUDA-Flocking

 
 


University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 1 - Flocking

  • Name: Zhan Xiong Chin
  • Tested on: Windows 7 Professional, Intel(R) Xeon(R) CPU E5-1630 v4 @ 3.70 GHz, GTX 1070 8192MB (SIG Lab)

10000 particle simulation

500000 particle simulation

Build Instructions

See here

Performance analysis

| | 5000 particles | 10000 particles | 50000 particles | 500000 particles |
|---|---|---|---|---|
| Naive search | 700 fps | 350 fps | 20 fps | (crashes) |
| Scattered uniform grid | 770 fps | 1100 fps | 490 fps | 6 fps |
| Coherent uniform grid | 770 fps | 1100 fps | 1100 fps | 72 fps |

  • For the naive search, increasing the number of boids decreases performance (700 fps at 5000 particles, 350 fps at 10000, 20 fps at 50000). This is expected: every boid checks every other boid, so the work needed to compute velocity changes grows roughly quadratically with the number of boids.

  • However, for the uniform grids, performance increases from 770 fps to 1100 fps when going from 5000 to 10000 particles, then levels off or decreases at larger counts. This may be related to branching effects: with a sufficient number of particles, there may be few or no empty grid cells, so neighboring threads may follow more similar code paths.

  • Increasing the block size does not change the performance of the uniform grid implementation significantly. This is because the overall utilization of the device remains similar.

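  • The coherent uniform grid showed a significant performance improvement over the scattered uniform grid at larger particle counts (1100 fps vs. 490 fps at 50000 particles, 72 fps vs. 6 fps at 500000). This was expected, since the coherent layout keeps the data of boids in the same cell contiguous in memory, which takes better advantage of caching and avoids expensive scattered memory accesses.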
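
To make the difference between the scattered and coherent grids more concrete, here is a minimal CPU-side C++ sketch of the bookkeeping involved. It is not the project's CUDA code: the names `Grid`, `SortedGrid`, `cellIndex`, `buildGrid` and `makeCoherent` are hypothetical, the domain is assumed to be a cube, and a plain `std::stable_sort` stands in for whatever GPU sort the real implementation uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical grid description (assumption: cubic domain, uniform cells).
struct Grid {
    float minCoord;    // lower corner of the domain on every axis
    float cellWidth;   // edge length of one cell
    int   resolution;  // number of cells per axis
};

// Map a position to a flattened 1D cell index, clamping to the grid bounds.
int cellIndex(const Grid& g, const Vec3& p) {
    auto axis = [&g](float v) {
        int c = static_cast<int>((v - g.minCoord) / g.cellWidth);
        return std::max(0, std::min(c, g.resolution - 1));
    };
    return axis(p.x) + axis(p.y) * g.resolution
         + axis(p.z) * g.resolution * g.resolution;
}

struct SortedGrid {
    std::vector<int> boidIndex;  // boid ids, ordered by the cell they fall in
    std::vector<int> cellStart;  // first slot of each cell in boidIndex, -1 if empty
    std::vector<int> cellEnd;    // one past the last slot of each cell, -1 if empty
};

// Sort boid ids by cell and record where each cell's run begins and ends.
SortedGrid buildGrid(const Grid& g, const std::vector<Vec3>& pos) {
    assert(g.resolution > 0);
    const int numCells = g.resolution * g.resolution * g.resolution;

    std::vector<int> cellOf(pos.size());
    for (std::size_t i = 0; i < pos.size(); ++i) cellOf[i] = cellIndex(g, pos[i]);

    SortedGrid s;
    s.boidIndex.resize(pos.size());
    std::iota(s.boidIndex.begin(), s.boidIndex.end(), 0);
    std::stable_sort(s.boidIndex.begin(), s.boidIndex.end(),
                     [&cellOf](int a, int b) { return cellOf[a] < cellOf[b]; });

    s.cellStart.assign(numCells, -1);
    s.cellEnd.assign(numCells, -1);
    for (std::size_t slot = 0; slot < s.boidIndex.size(); ++slot) {
        const int c = cellOf[s.boidIndex[slot]];
        if (slot == 0 || c != cellOf[s.boidIndex[slot - 1]]) {
            s.cellStart[c] = static_cast<int>(slot);
        }
        s.cellEnd[c] = static_cast<int>(slot) + 1;
    }
    return s;
}

// Scattered access reads pos[s.boidIndex[slot]]: one extra indirection per
// neighbor, landing anywhere in memory. The coherent variant pays for one
// reshuffle so that slot k holds the data of boid boidIndex[k] directly.
std::vector<Vec3> makeCoherent(const SortedGrid& s, const std::vector<Vec3>& pos) {
    std::vector<Vec3> out(pos.size());
    for (std::size_t k = 0; k < pos.size(); ++k) out[k] = pos[s.boidIndex[k]];
    return out;
}
```

With the scattered layout, a neighbor search over a cell walks `boidIndex[cellStart[c] .. cellEnd[c])` and then dereferences each id into the original position array. With the coherent layout, the same slot range indexes the reshuffled array directly, so boids in the same cell sit next to each other in memory.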
