Project 6: Gabriel Naghi #15

Open
wants to merge 17 commits into
base: master
from
17 changes: 2 additions & 15 deletions README.md
@@ -1,16 +1,3 @@
Vulkan Flocking: compute and shading in one pipeline!
======================
Vulkan Flocking: compute and shading in one pipeline!
======================

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 6**

* Gabriel Naghi
* Windows 7, Xeon E5-1630 @ 3.70GHz 32GB, GeForce GTX 1070 4095MB (SIG Lab MOR103-56)

![](img/boids.gif)

## Vulkan Boids

Vulkan is the Khronos Group's new explicit API powering the latest generation of games. One of the interesting new features of the Vulkan API is that it provides an interface for compute work as well as graphics. Simultaneous graphics and compute workloads are becoming increasingly important; it is believed that this is one of the main advantages held by AMD GPUs over those of NVIDIA. In this project, I demonstrate the power of side-by-side graphics and compute with a very simple Boid Flocking re-implementation. This is the same algorithm as the one implemented in [Project 1](https://github.com/gabenaghi/Project1-CUDA-Flocking), except in 2D.

## Food for Thought

#### Why do you think Vulkan expects explicit descriptors for things like generating pipelines and commands?

Command buffers often live in pre-allocated GPU memory, which is allocated in command pools. These command pools can be highly optimized, depending on the type of commands that will live in them. Vulkan thus expects explicit descriptors, so that the GPU can heavily optimize the memory the commands will live in.

#### Describe a situation besides flip-flop buffers in which you may need multiple descriptor sets to fit one descriptor layout.

Another scenario where you might need multiple descriptor sets with a single layout is when you have several data pools. For example, an application might hold several data pools and, depending on the execution state, draw from any one of them. In this case, it can define one descriptor set per pool and bind whichever set points to the data it needs.
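
The "several data pools" scenario can be sketched without any Vulkan calls. The types and functions below (`Layout`, `DescriptorSet`, `makeSets`, `selectSet`) are hypothetical stand-ins invented for illustration, not Vulkan API. In a real application each set would presumably be a `VkDescriptorSet` allocated against the same `VkDescriptorSetLayout`, and the choice would happen when binding with `vkCmdBindDescriptorSets`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a descriptor set layout: it only fixes the
// shape of the bindings (here: one storage buffer at binding 0).
struct Layout {
    int storageBufferBinding = 0;
};

// Hypothetical stand-in for a descriptor set: same layout, different data.
struct DescriptorSet {
    const Layout* layout;    // every set points at the one shared layout
    std::string   poolName;  // which data pool binding 0 refers to
};

// Build one set per data pool; all sets share the single layout.
std::vector<DescriptorSet> makeSets(const Layout& layout,
                                    const std::vector<std::string>& pools) {
    std::vector<DescriptorSet> sets;
    for (const auto& name : pools) {
        sets.push_back({&layout, name});
    }
    return sets;
}

// Pick the set for the current execution state. Illustrative only: the
// "state" is just an index here, and `sets` must not be empty.
const DescriptorSet& selectSet(const std::vector<DescriptorSet>& sets,
                               std::size_t state) {
    return sets[state % sets.size()];
}
```

Because the layout is shared, the shader and pipeline layout stay unchanged; only the bound set differs from one dispatch or draw to the next.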
#### What are some problems to keep in mind when using multiple Vulkan queues?

When using multiple Vulkan queues, there are a couple of problems to keep in mind. For one, since the queues are backed by physical GPU hardware, it is worth thinking about where precisely the queue lives versus where it is used. For example, if a queue is stored on one GPU and accessed from a second, networked GPU, performance might suffer. Additionally, one must beware of race conditions with the explicit threading capabilities provided by Vulkan. In particular, since it is possible to submit a single buffer for work in multiple queues, undefined outcomes will result if the buffers are not protected by some mutual exclusion mechanism.

#### What is one advantage of using compute commands that can share data with a rendering pipeline?

One huge advantage of compute commands sharing data with rendering pipelines is locality. Assuming the compiler and/or scheduler is intelligent enough, data should nearly always be local to the program operating on it, accelerating memory access and mitigating bandwidth bottlenecks.

## Bloopers

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 6**

* (TODO) YOUR NAME HERE
Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)

### (TODO: Your README)

Include screenshots, analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)

### Credits

* [Vulkan examples and demos](https://github.com/SaschaWillems/Vulkan) by [@SaschaWillems](https://github.com/SaschaWillems)
It seems that I either did not learn my lesson or otherwise neglected to correct some aspect of my flocking algorithm from Project 1. I again created black-hole boids that sucked in every other boid within some event horizon. Watch the video `black-hole-boids.flv` in the `img/` directory to see what that looked like.

### Credits

* [Vulkan examples and demos](https://github.com/SaschaWillems/Vulkan) by [@SaschaWillems](https://github.com/SaschaWillems)
115 changes: 115 additions & 0 deletions data/shaders/computeparticles/flockingParticle.comp
@@ -0,0 +1,115 @@
#version 450

#extension GL_ARB_separate_shader_objects : enable
#extension GL_ARB_shading_language_420pack : enable

struct Particle
{
    vec2 pos;
    vec2 vel;
};

// LOOK: These bindings correspond to the DescriptorSetLayouts and
// the DescriptorSets from prepareCompute()!

// Binding 0 : Particle storage buffer (read)
layout(std140, binding = 0) buffer ParticlesA
{
    Particle particlesA[ ];
};

// Binding 1 : Particle storage buffer (write)
layout(std140, binding = 1) buffer ParticlesB
{
    Particle particlesB[ ];
};

layout (local_size_x = 16, local_size_y = 16) in;

// LOOK: rule weights and distances, as well as particle count, based off uniforms.
// The deltaT here has to be updated every frame to account for changes in
// frame rate.
layout (binding = 2) uniform UBO
{
    float deltaT;
    float rule1Distance;
    float rule2Distance;
    float rule3Distance;
    float rule1Scale;
    float rule2Scale;
    float rule3Scale;
    int particleCount;
} ubo;

void main()
{
    // Current SSBO index
    uint index = gl_GlobalInvocationID.x;
    // Don't try to write beyond particle count
    if (index >= ubo.particleCount)
        return;

    // Read position and velocity
    vec2 vPos = particlesA[index].pos.xy;
    vec2 vVel = particlesA[index].vel.xy;

    vec2 centerOfMass = vec2(0.0f, 0.0f); // rule 1
    vec2 keepAway = vec2(0.0f, 0.0f);     // rule 2
    vec2 neighborVels = vec2(0.0f, 0.0f); // rule 3

    float cnt1 = 0.0f;
    float cnt3 = 0.0f;

    for (int neighborIndex = 0; neighborIndex < ubo.particleCount; ++neighborIndex)
    {
        if (neighborIndex == index) continue;

        vec2 neighborPos = particlesA[neighborIndex].pos.xy;
        vec2 neighborVel = particlesA[neighborIndex].vel.xy;

        // Rule 1: boids fly towards their local perceived center of mass, which excludes themselves
        if (length(neighborPos - vPos) < ubo.rule1Distance)
        {
            centerOfMass = centerOfMass + neighborPos;
            cnt1 = cnt1 + 1.0f;
        }

        // Rule 2: boids try to stay a distance d away from each other
        if (length(neighborPos - vPos) < ubo.rule2Distance)
            keepAway = keepAway - (neighborPos - vPos);

        // Rule 3: boids try to match the speed of surrounding boids
        if (length(neighborPos - vPos) < ubo.rule3Distance)
        {
            neighborVels = neighborVels + neighborVel;
            cnt3 = cnt3 + 1.0f;
        }
    }

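    // Calculate averaged parameters.
    // Rule 1: only steer toward the perceived center when at least one
    // neighbor was found; otherwise the boid would be pulled toward the origin.
    if (cnt1 > 0.0f)
        centerOfMass = (centerOfMass / cnt1 - vPos) * ubo.rule1Scale;

    keepAway = keepAway * ubo.rule2Scale;

    // Rule 3: average the neighbor velocities (cnt3 was counted for this)
    // before applying the scale.
    if (cnt3 > 0.0f)
        neighborVels = (neighborVels / cnt3) * ubo.rule3Scale;

    vVel = vVel + centerOfMass + keepAway + neighborVels;

    // Clamp speed for a more pleasing simulation. Rescaling only when the
    // speed exceeds the limit avoids calling normalize() on a zero-length vector.
    float speed = length(vVel);
    if (speed > 0.1)
        vVel = vVel * (0.1 / speed);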

    // Kinematic update
    vPos += vVel * ubo.deltaT;

    // Wrap around boundary
    if (vPos.x < -1.0) vPos.x = 1.0;
    if (vPos.x > 1.0) vPos.x = -1.0;
    if (vPos.y < -1.0) vPos.y = 1.0;
    if (vPos.y > 1.0) vPos.y = -1.0;

    // Write back
    particlesB[index].pos.xy = vPos;
    particlesB[index].vel.xy = vVel;
}
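
For checking the shader's behavior off the GPU, the per-boid velocity update can be mirrored on the CPU. The following is a minimal sketch, not the project's code: `Vec2`, `Params`, and `updatedVelocity` are hypothetical names, and `maxSpeed` stands in for the hard-coded `0.1` speed limit. It also makes two assumptions worth stating loudly: the cohesion term is applied only when a neighbor exists, and the rule 3 velocities are averaged rather than summed.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for GLSL vec2 (assumption: plain pair of floats).
struct Vec2 { float x, y; };
inline Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }
inline Vec2 operator-(Vec2 a, Vec2 b) { return {a.x - b.x, a.y - b.y}; }
inline Vec2 operator*(Vec2 a, float s) { return {a.x * s, a.y * s}; }
inline float length(Vec2 a) { return std::sqrt(a.x * a.x + a.y * a.y); }

struct Particle { Vec2 pos, vel; };

// Hypothetical parameter bundle mirroring the shader's UBO fields.
struct Params {
    float rule1Distance, rule2Distance, rule3Distance;
    float rule1Scale, rule2Scale, rule3Scale;
    float maxSpeed;  // stand-in for the shader's hard-coded 0.1
};

// Returns the new velocity of boid `self`, reading only from `in`
// (the "read" buffer), as the compute shader does.
Vec2 updatedVelocity(const std::vector<Particle>& in, std::size_t self,
                     const Params& p) {
    const Vec2 pos = in[self].pos;
    Vec2 center{0, 0}, keepAway{0, 0}, neighborVels{0, 0};
    float cnt1 = 0, cnt3 = 0;

    for (std::size_t i = 0; i < in.size(); ++i) {
        if (i == self) continue;
        const Vec2 offset = in[i].pos - pos;
        const float dist = length(offset);
        if (dist < p.rule1Distance) { center = center + in[i].pos; cnt1 += 1; }
        if (dist < p.rule2Distance) { keepAway = keepAway - offset; }
        if (dist < p.rule3Distance) { neighborVels = neighborVels + in[i].vel; cnt3 += 1; }
    }

    Vec2 vel = in[self].vel;
    // Rule 1 (cohesion): skipped entirely when there are no neighbors.
    if (cnt1 > 0) vel = vel + (center * (1.0f / cnt1) - pos) * p.rule1Scale;
    // Rule 2 (separation).
    vel = vel + keepAway * p.rule2Scale;
    // Rule 3 (alignment): averaged over the neighbors that were counted.
    if (cnt3 > 0) vel = vel + neighborVels * (1.0f / cnt3) * p.rule3Scale;

    // Limit the speed without normalizing a possibly zero-length vector.
    const float speed = length(vel);
    if (speed > p.maxSpeed) vel = vel * (p.maxSpeed / speed);
    return vel;
}
```

A boid with no neighbors keeps its velocity under this sketch, which is a quick way to test for the "pulled toward the origin" behavior described in the Bloopers section.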
3 changes: 1 addition & 2 deletions data/shaders/computeparticles/generate-spirv.bat
@@ -1,5 +1,4 @@
glslangvalidator -V particle.frag -o particle.frag.spv
glslangvalidator -V particle.vert -o particle.vert.spv
glslangvalidator -V particle.comp -o particle.comp.spv


glslangvalidator -V flockingParticle.comp -o flockingParticle.comp.spv
Binary file added img/2016-11-10-2018-28.flv
Binary file not shown.
Binary file added img/black-hole-boids.flv
Binary file not shown.
Binary file added img/boids.gif
39 changes: 30 additions & 9 deletions vulkanBoids/vulkanBoids.cpp
@@ -151,13 +151,14 @@ class VulkanExample : public VulkanExampleBase

std::mt19937 rGenerator;
std::uniform_real_distribution<float> rDistribution(-1.0f, 1.0f);
float vel_scalar = 0.1f;

// Initial particle positions
std::vector<Particle> particleBuffer(PARTICLE_COUNT);
for (auto& particle : particleBuffer)
{
    particle.pos = glm::vec2(rDistribution(rGenerator), rDistribution(rGenerator));
    // TODO: add randomized velocities with a slight scale here, something like 0.1f.
    particle.vel = vel_scalar * glm::vec2(rDistribution(rGenerator), rDistribution(rGenerator));
}

VkDeviceSize storageBufferSize = particleBuffer.size() * sizeof(Particle);
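
The initialization loop above draws positions from a uniform distribution over -1 to 1 and velocities scaled down by `vel_scalar`. Here is a small sketch of the same idea with an explicit seed. The seed parameter is an assumption added for illustration: the source default-constructs `std::mt19937`, which produces the same sequence on every run. `Vec2` stands in for `glm::vec2`, and `makeParticles` is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Stand-in for glm::vec2 (assumption: two floats).
struct Vec2 { float x, y; };
struct Particle { Vec2 pos, vel; };

// Hypothetical helper: builds `count` particles with random positions in
// the [-1, 1] square and random velocities scaled by `velScale`.
std::vector<Particle> makeParticles(std::size_t count, std::uint32_t seed,
                                    float velScale) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);

    std::vector<Particle> particles(count);
    for (auto& p : particles) {
        // Braced initializers evaluate left to right, so x is drawn before y.
        p.pos = {dist(gen), dist(gen)};
        p.vel = {velScale * dist(gen), velScale * dist(gen)};
    }
    return particles;
}
```

Passing a fixed seed keeps runs reproducible for debugging; a value from `std::random_device` could be passed instead when varied starting states are wanted.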
@@ -244,7 +245,7 @@ class VulkanExample : public VulkanExampleBase
VERTEX_BUFFER_BIND_ID,
1,
VK_FORMAT_R32G32_SFLOAT,
offsetof(Particle, pos)); // TODO: change this so that we can color the particles based on velocity.
offsetof(Particle, vel)); //color the particles based on velocity.

// vertices.inputState encapsulates everything we need for these particular buffers to
// interface with the graphics pipeline.
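
The attribute change above swaps `offsetof(Particle, pos)` for `offsetof(Particle, vel)`, so the second vertex attribute reads the velocity bytes instead of the position bytes. A minimal sketch of what those offsets work out to, assuming `Particle` holds two tightly packed 2-float vectors (`Vec2` is a stand-in for `glm::vec2`, and the constant names are hypothetical):

```cpp
#include <cstddef>

// Stand-in for glm::vec2 (assumption: two tightly packed floats).
struct Vec2 { float x, y; };

struct Particle {
    Vec2 pos;  // bytes 0..7
    Vec2 vel;  // bytes 8..15
};

// Byte offsets that a vertex attribute description would receive.
constexpr std::size_t kPosOffset = offsetof(Particle, pos);
constexpr std::size_t kVelOffset = offsetof(Particle, vel);

static_assert(kPosOffset == 0, "pos starts the struct");
static_assert(kVelOffset == 2 * sizeof(float), "vel follows the two pos floats");
static_assert(sizeof(Particle) == 4 * sizeof(float), "no padding expected");
```

With this layout, one 16-byte `Particle` supplies both attributes, and `VK_FORMAT_R32G32_SFLOAT` matches the two floats read at either offset.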
@@ -464,7 +465,7 @@ class VulkanExample : public VulkanExampleBase

// Create pipeline on the GPU - load shader, attach the pipeline layout we just made.
VkComputePipelineCreateInfo computePipelineCreateInfo = vkTools::initializers::computePipelineCreateInfo(compute.pipelineLayout, 0);
computePipelineCreateInfo.stage = loadShader(getAssetPath() + "shaders/computeparticles/particle.comp.spv", VK_SHADER_STAGE_COMPUTE_BIT);
computePipelineCreateInfo.stage = loadShader(getAssetPath() + "shaders/computeparticles/flockingParticle.comp.spv", VK_SHADER_STAGE_COMPUTE_BIT);
VK_CHECK_RESULT(vkCreateComputePipelines(device, pipelineCache, 1, &computePipelineCreateInfo, nullptr, &compute.pipeline));

//////// Create command pool and command buffer for compute commands ////////
@@ -515,9 +516,8 @@ class VulkanExample : public VulkanExampleBase

std::vector<VkWriteDescriptorSet> computeWriteDescriptorSets =
{
// LOOK
// WriteDescriptorSet writes each of these descriptors into the specified descriptorSet.
// THese first few are written into compute.descriptorSet[0].
// These first few are written into compute.descriptorSet[0].
// Each of these corresponds to a layout binding in the descriptor set layout,
// which in turn corresponds with something like `layout(std140, binding = 0)` in `particle.comp`.

@@ -540,13 +540,32 @@ class VulkanExample : public VulkanExampleBase
compute.descriptorSets[0],
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
2,
&compute.uniformBuffer.descriptor)
&compute.uniformBuffer.descriptor),

// TODO: write the second descriptorSet, using the top for reference.
// We want the descriptorSets to be used for flip-flopping:
// on one frame, we use one descriptorSet with the compute pass,
// on the next frame, we use the other.
// What has to be different about how the second descriptorSet is written here?

// Binding 0 : Particle storage buffer (read). The second set reads buffer B.
vkTools::initializers::writeDescriptorSet(
    compute.descriptorSets[1],           // write into the second descriptor set
    VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
    0,                                   // binding 0 in the descriptor set layout
    &compute.storageBufferB.descriptor), // SSBO B is the input on flipped frames

// Binding 1 : Particle storage buffer (write). The second set writes buffer A.
vkTools::initializers::writeDescriptorSet(
    compute.descriptorSets[1],
    VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
    1,
    &compute.storageBufferA.descriptor),

// Binding 2 : Uniform buffer (shared by both descriptor sets)
vkTools::initializers::writeDescriptorSet(
    compute.descriptorSets[1],
    VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
    2,
    &compute.uniformBuffer.descriptor)
};

vkUpdateDescriptorSets(device, static_cast<uint32_t>(computeWriteDescriptorSets.size()), computeWriteDescriptorSets.data(), 0, NULL);
@@ -583,13 +602,15 @@ class VulkanExample : public VulkanExampleBase
// are done executing.
VK_CHECK_RESULT(vkQueueSubmit(compute.queue, 1, &computeSubmitInfo, compute.fence));

// TODO: handle flip-flop logic. We want the next iteration to
// Handle flip-flop logic. We want the next iteration to
// run the compute pipeline with flipped SSBOs, so we have to
// swap the descriptorSets, which each allow access to the SSBOs
// in one configuration.
// We also want to flip what SSBO we draw with in the next
// pass through the graphics pipeline.
// Feel free to use std::swap here. You should need it twice.
std::swap(compute.descriptorSets[0], compute.descriptorSets[1]);
std::swap(compute.storageBufferA, compute.storageBufferB);
}
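
The flip-flop handled by the two `std::swap` calls above can be modeled without Vulkan. This is only a sketch under stated assumptions: `DescriptorSet` and `FlipFlop` are hypothetical stand-ins, the "compute pass" is a placeholder that adds one to each element, and integer vectors take the place of the two SSBOs.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a descriptor set: it records which buffer the
// compute pass reads (binding 0) and which one it writes (binding 1).
struct DescriptorSet { int readBuffer; int writeBuffer; };

struct FlipFlop {
    std::array<std::vector<int>, 2> buffers;  // stand-ins for the two SSBOs
    // Set 0 reads A and writes B; set 1 is the mirrored configuration.
    std::array<DescriptorSet, 2> sets = {{ {0, 1}, {1, 0} }};

    explicit FlipFlop(const std::vector<int>& initial) {
        buffers[0] = initial;
        buffers[1].assign(initial.size(), 0);
    }

    // One "compute pass": always use sets[0], then flip the two sets so the
    // next pass reads what this pass wrote.
    void step() {
        const std::vector<int>& src = buffers[sets[0].readBuffer];
        std::vector<int>& dst = buffers[sets[0].writeBuffer];
        for (std::size_t i = 0; i < src.size(); ++i) {
            dst[i] = src[i] + 1;  // placeholder for the simulation update
        }
        std::swap(sets[0], sets[1]);
    }

    // Buffer holding the most recent results: after the swap, that is the
    // buffer the next pass will read.
    const std::vector<int>& latest() const {
        return buffers[sets[0].readBuffer];
    }
};
```

The calling code never needs to know which physical buffer is current, since it always binds `sets[0]` and draws from `latest()`.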

// Record command buffers for drawing using the graphics pipeline