Cluster light probes using conservative spherical bounds.
This commit allows the Bevy renderer to use the clustering
infrastructure for light probes (reflection probes and irradiance
volumes) on platforms where at least 3 storage buffers are available. On
such platforms (the vast majority), we stop performing brute-force
searches of light probes for each fragment and instead only search the
light probes with bounding spheres that intersect the current cluster.
This should dramatically improve scalability of irradiance volumes and
reflection probes.
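
As a rough illustration of the culling idea described above, the following Rust sketch tests probe bounding spheres against a cluster's bounds. It is not the code from `assign.rs` (that diff is not rendered on this page): the `ClusterAabb` and `BoundingSphere` types and both function names are hypothetical, and the sketch assumes each cluster is approximated by an axis-aligned box in view space.

```rust
/// Axis-aligned bounds of one cluster in view space (hypothetical type).
#[derive(Clone, Copy, Debug)]
struct ClusterAabb {
    min: [f32; 3],
    max: [f32; 3],
}

/// Conservative bounding sphere of a light probe (hypothetical type).
#[derive(Clone, Copy, Debug)]
struct BoundingSphere {
    center: [f32; 3],
    radius: f32,
}

/// Returns true if the sphere touches or overlaps the box.
fn sphere_intersects_aabb(sphere: &BoundingSphere, aabb: &ClusterAabb) -> bool {
    let mut dist_sq = 0.0f32;
    for axis in 0..3 {
        let c = sphere.center[axis];
        // Closest point on the box to the sphere center, along this axis.
        let closest = c.clamp(aabb.min[axis], aabb.max[axis]);
        let d = c - closest;
        dist_sq += d * d;
    }
    dist_sq <= sphere.radius * sphere.radius
}

/// Collects the indices of the probes a fragment in this cluster must search.
fn probes_for_cluster(probes: &[BoundingSphere], cluster: &ClusterAabb) -> Vec<usize> {
    probes
        .iter()
        .enumerate()
        .filter(|(_, sphere)| sphere_intersects_aabb(sphere, cluster))
        .map(|(index, _)| index)
        .collect()
}
```

The test is conservative in the sense that it can keep a probe that doesn't actually cover any fragment in the cluster, but it never drops one that does, so the per-fragment search stays correct while usually visiting fewer probes.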

The primary platform that doesn't support 3 storage buffers is WebGL 2,
and we continue using a brute-force search of light probes on that
platform, as the UBO that stores per-cluster indices is too small to fit
the light probe counts. Note, however, that this platform also doesn't
support bindless textures (indeed, it would be very odd for a platform
to support bindless textures but not SSBOs), so we only support one of
each type of light probe per draw call there in the first place. The
brute-force search therefore isn't a performance problem, as it only has
one light probe of each type to consider. (In fact, clustering would
probably end up being a performance loss.)

Known potential improvements include:

1. We currently cull based on a conservative bounding sphere test and
   not based on the oriented bounding box (OBB) of the light probe. An
   OBB test would cull more tightly, but in the interest of simplicity,
   I opted to keep the bounding sphere test for now. The OBB improvement
   can be a follow-up.

2. This patch doesn't change the fact that each fragment only takes a
   single light probe into account. Typical light probe implementations
   detect the case in which multiple light probes cover the current
   fragment and perform some sort of weighted blend between them. As the
   light probe fetch function presently returns only a single light
   probe, implementing that feature would require more code
   restructuring, so I left it out for now. It can be added as a
   follow-up.

3. Light probe implementations typically have a falloff range. Although
   this is a desired feature in Bevy, it's out of scope for this commit,
   which doesn't implement it.

4. This commit doesn't raise the maximum number of light probes past its
   current value of 8 for each type. This should be addressed later, but
   would possibly require more bindings on platforms with storage
   buffers, which would increase this patch's complexity. Even without
   raising the limit, this patch should constitute a significant
   performance improvement for scenes that get anywhere close to this
   limit. In the interest of keeping this patch small, I opted to leave
   raising the limit to a follow-up.
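
To make item 1 more concrete, here is a minimal Rust sketch of how a conservative bounding sphere could be derived from a probe's OBB. The `Obb` type, its `half_axes` representation, and the function name are assumptions made for illustration; the actual derivation in this commit lives in `assign.rs`, which is not shown here.

```rust
/// An oriented bounding box given by its center and three half-axis vectors
/// (hypothetical representation, chosen for illustration only).
struct Obb {
    center: [f32; 3],
    half_axes: [[f32; 3]; 3],
}

/// Returns (center, radius) of the smallest sphere centered on the box
/// center that contains all eight corners of the box.
fn conservative_bounding_sphere(obb: &Obb) -> ([f32; 3], f32) {
    let mut max_len_sq = 0.0f32;
    // Each bit of `corner` picks the sign of one half axis.
    for corner in 0..8u32 {
        let mut offset = [0.0f32; 3];
        for (i, axis) in obb.half_axes.iter().enumerate() {
            let sign = if corner & (1 << i) != 0 { 1.0 } else { -1.0 };
            for k in 0..3 {
                offset[k] += sign * axis[k];
            }
        }
        let len_sq = offset.iter().map(|v| v * v).sum::<f32>();
        max_len_sq = max_len_sq.max(len_sq);
    }
    (obb.center, max_len_sq.sqrt())
}
```

For a long, thin box the resulting sphere covers far more volume than the box itself, which is why an OBB test would assign the probe to fewer clusters.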
pcwalton committed Jun 8, 2024
1 parent d45bcfd commit d623ea3
Showing 11 changed files with 527 additions and 161 deletions.
16 changes: 15 additions & 1 deletion assets/shaders/irradiance_volume_voxel_visualization.wgsl
@@ -1,6 +1,7 @@
#import bevy_pbr::forward_io::VertexOutput
#import bevy_pbr::irradiance_volume
#import bevy_pbr::mesh_view_bindings
#import bevy_pbr::clustered_forward

struct VoxelVisualizationIrradianceVolumeInfo {
world_from_voxel: mat4x4<f32>,
@@ -25,11 +26,24 @@ fn fragment(mesh: VertexOutput) -> @location(0) vec4<f32> {
let stp_rounded = round(stp - 0.5f) + 0.5f;
let rounded_world_pos = (irradiance_volume_info.world_from_voxel * vec4(stp_rounded, 1.0f)).xyz;

// Look up the irradiance volume range in the cluster list.
let view_z = dot(vec4<f32>(
mesh_view_bindings::view.view_from_world[0].z,
mesh_view_bindings::view.view_from_world[1].z,
mesh_view_bindings::view.view_from_world[2].z,
mesh_view_bindings::view.view_from_world[3].z
), mesh.world_position);
let cluster_index = clustered_forward::fragment_cluster_index(mesh.position.xy, view_z, false);
var clusterable_object_index_ranges =
clustered_forward::unpack_clusterable_object_index_ranges(cluster_index);

// `irradiance_volume_light()` multiplies by intensity, so cancel it out.
// If we take intensity into account, the cubes will be way too bright.
let rgb = irradiance_volume::irradiance_volume_light(
mesh.world_position.xyz,
mesh.world_normal) / irradiance_volume_info.intensity;
mesh.world_normal,
&clusterable_object_index_ranges,
) / irradiance_volume_info.intensity;

return vec4<f32>(rgb, 1.0f);
}
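
The `view_z` computation in the shader above picks the `z` component out of each column of `view_from_world` and dots the result with the world position. A small Rust sketch, assuming the column-major `m[column][row]` layout that WGSL matrices use, shows that this equals the `z` component of the full matrix-vector product. The `Mat4` alias and both function names are hypothetical.

```rust
/// Column-major 4x4 matrix: `m[column][row]`, matching WGSL's `mat4x4<f32>`.
type Mat4 = [[f32; 4]; 4];

/// View-space depth only: the third row of the matrix dotted with the position.
fn view_z(view_from_world: &Mat4, world_position: [f32; 4]) -> f32 {
    (0..4)
        .map(|col| view_from_world[col][2] * world_position[col])
        .sum()
}

/// Full matrix-vector product, for comparison.
fn transform(m: &Mat4, v: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for col in 0..4 {
        for row in 0..4 {
            out[row] += m[col][row] * v[col];
        }
    }
    out
}
```

Computing only the one row avoids a full matrix multiply when the cluster lookup needs nothing but the view-space depth.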
314 changes: 226 additions & 88 deletions crates/bevy_pbr/src/cluster/assign.rs

Large diffs are not rendered by default.

95 changes: 60 additions & 35 deletions crates/bevy_pbr/src/cluster/mod.rs
@@ -2,6 +2,7 @@
use std::num::NonZeroU64;

use assign::ClusterableObjectType;
use bevy_ecs::{
component::Component,
entity::{Entity, EntityHashMap},
@@ -10,7 +11,7 @@ use bevy_ecs::{
system::{Commands, Query, Res, Resource},
world::{FromWorld, World},
};
use bevy_math::{AspectRatio, UVec2, UVec3, UVec4, Vec3Swizzles as _, Vec4};
use bevy_math::{uvec4, AspectRatio, UVec2, UVec3, UVec4, Vec3Swizzles as _, Vec4};
use bevy_reflect::{std_traits::ReflectDefault, Reflect};
use bevy_render::{
camera::Camera,
@@ -26,7 +27,7 @@ use bevy_utils::{hashbrown::HashSet, tracing::warn};
pub(crate) use crate::cluster::assign::assign_objects_to_clusters;
use crate::MeshPipeline;

mod assign;
pub(crate) mod assign;

#[cfg(test)]
mod test;
@@ -125,8 +126,7 @@ pub struct Clusters {
#[derive(Clone, Component, Debug, Default)]
pub struct VisibleClusterableObjects {
pub(crate) entities: Vec<Entity>,
pub point_light_count: usize,
pub spot_light_count: usize,
counts: ClusterableObjectCounts,
}

#[derive(Resource, Default)]
@@ -178,8 +178,24 @@ pub struct ExtractedClusterConfig {
pub(crate) dimensions: UVec3,
}

/// Stores the number of each type of clusterable object in a single cluster.
///
/// Note that `reflection_probes` and `irradiance_volumes` won't be clustered if
/// fewer than 3 SSBOs are available, which usually means on WebGL 2.
#[derive(Clone, Copy, Default, Debug)]
struct ClusterableObjectCounts {
/// The number of point lights in the cluster.
point_lights: u32,
/// The number of spot lights in the cluster.
spot_lights: u32,
/// The number of reflection probes in the cluster.
reflection_probes: u32,
/// The number of irradiance volumes in the cluster.
irradiance_volumes: u32,
}

enum ExtractedClusterableObjectElement {
ClusterHeader(u32, u32),
ClusterHeader(ClusterableObjectCounts),
ClusterableObjectEntity(Entity),
}

@@ -201,8 +217,11 @@ struct GpuClusterableObjectIndexListsStorage {

#[derive(ShaderType, Default)]
struct GpuClusterOffsetsAndCountsStorage {
/// The starting offset, followed by the number of point lights, spot
/// lights, reflection probes, and irradiance volumes in each cluster, in
/// that order. The remaining fields are filled with zeroes.
#[size(runtime)]
data: Vec<UVec4>,
data: Vec<[UVec4; 2]>,
}

enum ViewClusterBuffers {
@@ -488,8 +507,8 @@ impl Default for GpuClusterableObjectsUniform {
#[allow(clippy::too_many_arguments)]
// Sort clusterable objects by:
//
// * point-light vs spot-light, so that we can iterate point lights and spot
// lights in contiguous blocks in the fragment shader,
// * object type, so that we can iterate point lights, spot lights, etc. in
// contiguous blocks in the fragment shader,
//
// * then those with shadows enabled first, so that the index can be used to
// render at most `point_light_shadow_maps_count` point light shadows and
@@ -499,12 +518,12 @@ impl Default for GpuClusterableObjectsUniform {
// clusterable objects are chosen if the clusterable object count limit is
// exceeded.
pub(crate) fn clusterable_object_order(
(entity_1, shadows_enabled_1, is_spot_light_1): (&Entity, &bool, &bool),
(entity_2, shadows_enabled_2, is_spot_light_2): (&Entity, &bool, &bool),
(entity_1, object_type_1): (&Entity, &ClusterableObjectType),
(entity_2, object_type_2): (&Entity, &ClusterableObjectType),
) -> std::cmp::Ordering {
is_spot_light_1
.cmp(is_spot_light_2) // pointlights before spot lights
.then_with(|| shadows_enabled_2.cmp(shadows_enabled_1)) // shadow casters before non-casters
object_type_1
.ordering()
.cmp(&object_type_2.ordering()) // object type and shadow status
.then_with(|| entity_1.cmp(entity_2)) // stable
}
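
The new comparator relies on `ClusterableObjectType::ordering()`, which is defined in `assign.rs` and not shown on this page. The sketch below is one plausible shape for it, not the actual definition: the `ObjectType` enum, the `(u8, bool)` key, and the use of plain `u64` IDs in place of `Entity` are all assumptions for illustration.

```rust
use std::cmp::Ordering;

/// Stand-in for `ClusterableObjectType` (hypothetical variants and fields).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ObjectType {
    PointLight { shadows_enabled: bool },
    SpotLight { shadows_enabled: bool },
    ReflectionProbe,
    IrradianceVolume,
}

impl ObjectType {
    /// Sort key: object type first, then shadow casters before non-casters.
    fn ordering(&self) -> (u8, bool) {
        match *self {
            ObjectType::PointLight { shadows_enabled } => (0, !shadows_enabled),
            ObjectType::SpotLight { shadows_enabled } => (1, !shadows_enabled),
            ObjectType::ReflectionProbe => (2, false),
            ObjectType::IrradianceVolume => (3, false),
        }
    }
}

/// Same structure as `clusterable_object_order`, with `u64` IDs standing in
/// for entities.
fn object_order(a: (&u64, &ObjectType), b: (&u64, &ObjectType)) -> Ordering {
    a.1.ordering()
        .cmp(&b.1.ordering()) // object type and shadow status
        .then_with(|| a.0.cmp(b.0)) // stable tie-break on the ID
}

/// Sorts objects so that each type forms one contiguous block.
fn sort_objects(mut objects: Vec<(u64, ObjectType)>) -> Vec<(u64, ObjectType)> {
    objects.sort_by(|a, b| object_order((&a.0, &a.1), (&b.0, &b.1)));
    objects
}
```

Folding the shadow flag into the key keeps the earlier behavior (shadow casters first within a light type) while letting new object types slot in by adding a variant rather than another comparator argument.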

@@ -526,8 +545,7 @@ pub fn extract_clusters(
let mut data = Vec::with_capacity(clusters.clusterable_objects.len() + num_entities);
for cluster_objects in &clusters.clusterable_objects {
data.push(ExtractedClusterableObjectElement::ClusterHeader(
cluster_objects.point_light_count as u32,
cluster_objects.spot_light_count as u32,
cluster_objects.counts,
));
for clusterable_entity in &cluster_objects.entities {
data.push(ExtractedClusterableObjectElement::ClusterableObjectEntity(
@@ -567,16 +585,9 @@ pub fn prepare_clusters(

for record in &extracted_clusters.data {
match record {
ExtractedClusterableObjectElement::ClusterHeader(
point_light_count,
spot_light_count,
) => {
ExtractedClusterableObjectElement::ClusterHeader(counts) => {
let offset = view_clusters_bindings.n_indices();
view_clusters_bindings.push_offset_and_counts(
offset,
*point_light_count as usize,
*spot_light_count as usize,
);
view_clusters_bindings.push_offset_and_counts(offset, counts);
}
ExtractedClusterableObjectElement::ClusterableObjectEntity(entity) => {
if let Some(clusterable_object_index) =
@@ -637,7 +648,7 @@ impl ViewClusterBindings {
}
}

pub fn push_offset_and_counts(&mut self, offset: usize, point_count: usize, spot_count: usize) {
fn push_offset_and_counts(&mut self, offset: usize, counts: &ClusterableObjectCounts) {
match &mut self.buffers {
ViewClusterBuffers::Uniform {
cluster_offsets_and_counts,
@@ -649,20 +660,24 @@ impl ViewClusterBindings {
return;
}
let component = self.n_offsets & ((1 << 2) - 1);
let packed = pack_offset_and_counts(offset, point_count, spot_count);
let packed =
pack_offset_and_counts(offset, counts.point_lights, counts.spot_lights);

cluster_offsets_and_counts.get_mut().data[array_index][component] = packed;
}
ViewClusterBuffers::Storage {
cluster_offsets_and_counts,
..
} => {
cluster_offsets_and_counts.get_mut().data.push(UVec4::new(
offset as u32,
point_count as u32,
spot_count as u32,
0,
));
cluster_offsets_and_counts.get_mut().data.push([
uvec4(
offset as u32,
counts.point_lights,
counts.spot_lights,
counts.reflection_probes,
),
uvec4(counts.irradiance_volumes, 0, 0, 0),
]);
}
}
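
On the storage-buffer path, each cluster now stores a starting offset followed by four counts, spread over two `UVec4`s. Because objects are sorted by type, a consumer can turn those counts into contiguous index ranges with a running sum. The Rust sketch below mirrors that idea with plain arrays; the `IndexRanges` struct and its field names are hypothetical stand-ins for whatever `unpack_clusterable_object_index_ranges` returns in the shader.

```rust
/// Per-cluster counts, mirroring `ClusterableObjectCounts`.
struct Counts {
    point_lights: u32,
    spot_lights: u32,
    reflection_probes: u32,
    irradiance_volumes: u32,
}

/// Start of each type's block in the index list, plus the end of the last
/// block (hypothetical struct and field names).
#[derive(Debug, PartialEq)]
struct IndexRanges {
    first_point_light: u32,
    first_spot_light: u32,
    first_reflection_probe: u32,
    first_irradiance_volume: u32,
    end: u32,
}

/// Same layout as the storage path above: two 4-component vectors per cluster.
fn pack_storage(offset: u32, counts: &Counts) -> [[u32; 4]; 2] {
    [
        [offset, counts.point_lights, counts.spot_lights, counts.reflection_probes],
        [counts.irradiance_volumes, 0, 0, 0],
    ]
}

/// Converts the offset and counts into half-open ranges via a running sum.
fn unpack_ranges(packed: &[[u32; 4]; 2]) -> IndexRanges {
    let first_point_light = packed[0][0];
    let first_spot_light = first_point_light + packed[0][1];
    let first_reflection_probe = first_spot_light + packed[0][2];
    let first_irradiance_volume = first_reflection_probe + packed[0][3];
    let end = first_irradiance_volume + packed[1][0];
    IndexRanges {
        first_point_light,
        first_spot_light,
        first_reflection_probe,
        first_irradiance_volume,
        end,
    }
}
```

With an offset of 10 and counts of 3, 2, 1, and 4, the reflection probes occupy indices 15..16 and the irradiance volumes 16..20, so a probe query only walks its own slice of the list.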

@@ -788,6 +803,12 @@ impl ViewClusterBuffers {
}
}

// Compresses the offset and counts of point and spot lights so that they fit in
// a UBO.
//
// This function is only used if storage buffers are unavailable on this
// platform: typically, on WebGL 2.
//
// NOTE: With uniform buffer max binding size as 16384 bytes
// that means we can fit 256 clusterable objects in one uniform
// buffer, which means the count can be at most 256 so it
@@ -800,12 +821,16 @@ impl ViewClusterBuffers {
// the point light count into bits 9-17, and the spot light count into bits 0-8.
// [ 31 .. 18 | 17 .. 9 | 8 .. 0 ]
// [ offset | point light count | spot light count ]
//
// NOTE: This assumes CPU and GPU endianness are the same which is true
// for all common and tested x86/ARM CPUs and AMD/NVIDIA/Intel/Apple/etc GPUs
fn pack_offset_and_counts(offset: usize, point_count: usize, spot_count: usize) -> u32 {
//
// NOTE: On platforms that use this function, we don't cluster light probes, so
// the number of light probes is irrelevant.
fn pack_offset_and_counts(offset: usize, point_count: u32, spot_count: u32) -> u32 {
((offset as u32 & CLUSTER_OFFSET_MASK) << (CLUSTER_COUNT_SIZE * 2))
| (point_count as u32 & CLUSTER_COUNT_MASK) << CLUSTER_COUNT_SIZE
| (spot_count as u32 & CLUSTER_COUNT_MASK)
| (point_count & CLUSTER_COUNT_MASK) << CLUSTER_COUNT_SIZE
| (spot_count & CLUSTER_COUNT_MASK)
}

#[derive(ShaderType)]
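
The bit layout described in the comment above can be checked with a small round trip. This is a standalone Rust sketch: the constant values are assumptions inferred from the documented layout (14 offset bits, then two 9-bit counts), and `unpack_offset_and_counts` is a hypothetical helper that mirrors what the shader would have to do on the UBO path.

```rust
// Assumed values, inferred from the layout comment:
// [ 31 .. 18 | 17 .. 9 | 8 .. 0 ] = [ offset | point count | spot count ]
const CLUSTER_COUNT_SIZE: u32 = 9;
const CLUSTER_OFFSET_MASK: u32 = (1 << (32 - CLUSTER_COUNT_SIZE * 2)) - 1;
const CLUSTER_COUNT_MASK: u32 = (1 << CLUSTER_COUNT_SIZE) - 1;

/// Packs the offset and the two light counts into a single `u32`.
fn pack_offset_and_counts(offset: usize, point_count: u32, spot_count: u32) -> u32 {
    ((offset as u32 & CLUSTER_OFFSET_MASK) << (CLUSTER_COUNT_SIZE * 2))
        | (point_count & CLUSTER_COUNT_MASK) << CLUSTER_COUNT_SIZE
        | (spot_count & CLUSTER_COUNT_MASK)
}

/// Hypothetical inverse: returns (offset, point_count, spot_count).
fn unpack_offset_and_counts(packed: u32) -> (u32, u32, u32) {
    (
        (packed >> (CLUSTER_COUNT_SIZE * 2)) & CLUSTER_OFFSET_MASK,
        (packed >> CLUSTER_COUNT_SIZE) & CLUSTER_COUNT_MASK,
        packed & CLUSTER_COUNT_MASK,
    )
}
```

All 32 bits are already spoken for by the offset and the two light counts, which is consistent with the commit message's point that the UBO path has no room for light probe counts. Values wider than their field are silently truncated by the masks.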
34 changes: 30 additions & 4 deletions crates/bevy_pbr/src/light_probe/environment_map.wgsl
@@ -6,6 +6,7 @@
#import bevy_pbr::lighting::{
F_Schlick_vec, LayerLightingInput, LightingInput, LAYER_BASE, LAYER_CLEARCOAT
}
#import bevy_pbr::clustered_forward::ClusterableObjectIndexRanges

struct EnvironmentMapLight {
diffuse: vec3<f32>,
@@ -25,6 +26,7 @@ struct EnvironmentMapRadiances {

fn compute_radiances(
input: ptr<function, LightingInput>,
clusterable_object_index_ranges: ptr<function, ClusterableObjectIndexRanges>,
layer: u32,
world_position: vec3<f32>,
found_diffuse_indirect: bool,
@@ -37,7 +39,11 @@ fn compute_radiances(
var radiances: EnvironmentMapRadiances;

// Search for a reflection probe that contains the fragment.
var query_result = query_light_probe(world_position, /*is_irradiance_volume=*/ false);
var query_result = query_light_probe(
world_position,
/*is_irradiance_volume=*/ false,
clusterable_object_index_ranges,
);

// If we didn't find a reflection probe, use the view environment map if applicable.
if (query_result.texture_index < 0) {
@@ -77,6 +83,7 @@ fn compute_radiances(

fn compute_radiances(
input: ptr<function, LightingInput>,
clusterable_object_index_ranges: ptr<function, ClusterableObjectIndexRanges>,
layer: u32,
world_position: vec3<f32>,
found_diffuse_indirect: bool,
@@ -127,6 +134,7 @@ fn compute_radiances(
fn environment_map_light_clearcoat(
out: ptr<function, EnvironmentMapLight>,
input: ptr<function, LightingInput>,
clusterable_object_index_ranges: ptr<function, ClusterableObjectIndexRanges>,
found_diffuse_indirect: bool,
) {
// Unpack.
@@ -141,7 +149,12 @@ fn environment_map_light_clearcoat(
let inv_Fc = 1.0 - Fc;

let clearcoat_radiances = compute_radiances(
input, LAYER_CLEARCOAT, world_position, found_diffuse_indirect);
input,
clusterable_object_index_ranges,
LAYER_CLEARCOAT,
world_position,
found_diffuse_indirect,
);

// Composite the clearcoat layer on top of the existing one.
// These formulas are from Filament:
@@ -154,6 +167,7 @@ fn environment_map_light_clearcoat(

fn environment_map_light(
input: ptr<function, LightingInput>,
clusterable_object_index_ranges: ptr<function, ClusterableObjectIndexRanges>,
found_diffuse_indirect: bool,
) -> EnvironmentMapLight {
// Unpack.
@@ -166,7 +180,14 @@ fn environment_map_light(

var out: EnvironmentMapLight;

let radiances = compute_radiances(input, LAYER_BASE, world_position, found_diffuse_indirect);
let radiances = compute_radiances(
input,
clusterable_object_index_ranges,
LAYER_BASE,
world_position,
found_diffuse_indirect,
);

if (all(radiances.irradiance == vec3(0.0)) && all(radiances.radiance == vec3(0.0))) {
out.diffuse = vec3(0.0);
out.specular = vec3(0.0);
@@ -200,7 +221,12 @@ fn environment_map_light(
out.specular = FssEss * radiances.radiance;

#ifdef STANDARD_MATERIAL_CLEARCOAT
environment_map_light_clearcoat(&out, input, found_diffuse_indirect);
environment_map_light_clearcoat(
&out,
input,
clusterable_object_index_ranges,
found_diffuse_indirect,
);
#endif // STANDARD_MATERIAL_CLEARCOAT

return out;
13 changes: 11 additions & 2 deletions crates/bevy_pbr/src/light_probe/irradiance_volume.wgsl
@@ -7,15 +7,24 @@
irradiance_volume_sampler,
light_probes,
};
#import bevy_pbr::clustered_forward::ClusterableObjectIndexRanges

#ifdef IRRADIANCE_VOLUMES_ARE_USABLE

// See:
// https://advances.realtimerendering.com/s2006/Mitchell-ShadingInValvesSourceEngine.pdf
// Slide 28, "Ambient Cube Basis"
fn irradiance_volume_light(world_position: vec3<f32>, N: vec3<f32>) -> vec3<f32> {
fn irradiance_volume_light(
world_position: vec3<f32>,
N: vec3<f32>,
clusterable_object_index_ranges: ptr<function, ClusterableObjectIndexRanges>,
) -> vec3<f32> {
// Search for an irradiance volume that contains the fragment.
let query_result = query_light_probe(world_position, /*is_irradiance_volume=*/ true);
let query_result = query_light_probe(
world_position,
/*is_irradiance_volume=*/ true,
clusterable_object_index_ranges,
);

// If there was no irradiance volume found, bail out.
if (query_result.texture_index < 0) {