Skip to content

Commit

Permalink
Use EntityHashMap<Entity, T> for render world entity storage for bett…
Browse files Browse the repository at this point in the history
…er performance (#9903)

# Objective

- Improve rendering performance, particularly by avoiding the large
system commands costs of using the ECS in the way that the render world
does.

## Solution

- Define `EntityHasher` that calculates a hash from the
`Entity.to_bits()` by `i | (i.wrapping_mul(0x517cc1b727220a95) << 32)`.
`0x517cc1b727220a95` is something like `u64::MAX / N` for N that gives a
value close to π and that works well for hashing. Thanks for @SkiFire13
for the suggestion and to @nicopap for alternative suggestions and
discussion. This approach comes from `rustc-hash` (a.k.a. `FxHasher`)
with some tweaks for the case of hashing an `Entity`. `FxHasher` and
`SeaHasher` were also tested but were significantly slower.
- Define `EntityHashMap` type that uses the `EntityHashser`
- Use `EntityHashMap<Entity, T>` for render world entity storage,
including:
- `RenderMaterialInstances` - contains the `AssetId<M>` of the material
associated with the entity. Also for 2D.
- `RenderMeshInstances` - contains mesh transforms, flags and properties
about mesh entities. Also for 2D.
- `SkinIndices` and `MorphIndices` - contains the skin and morph index
for an entity, respectively
  - `ExtractedSprites`
  - `ExtractedUiNodes`

## Benchmarks

All benchmarks have been conducted on an M1 Max connected to AC power.
The tests are run for 1500 frames. The 1000th frame is captured for
comparison to check for visual regressions. There were none.

### 2D Meshes

`bevymark --benchmark --waves 160 --per-wave 1000 --mode mesh2d`

#### `--ordered-z`

This test spawns the 2D meshes with z incrementing back to front, which
is the ideal arrangement allocation order as it matches the sorted
render order which means lookups have a high cache hit rate.

<img width="1112" alt="Screenshot 2023-09-27 at 07 50 45"
src="https://github.com/bevyengine/bevy/assets/302146/e140bc98-7091-4a3b-8ae1-ab75d16d2ccb">

-39.1% median frame time.

#### Random

This test spawns the 2D meshes with random z. This not only makes the
batching and transparent 2D pass lookups get a lot of cache misses, it
also currently means that the meshes are almost certain to not be
batchable.

<img width="1108" alt="Screenshot 2023-09-27 at 07 51 28"
src="https://github.com/bevyengine/bevy/assets/302146/29c2e813-645a-43ce-982a-55df4bf7d8c4">

-7.2% median frame time.

### 3D Meshes

`many_cubes --benchmark`

<img width="1112" alt="Screenshot 2023-09-27 at 07 51 57"
src="https://github.com/bevyengine/bevy/assets/302146/1a729673-3254-4e2a-9072-55e27c69f0fc">

-7.7% median frame time.

### Sprites

**NOTE: On `main` sprites are using `SparseSet<Entity, T>`!**

`bevymark --benchmark --waves 160 --per-wave 1000 --mode sprite`

#### `--ordered-z`

This test spawns the sprites with z incrementing back to front, which is
the ideal arrangement allocation order as it matches the sorted render
order which means lookups have a high cache hit rate.

<img width="1116" alt="Screenshot 2023-09-27 at 07 52 31"
src="https://github.com/bevyengine/bevy/assets/302146/bc8eab90-e375-4d31-b5cd-f55f6f59ab67">

+13.0% median frame time.

#### Random

This test spawns the sprites with random z. This makes the batching and
transparent 2D pass lookups get a lot of cache misses.

<img width="1109" alt="Screenshot 2023-09-27 at 07 53 01"
src="https://github.com/bevyengine/bevy/assets/302146/22073f5d-99a7-49b0-9584-d3ac3eac3033">

+0.6% median frame time.

### UI

**NOTE: On `main` UI is using `SparseSet<Entity, T>`!**

`many_buttons`

<img width="1111" alt="Screenshot 2023-09-27 at 07 53 26"
src="https://github.com/bevyengine/bevy/assets/302146/66afd56d-cbe4-49e7-8b64-2f28f6043d85">

+15.1% median frame time.

## Alternatives

- Cart originally suggested trying out `SparseSet<Entity, T>` and indeed
that is slightly faster under ideal conditions. However,
`PassHashMap<Entity, T>` has better worst case performance when data is
randomly distributed, rather than in sorted render order, and does not
have the worst case memory usage that `SparseSet`'s dense `Vec<usize>`
that maps from the `Entity` index to sparse index into `Vec<T>`. This
dense `Vec` has to be as large as the largest Entity index used with the
`SparseSet`.
- I also tested `PassHashMap<u32, T>`, intending to use `Entity.index()`
as the key, but this proved to sometimes be slower and mostly no
different.
- The only outstanding approach that has not been implemented and tested
is to _not_ clear the render world of its entities each frame. That has
its own problems, though they could perhaps be solved.
- Performance-wise, if the entities and their component data were not
cleared, then they would incur table moves on spawn, and should not
thereafter, rather just their component data would be overwritten.
Ideally we would have a neat way of either updating data in-place via
`&mut T` queries, or inserting components if not present. This would
likely be quite cumbersome to have to remember to do everywhere, but
perhaps it only needs to be done in the more performance-sensitive
systems.
- The main problem to solve however is that we want to both maintain a
mapping between main world entities and render world entities, be able
to run the render app and world in parallel with the main app and world
for pipelined rendering, and at the same time be able to spawn entities
in the render world in such a way that those Entity ids do not collide
with those spawned in the main world. This is potentially quite
solvable, but could well be a lot of ECS work to do it in a way that
makes sense.

---

## Changelog

- Changed: Component data for entities to be drawn are no longer stored
on entities in the render world. Instead, data is stored in a
`EntityHashMap<Entity, T>` in various resources. This brings significant
performance benefits due to the way the render app clears entities every
frame. Resources of most interest are `RenderMeshInstances` and
`RenderMaterialInstances`, and their 2D counterparts.

## Migration Guide

Previously the render app extracted mesh entities and their component
data from the main world and stored them as entities and components in
the render world. Now they are extracted into essentially
`EntityHashMap<Entity, T>` where `T` are structs containing an
appropriate group of data. This means that while extract set systems
will continue to run extract queries against the main world they will
store their data in hash maps. Also, systems in later sets will either
need to look up entities in the available resources such as
`RenderMeshInstances`, or maintain their own `EntityHashMap<Entity, T>`
for their own data.

Before:
```rust
fn queue_custom(
    material_meshes: Query<(Entity, &MeshTransforms, &Handle<Mesh>), With<InstanceMaterialData>>,
) {
    ...
    for (entity, mesh_transforms, mesh_handle) in &material_meshes {
        ...
    }
}
```

After:
```rust
fn queue_custom(
    render_mesh_instances: Res<RenderMeshInstances>,
    instance_entities: Query<Entity, With<InstanceMaterialData>>,
) {
    ...
    for entity in &instance_entities {
        let Some(mesh_instance) = render_mesh_instances.get(&entity) else { continue; };
        // The mesh handle in `AssetId<Mesh>` form, and the `MeshTransforms` can now
        // be found in `mesh_instance` which is a `RenderMeshInstance`
        ...
    }
}
```

---------

Co-authored-by: robtfm <[email protected]>
  • Loading branch information
superdump and robtfm authored Sep 27, 2023
1 parent 35d3213 commit b6ead2b
Show file tree
Hide file tree
Showing 17 changed files with 584 additions and 324 deletions.
11 changes: 9 additions & 2 deletions crates/bevy_ecs/src/entity/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ use crate::{
storage::{SparseSetIndex, TableId, TableRow},
};
use serde::{Deserialize, Serialize};
use std::{convert::TryFrom, fmt, mem, sync::atomic::Ordering};
use std::{convert::TryFrom, fmt, hash::Hash, mem, sync::atomic::Ordering};

#[cfg(target_has_atomic = "64")]
use std::sync::atomic::AtomicI64 as AtomicIdCursor;
Expand Down Expand Up @@ -115,12 +115,19 @@ type IdCursor = isize;
/// [`EntityCommands`]: crate::system::EntityCommands
/// [`Query::get`]: crate::system::Query::get
/// [`World`]: crate::world::World
#[derive(Clone, Copy, Eq, Hash, Ord, PartialEq, PartialOrd)]
#[derive(Clone, Copy, Eq, Ord, PartialEq, PartialOrd)]
pub struct Entity {
generation: u32,
index: u32,
}

impl Hash for Entity {
#[inline]
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
self.to_bits().hash(state);
}
}

pub(crate) enum AllocAtWithoutReplacement {
Exists(EntityLocation),
DidNotExist,
Expand Down
95 changes: 50 additions & 45 deletions crates/bevy_pbr/src/material.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
use crate::{
render, AlphaMode, DrawMesh, DrawPrepass, EnvironmentMapLight, MeshPipeline, MeshPipelineKey,
MeshTransforms, PrepassPipelinePlugin, PrepassPlugin, ScreenSpaceAmbientOcclusionSettings,
PrepassPipelinePlugin, PrepassPlugin, RenderMeshInstances, ScreenSpaceAmbientOcclusionSettings,
SetMeshBindGroup, SetMeshViewBindGroup, Shadow,
};
use bevy_app::{App, Plugin};
Expand All @@ -14,10 +14,7 @@ use bevy_core_pipeline::{
use bevy_derive::{Deref, DerefMut};
use bevy_ecs::{
prelude::*,
system::{
lifetimeless::{Read, SRes},
SystemParamItem,
},
system::{lifetimeless::SRes, SystemParamItem},
};
use bevy_render::{
mesh::{Mesh, MeshVertexBufferLayout},
Expand All @@ -37,7 +34,7 @@ use bevy_render::{
view::{ExtractedView, Msaa, ViewVisibility, VisibleEntities},
Extract, ExtractSchedule, Render, RenderApp, RenderSet,
};
use bevy_utils::{tracing::error, HashMap, HashSet};
use bevy_utils::{tracing::error, EntityHashMap, HashMap, HashSet};
use std::hash::Hash;
use std::marker::PhantomData;

Expand Down Expand Up @@ -190,6 +187,7 @@ where
.add_render_command::<AlphaMask3d, DrawMaterial<M>>()
.init_resource::<ExtractedMaterials<M>>()
.init_resource::<RenderMaterials<M>>()
.init_resource::<RenderMaterialInstances<M>>()
.init_resource::<SpecializedMeshPipelines<MaterialPipeline<M>>>()
.add_systems(
ExtractSchedule,
Expand Down Expand Up @@ -226,26 +224,6 @@ where
}
}

fn extract_material_meshes<M: Material>(
mut commands: Commands,
mut previous_len: Local<usize>,
query: Extract<Query<(Entity, &ViewVisibility, &Handle<M>)>>,
) {
let mut values = Vec::with_capacity(*previous_len);
for (entity, view_visibility, material) in &query {
if view_visibility.get() {
// NOTE: MaterialBindGroupId is inserted here to avoid a table move. Upcoming changes
// to use SparseSet for render world entity storage will do this automatically.
values.push((
entity,
(material.clone_weak(), MaterialBindGroupId::default()),
));
}
}
*previous_len = values.len();
commands.insert_or_spawn_batch(values);
}

/// A key uniquely identifying a specialized [`MaterialPipeline`].
pub struct MaterialPipelineKey<M: Material> {
pub mesh_key: MeshPipelineKey,
Expand Down Expand Up @@ -368,24 +346,53 @@ type DrawMaterial<M> = (
/// Sets the bind group for a given [`Material`] at the configured `I` index.
pub struct SetMaterialBindGroup<M: Material, const I: usize>(PhantomData<M>);
impl<P: PhaseItem, M: Material, const I: usize> RenderCommand<P> for SetMaterialBindGroup<M, I> {
type Param = SRes<RenderMaterials<M>>;
type Param = (SRes<RenderMaterials<M>>, SRes<RenderMaterialInstances<M>>);
type ViewWorldQuery = ();
type ItemWorldQuery = Read<Handle<M>>;
type ItemWorldQuery = ();

#[inline]
fn render<'w>(
_item: &P,
item: &P,
_view: (),
material_handle: &'_ Handle<M>,
materials: SystemParamItem<'w, '_, Self::Param>,
_item_query: (),
(materials, material_instances): SystemParamItem<'w, '_, Self::Param>,
pass: &mut TrackedRenderPass<'w>,
) -> RenderCommandResult {
let material = materials.into_inner().get(&material_handle.id()).unwrap();
let materials = materials.into_inner();
let material_instances = material_instances.into_inner();

let Some(material_asset_id) = material_instances.get(&item.entity()) else {
return RenderCommandResult::Failure;
};
let Some(material) = materials.get(material_asset_id) else {
return RenderCommandResult::Failure;
};
pass.set_bind_group(I, &material.bind_group, &[]);
RenderCommandResult::Success
}
}

#[derive(Resource, Deref, DerefMut)]
pub struct RenderMaterialInstances<M: Material>(EntityHashMap<Entity, AssetId<M>>);

impl<M: Material> Default for RenderMaterialInstances<M> {
fn default() -> Self {
Self(Default::default())
}
}

fn extract_material_meshes<M: Material>(
mut material_instances: ResMut<RenderMaterialInstances<M>>,
query: Extract<Query<(Entity, &ViewVisibility, &Handle<M>)>>,
) {
material_instances.clear();
for (entity, view_visibility, handle) in &query {
if view_visibility.get() {
material_instances.insert(entity, handle.id());
}
}
}

const fn alpha_mode_pipeline_key(alpha_mode: AlphaMode) -> MeshPipelineKey {
match alpha_mode {
// Premultiplied and Add share the same pipeline key
Expand Down Expand Up @@ -424,12 +431,8 @@ pub fn queue_material_meshes<M: Material>(
msaa: Res<Msaa>,
render_meshes: Res<RenderAssets<Mesh>>,
render_materials: Res<RenderMaterials<M>>,
mut material_meshes: Query<(
&Handle<M>,
&mut MaterialBindGroupId,
&Handle<Mesh>,
&MeshTransforms,
)>,
mut render_mesh_instances: ResMut<RenderMeshInstances>,
render_material_instances: Res<RenderMaterialInstances<M>>,
images: Res<RenderAssets<Image>>,
mut views: Query<(
&ExtractedView,
Expand Down Expand Up @@ -493,15 +496,16 @@ pub fn queue_material_meshes<M: Material>(
}
let rangefinder = view.rangefinder3d();
for visible_entity in &visible_entities.entities {
let Ok((material_handle, mut material_bind_group_id, mesh_handle, mesh_transforms)) =
material_meshes.get_mut(*visible_entity)
else {
let Some(material_asset_id) = render_material_instances.get(visible_entity) else {
continue;
};
let Some(mesh_instance) = render_mesh_instances.get_mut(visible_entity) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_handle) else {
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};
let Some(material) = render_materials.get(&material_handle.id()) else {
let Some(material) = render_materials.get(material_asset_id) else {
continue;
};
let mut mesh_key = view_key;
Expand Down Expand Up @@ -530,9 +534,10 @@ pub fn queue_material_meshes<M: Material>(
}
};

*material_bind_group_id = material.get_bind_group_id();
mesh_instance.material_bind_group_id = material.get_bind_group_id();

let distance = rangefinder.distance_translation(&mesh_transforms.transform.translation)
let distance = rangefinder
.distance_translation(&mesh_instance.transforms.transform.translation)
+ material.properties.depth_bias;
match material.properties.alpha_mode {
AlphaMode::Opaque => {
Expand Down
25 changes: 14 additions & 11 deletions crates/bevy_pbr/src/prepass/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,8 @@ use bevy_utils::tracing::error;
use crate::{
prepare_materials, setup_morph_and_skinning_defs, AlphaMode, DrawMesh, Material,
MaterialPipeline, MaterialPipelineKey, MeshLayouts, MeshPipeline, MeshPipelineKey,
MeshTransforms, RenderMaterials, SetMaterialBindGroup, SetMeshBindGroup,
RenderMaterialInstances, RenderMaterials, RenderMeshInstances, SetMaterialBindGroup,
SetMeshBindGroup,
};

use std::{hash::Hash, marker::PhantomData};
Expand Down Expand Up @@ -758,8 +759,9 @@ pub fn queue_prepass_material_meshes<M: Material>(
pipeline_cache: Res<PipelineCache>,
msaa: Res<Msaa>,
render_meshes: Res<RenderAssets<Mesh>>,
render_mesh_instances: Res<RenderMeshInstances>,
render_materials: Res<RenderMaterials<M>>,
material_meshes: Query<(&Handle<M>, &Handle<Mesh>, &MeshTransforms)>,
render_material_instances: Res<RenderMaterialInstances<M>>,
mut views: Query<(
&ExtractedView,
&VisibleEntities,
Expand Down Expand Up @@ -804,16 +806,16 @@ pub fn queue_prepass_material_meshes<M: Material>(
let rangefinder = view.rangefinder3d();

for visible_entity in &visible_entities.entities {
let Ok((material_handle, mesh_handle, mesh_transforms)) =
material_meshes.get(*visible_entity)
else {
let Some(material_asset_id) = render_material_instances.get(visible_entity) else {
continue;
};

let (Some(material), Some(mesh)) = (
render_materials.get(&material_handle.id()),
render_meshes.get(mesh_handle),
) else {
let Some(mesh_instance) = render_mesh_instances.get(visible_entity) else {
continue;
};
let Some(material) = render_materials.get(material_asset_id) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};

Expand Down Expand Up @@ -849,7 +851,8 @@ pub fn queue_prepass_material_meshes<M: Material>(
}
};

let distance = rangefinder.distance_translation(&mesh_transforms.transform.translation)
let distance = rangefinder
.distance_translation(&mesh_instance.transforms.transform.translation)
+ material.properties.depth_bias;
match alpha_mode {
AlphaMode::Opaque => {
Expand Down
21 changes: 14 additions & 7 deletions crates/bevy_pbr/src/render/light.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,9 @@ use crate::{
CascadeShadowConfig, Cascades, CascadesVisibleEntities, Clusters, CubemapVisibleEntities,
DirectionalLight, DirectionalLightShadowMap, DrawPrepass, EnvironmentMapLight,
GlobalVisiblePointLights, Material, MaterialPipelineKey, MeshPipeline, MeshPipelineKey,
NotShadowCaster, PointLight, PointLightShadowMap, PrepassPipeline, RenderMaterials, SpotLight,
VisiblePointLights,
PointLight, PointLightShadowMap, PrepassPipeline, RenderMaterialInstances, RenderMaterials,
RenderMeshInstances, SpotLight, VisiblePointLights,
};
use bevy_asset::Handle;
use bevy_core_pipeline::core_3d::Transparent3d;
use bevy_ecs::prelude::*;
use bevy_math::{Mat4, UVec3, UVec4, Vec2, Vec3, Vec3Swizzles, Vec4, Vec4Swizzles};
Expand Down Expand Up @@ -1553,9 +1552,10 @@ pub fn prepare_clusters(
pub fn queue_shadows<M: Material>(
shadow_draw_functions: Res<DrawFunctions<Shadow>>,
prepass_pipeline: Res<PrepassPipeline<M>>,
casting_meshes: Query<(&Handle<Mesh>, &Handle<M>), Without<NotShadowCaster>>,
render_meshes: Res<RenderAssets<Mesh>>,
render_mesh_instances: Res<RenderMeshInstances>,
render_materials: Res<RenderMaterials<M>>,
render_material_instances: Res<RenderMaterialInstances<M>>,
mut pipelines: ResMut<SpecializedMeshPipelines<PrepassPipeline<M>>>,
pipeline_cache: Res<PipelineCache>,
view_lights: Query<(Entity, &ViewLightEntities)>,
Expand Down Expand Up @@ -1598,15 +1598,22 @@ pub fn queue_shadows<M: Material>(
// NOTE: Lights with shadow mapping disabled will have no visible entities
// so no meshes will be queued
for entity in visible_entities.iter().copied() {
let Ok((mesh_handle, material_handle)) = casting_meshes.get(entity) else {
let Some(mesh_instance) = render_mesh_instances.get(&entity) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_handle) else {
if !mesh_instance.shadow_caster {
continue;
}
let Some(material_asset_id) = render_material_instances.get(&entity) else {
continue;
};
let Some(material) = render_materials.get(&material_handle.id()) else {
let Some(material) = render_materials.get(material_asset_id) else {
continue;
};
let Some(mesh) = render_meshes.get(mesh_instance.mesh_asset_id) else {
continue;
};

let mut mesh_key =
MeshPipelineKey::from_primitive_topology(mesh.primitive_topology)
| MeshPipelineKey::DEPTH_PREPASS;
Expand Down
Loading

0 comments on commit b6ead2b

Please sign in to comment.