Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add motion vectors support for animated meshes #9902

Closed
wants to merge 2 commits into from

Conversation

nicopap
Copy link
Contributor

@nicopap nicopap commented Sep 22, 2023

Objective

bevy_motion_vec_anim_3-2023-09-22.mp4

Solution

  • Add DoubleBufferVec, a thin wrapper around two BufferVec
  • Replace the BufferVec used in SkinUniform and MorphUniform by the double buffer.
  • When buffer.clear() instead of clearing the current buffer, swap the buffers and then clear the 2-frames-behind buffer
  • Also double buffer the skinning and morphing index maps, use the last frame's index to index into the previous buffer.
  • If this is the first frame a mesh with skin or morph is rendered, do not render it with motion vectors.
    • This requires splitting the concept of "view that supports motion vectors" and "mesh that support motion vectors".
    • Now, MeshPipelineKey::MOTION_VECTOR_PREPASS indicates that the mesh supports motion vectors
    • For "view" motion vectors, we add an additional key: MeshPipelineKey::VIEW_MOTION_VECTOR_PREPASS. It is technically meaningless outside of the prepass pipeline, but it avoids widening the key, and therefore slow down processing.

The other changes are the shader code for handling the previous values (which is mostly copy/pasting, sadly); And some attempt at copping with the combinatorial explosion of shader variants in mesh_bindings.rs. (we have the following variables: with/out morph targets, with/out skinning with/out motion vectors)

Note to reviewers

mesh_bindings.rs

Following is a description of the mesh_bindings.rs file.

Entry definition

We first define each individual bind group (layout) entries as functions that accept a binding index and an additional parameter for the binding resource when relevant, and return the layout/bind group entry.

Those are the FOOBAR_layout and FOOBAR_entry functions.

Combination functions

Then we have variant_layout and MeshBindGroupBuilder::add_variant.

Those take a ActiveVariant1 (or build one), create an empty buffer (SmallVec) and progressively add the entries based on what flag is active in the ActiveVariant. Then we create the BindGroupLayout or BindGroup (respectively) with all the gathered entries.

MeshBindGroups

MeshBindGroups is a collection of BindGroups.

To add bind groups to it, you would first create a MeshBindGroupBuilder using MeshBindGroups::bind_group_builder. Then, call add_variant for each BindGroup that will be needed for rendering.

MeshBindGroups has two HashMap, one shared that contains bind groups that can be shared between different entities with the same set of shader features (ActiveVariant elements).

The other HashMap is distinct, which contain bind groups that are unique per-mesh. This is currently only relevant for meshes with morph targets. (Morph targets have a texture binding, therefore need to bind to a different individual texture for each individual mesh).

The idea of two hashmaps, one for shared and another for distinct bind groups is taken from #10235.

In bevy, prepare_mesh_bind_group is where variants are added to MeshBindGroups.

MeshLayouts

MeshLayouts is to layout what MeshBindGroups is to bind groups. Here, however, since there is exactly one layout per set of shader feature, we can pre-compute every possible layouts. This is what occurs in MeshLayouts::new, which is called in the FromWorld impl of MeshPipeline. Then, we can just get an existing layout by using MeshLayouts::get, in setup_morph_and_skinning_defs; And MeshLayouts::get_variant in add_variant (to get the corresponding layout for the bind group).

This approach is a shameless clone of the #10156 approach. Unlike with MeshBindGroups, we don't even need to enumerate every features. By using array::from_fn::<ActiveVariant::COUNT>, we automatically get an enumeration of all possible (and more) shader feature combinations.

ActiveVariant

A downside of the bitflags crate is that it conflates carelessly a set of FOO and FOO, it makes it harder to talk about the bitflag struct.

An ActiveVariant is a set of shader features, a feature may be (1) skinning, (2) morph targets, (3) motion vectors, (N) more in the future.

Since a shader is composed of 0-to-N features, a single ActiveVariant represents a single instance of a shader.

MeshPipelineKey::VIEW_MOTION_VECTOR_PREPASS

There are two problems with a double buffer approach:

  1. The first frame, only the current buffer is filled, there is no such thing as the previous buffer page
  2. The buffer is shared across all meshes, this means that adding/removing a mesh makes indexing the previous buffer with the current frame's indices is bogus.

To solve (1) we need to disable motion vectors, not for the whole view, but for individual meshes. Meshes which skinning or morphing shader are enabled, but didn't exist last frame (their previous buffer page is empty) can't have a previous buffer, therefore shouldn't have the binding for the previous buffer set.

This is where we introduce MeshPipelineKey::VIEW_MOTION_VECTOR_PREPASS. It is a distinct value than MeshPipelineKey::MOTION_VECTOR_PREPASS. The view still has access to the motion vector buffer, but the mesh/instance doesn't have a previous buffer to compute the motion vector with. So we need to have two different key flags to handle this.

To solve (2), we make the double buffer generic over the buffer, and we allow to use it for EntityHashMap, now we can keep around the indices for the previous frame as well as the buffer.

Note that we add special logic so that when motion vectors aren't used, we don't double buffer anything, so that no one pays the cost (memory cost, as it has no runtime cost) of the double-buffer when not needing it.

Tricky bits

The code changes outside of mesh_bindings.rs sure seem small (even double_buffer.rs is 62 lines). But it was actually extremely complex to get right. Bind groups that depend on runtime values such as "does the double buffer have a 'previous' page?" Are really tricky to get right. MeshPipelineKey is more or less a "cache" for the world state, but this cache is set in several places. I often got the infamous wgpu error, and had no way to tell where my errors where. I had to patiently add printlns a bit everywhere, glare at the logs and try to think what explained them, compound that with how many rendering functionalities I had to rebase through, it was very difficult.

Merging this PR will probably move the burden of handling this to more people, and we should prioritize making that kind of bugs impossible in the future.

Testing

I used this modified version of shader_prepass.rs to test the change set: https://gist.github.com/nicopap/bf97174f1496df4854909d672eb2cc0f. Note that it is still missing a morph target + skinned mesh (I tested a morph+skin model locally and seemed to work).

I tried to compare how this affects ghosting with TAA, but I failed to reproduce ghosting artifacts on main. The motion vectors do look correct though.

Future work


Migration guide

  • SetMeshBindGroup now requires an additional boolean const parameter, set it to false to keep the old behavior, true if you want it to handle motion vectors.
  • MeshBindGroups's reset method is now clear.
  • MeshLayouts has been revamped. To get bind groups from the layouts, use MeshBindGroups::bind_group_builder, or access manually the layouts.

Changelog

  • Add motion vectors support for animated meshes. Meaning: TAA now works with animations!

Footnotes

  1. ActiveVariant is a "set of shader features", see relevant section

@nicopap nicopap added C-Feature A new feature, making something new possible A-Rendering Drawing game state to the screen A-Animation Make things move and change over time M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide C-Bug An unexpected or incorrect behavior labels Sep 22, 2023
@nicopap nicopap force-pushed the animated_motion_vectors branch 7 times, most recently from 33d1c9d to ba6ec25 Compare September 23, 2023 09:24
@nicopap nicopap marked this pull request as ready for review September 23, 2023 09:28
@nicopap nicopap added this to the 0.12 milestone Sep 23, 2023
Copy link
Contributor

@JMS55 JMS55 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after this + #8974 we're good to move TAA out of the "experimental" module, thanks so much for doing this!

crates/bevy_pbr/src/render/mesh_bindings.rs Outdated Show resolved Hide resolved
@nicopap nicopap force-pushed the animated_motion_vectors branch 2 times, most recently from 99ec6f7 to 6f71fd2 Compare October 2, 2023 09:02
Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super keen on the complexity of the binding stuff, but it's too late in the day for me to try to see if I can think of something I would like more. :)

crates/bevy_pbr/src/render/mesh.rs Outdated Show resolved Hide resolved
crates/bevy_pbr/src/render/morph.wgsl Outdated Show resolved Hide resolved
crates/bevy_pbr/src/render/morph.wgsl Outdated Show resolved Hide resolved
crates/bevy_pbr/src/render/morph.wgsl Show resolved Hide resolved
@nicopap nicopap force-pushed the animated_motion_vectors branch from 9e28cbd to 52892b6 Compare October 9, 2023 05:24
@nicopap
Copy link
Contributor Author

nicopap commented Oct 9, 2023

I'm not super keen on the complexity of the binding stuff, but it's too late in the day for me to try to see if I can think of something I would like more. :)

I'm pretty happy with what I came up with. If it's not a global minima in term of complexity, it's definitively a local minima. It's extremely shallow, it's basically function calls (a single layer of call stack, at it) and nothing else… I would recommend looking at the file without the diff, as the deleted code just makes things less clear. I'll add some comments to the code to make it easier to read.

@nicopap nicopap force-pushed the animated_motion_vectors branch 7 times, most recently from 5f713d5 to 4ad75fe Compare October 23, 2023 13:48
@nicopap nicopap marked this pull request as ready for review October 23, 2023 13:49
@nicopap nicopap requested a review from superdump October 23, 2023 13:49
@JMS55
Copy link
Contributor

JMS55 commented Oct 24, 2023

I think we should move this to 0.13, and give it more time instead of trying to rush it through, and then test it with the other TAA changes for 0.13.

@alice-i-cecile alice-i-cecile modified the milestones: 0.12, 0.13 Oct 24, 2023
@nicopap
Copy link
Contributor Author

nicopap commented Oct 25, 2023

For the record, I'm for moving this to 0.13. I think it makes the most change in the context of other TAA changes, which will be moved as well.

@nicopap nicopap force-pushed the animated_motion_vectors branch from ccf0911 to 075762e Compare October 25, 2023 09:09
@nicopap nicopap force-pushed the animated_motion_vectors branch from 075762e to f01ce08 Compare November 5, 2023 13:02
Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the style of the rest of bevy with newlines between type and impl, and between bare/impl functions.

let weight_count = bevy_pbr::morph::layer_count();
for (var i: u32 = 0u; i < weight_count; i++) {
let weight = bevy_pbr::morph::previous_weight_at(i);
if weight != 0.0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't new, so not blocking, but I wonder if this branch is helpful or harmful. It probably has to run both branches regardless I would think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really good question. My GPU-fu is not strong enough to be able to answer it. I don't remember the motivation behind it.

/// access the current buffer with [`DoubleBuffer::current`],
/// and the previous one with [`DoubleBuffer::previous`].
#[derive(Default)]
pub struct DoubleBuffer<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal for now, but I'd like to have some MultiBuffer<const N: usize, T> where N is the number of buffers that get rotated through. It is apparently common to use 3-4 buffers of dynamic per-frame data like mesh uniforms. This is because the GPU can have 2-3 frames in flight, meaning the rendering depends on the contents of those buffers, and if we want to update the contents, wgpu would have to wait until the target buffer is no longer in use by the GPU to be able to transfer the data to it. We can take that in a separate PR though, and just make DoubleBuffer<T> into MultiBuffer<2, T>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like a fun thing to implement. But I think it should be part of a different PR.

/// buffer, including the motion vector buffer.
///
/// Used in the prepass pipeline.
const VIEW_MOTION_VECTOR_PREPASS = (1 << 13);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want / need this anymore? I thought the changes to MOTION_VECTOR_PREPASS were meant to indicate when a motion vector prepass is being/has been run and so motion vectors are present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still a distinction between "Does this view have a motion vector buffer" and "does this mesh support motion vectors" for the specific mesh being rendered.

if is_motion_vectors {
// disable is_motion_vectors if this mesh doesn't have a previous value for
// skins or morphs. This way, we get the shader variant without the bindings for
// the previous value, which doesn't exist.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be simplified with a dummy binding? That is, we bind a fallback buffer (like the fallback images), but we don't use it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great. Though, at the cost of pushing the "is motion vector" info to a uniform, add the is_motion_vector flag to the uniform for each mesh, enable that uniform conditionally on MOTION_VECTOR_PREPASS And not read the motion vector specific buffers when the bit is off.

This would allow getting rid of the combinatorial shader variants too, as we could set unconditionally the previous_joint_matrices/previous_morph_weight buffers.

Though no idea how much work & code that is.

Comment on lines +153 to +154
if let Some(skin) = previous_skin {
bind_group_entries.push(skinning_entry(4, skin));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if let Some(skin) = previous_skin {
bind_group_entries.push(skinning_entry(4, skin));
if let Some(previous_skin) = previous_skin {
bind_group_entries.push(skinning_entry(4, previous_skin));

ShaderStages, TextureSampleType, TextureViewDimension,
// --- Individual layout entries and individual bind group entries ---

fn buffer_layout(binding: u32, size: u64, visibility: ShaderStages) -> BindGroupLayoutEntry {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @IceSentry is doing something similar to simplify construction of bind group layout entries and bind group entries.

previous_skin: Option<&Buffer>,
previous_weights: Option<&Buffer>,
) {
let mut bind_group_entries = SmallVec::<[BindGroupEntry; 6]>::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what benefit SmallVec brings here other than overhead / an unnecessary dependency as this code would not exceed the array on the stack. I'd just use an array and an index variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have personally used ArrayVec from the arrayvec crate, but SmallVec is already a dependency of bevy_pbr, so why not use it?

The advantage compared to using an array and subslicing it is that we don't have to initialize zero-valued BindGroupEntry (BindGroupEntry does not implement Default, so it would require a lot of boilerplate code). We could use [MaybeUninit<BindGroupEntry>; 6], but it would also require a bit of boilerplate and unsafe unrelated to what we want to accomplish, and that's pretty much what SmallVec already does for us regardless.

Comment on lines +257 to +258
let layout = |bits| variant_layout(ActiveVariant::from_bits_truncate(bits), device);
MeshLayouts(array::from_fn(layout))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. :)

let layout = |bits| variant_layout(ActiveVariant::from_bits_truncate(bits), device);
MeshLayouts(array::from_fn(layout))
}
// TODO: Passing 3 bools is pretty bad
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pass an ActiveVariant instead? .get(ActiveVariant::SKIN | ActiveVariant::MOTION_VECTORS) reads fine to me.

Oh, you have that just below. Then, is this function needed?

/// access the current buffer with [`DoubleBuffer::current`],
/// and the previous one with [`DoubleBuffer::previous`].
#[derive(Default)]
pub struct DoubleBuffer<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be in bevy_render.

@JMS55
Copy link
Contributor

JMS55 commented Nov 29, 2023

Needs rebasing, please

@nicopap
Copy link
Contributor Author

nicopap commented Nov 29, 2023

It isn't worthwhile investing more time in this if I don't know what the way forward to get it merged is.

@alice-i-cecile alice-i-cecile added the S-Adopt-Me The original PR author has no intent to complete this work. Pick me up! label Jan 24, 2024
@alice-i-cecile alice-i-cecile removed this from the 0.13 milestone Jan 24, 2024
pcwalton added a commit to pcwalton/bevy that referenced this pull request May 30, 2024
morph targets.

This is a revamped version of bevyengine#9902. Instead of adding more bind group
layouts as that patch did, which created a combinatorial explosion of
layouts, this patch unconditionally adds `prev_joint_matrices` and
`prev_morph_weights` bindings to the shader bind groups. These add no
significant overhead if unused because we simply bind dummy buffers, and
the driver strips them out if unused. We already do this extensively
with the various `StandardMaterial` bindings as well as the mesh view
bindings, so this approach isn't anything new.

The overall technique consists of double-buffering the joint matrix and
morph weights buffers, as most of the previous attempts to solve this
problem did. The process is generally straightforward. Note that, to
avoid regressing the ability of mesh extraction, skin extraction, and
morph target extraction to run in parallel, I had to add a new system to
rendering, `set_mesh_motion_vector_flags`. The comment there explains
the details; it generally runs very quickly.

I've tested this with modified versions of the `animated_fox`,
`morph_targets`, and `many_foxes` examples that add TAA, and the patch
works. To avoid bloating those examples, I didn't add switches for TAA
to them.

Addresses points (1) and (2) of bevyengine#8423.
pcwalton added a commit to pcwalton/bevy that referenced this pull request May 30, 2024
morph targets.

This is a revamped version of bevyengine#9902. Instead of adding more bind group
layouts as that patch did, which created a combinatorial explosion of
layouts, this patch unconditionally adds `prev_joint_matrices` and
`prev_morph_weights` bindings to the shader bind groups. These add no
significant overhead if unused because we simply bind dummy buffers, and
the driver strips them out if unused. We already do this extensively
with the various `StandardMaterial` bindings as well as the mesh view
bindings, so this approach isn't anything new.

The overall technique consists of double-buffering the joint matrix and
morph weights buffers, as most of the previous attempts to solve this
problem did. The process is generally straightforward. Note that, to
avoid regressing the ability of mesh extraction, skin extraction, and
morph target extraction to run in parallel, I had to add a new system to
rendering, `set_mesh_motion_vector_flags`. The comment there explains
the details; it generally runs very quickly.

I've tested this with modified versions of the `animated_fox`,
`morph_targets`, and `many_foxes` examples that add TAA, and the patch
works. To avoid bloating those examples, I didn't add switches for TAA
to them.

Addresses points (1) and (2) of bevyengine#8423.
@alice-i-cecile
Copy link
Member

#13572 is being merged, which tackles this same work.

github-merge-queue bot pushed a commit that referenced this pull request May 31, 2024
…orph targets. (#13572)

This is a revamped equivalent to #9902, though it shares none of the
code. It handles all special cases that I've tested correctly.

The overall technique consists of double-buffering the joint matrix and
morph weights buffers, as most of the previous attempts to solve this
problem did. The process is generally straightforward. Note that, to
avoid regressing the ability of mesh extraction, skin extraction, and
morph target extraction to run in parallel, I had to add a new system to
rendering, `set_mesh_motion_vector_flags`. The comment there explains
the details; it generally runs very quickly.

I've tested this with modified versions of the `animated_fox`,
`morph_targets`, and `many_foxes` examples that add TAA, and the patch
works. To avoid bloating those examples, I didn't add switches for TAA
to them.

Addresses points (1) and (2) of #8423.

## Changelog

### Fixed

* Motion vectors, and therefore TAA, are now supported for meshes with
skins and/or morph targets.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Animation Make things move and change over time A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior C-Feature A new feature, making something new possible M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide S-Adopt-Me The original PR author has no intent to complete this work. Pick me up!
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

6 participants