Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v22.1.0.1 crashes on wgpuQueueSubmit() if wgpuRenderPassEncoderRelease() wasn't called. #412

Open
yig opened this issue Aug 27, 2024 · 12 comments

Comments

@yig
Copy link

yig commented Aug 27, 2024

I just updated to wgpu-native v22.1.0.1. If I don't call wgpuRenderPassEncoderRelease() before wgpuQueueSubmit(), I get an error:

thread '<unnamed>' panicked at /Users/runner/.cargo/git/checkouts/wgpu-53e70f8674b08dd4/5c5c8b1/wgpu-core/src/command/mod.rs:522:14:
CommandBuffer cannot be destroyed because is still in use

The code looks like this:

wgpuRenderPassEncoderEnd( render_pass );
wgpuRenderPassEncoderRelease( render_pass ); // crashes without this line
WGPUCommandBuffer command = wgpuCommandEncoderFinish( encoder, nullptr );
wgpuQueueSubmit( queue, 1, &command );

The relevant part of the backtrace:

   3: core::option::expect_failed
             at /rustc/3f5fd8dd41153bc5fdca9427e9e05be2c767ba23/library/core/src/option.rs:1995:5
   4: wgpu_core::command::CommandBuffer<A>::from_arc_into_baked
   5: wgpu_core::device::queue::<impl wgpu_core::global::Global>::queue_submit
   6: _wgpuQueueSubmit

Here is a runnable fairly minimal crashing example: https://github.com/yig/LearnWebGPU-Code/blob/step030-vanilla/main.cpp#L233

I hacked the CMakeLists.txt to run on macos-aarch64. (It hard-codes the macos-aarch64 release and adds -framework Metal linker flags. Otherwise, it should be easy to modify to reproduce on other platforms.)

git clone [email protected]:yig/LearnWebGPU-Code.git
cd LearnWebGPU-Code
git switch step030-vanilla
cmake -B build
cmake --build build -j 6
./build/App
@rajveermalviya
Copy link
Collaborator

Yeah in previous versions it was incorrectly allowed to release the RenderPass later, but in latest version that bug is fixed. Also RenderPasses are single use only so it doesn't make sense to keep it around after calling wgpuRenderPassEncoderEnd on it.

@yig
Copy link
Author

yig commented Aug 27, 2024

How could that be part of the spec? In JavaScript, I didn't think it was possible to manually release a RenderPass.

@rajveermalviya
Copy link
Collaborator

wgpu[Object]Release() and (the Reference() conterpart) are not in javascript spec but part of the webgpu.h.

@yig
Copy link
Author

yig commented Aug 27, 2024

Where are things like this RenderPass lifetime rule specified? My understanding was that the JS API and the C API should be implementable on top of each other. I don't see how the JS API could be implemented on top of the C API with this lifetime rule.

@rajveermalviya
Copy link
Collaborator

rajveermalviya commented Aug 27, 2024

Seem like this is similar to gfx-rs/wgpu#6145.

@yig
Copy link
Author

yig commented Aug 27, 2024

Yes, this is exactly the same. The extra scope for the render pass shown there serves as a workaround for a ref-counted language (including C++ with RAII). I guess a similar workaround for a garbage collected language could be to add the scope and manually triggering the garbage collector immediately after closing the scope. I hope this can be fixed and these workarounds avoided.

@Vipitis
Copy link

Vipitis commented Aug 28, 2024

I also run into this in pygfx/wgpu-py#547 and just made the private release is part of the public end method.

I am not sure if there is any case where a end call is not followed by release. So it feels somewhat redundant

@almarklein almarklein pinned this issue Sep 2, 2024
@trbabb
Copy link

trbabb commented Oct 14, 2024

I have just run into this myself, and I must say the error message and fix was not at all intuitive.

Is there something I can refer to to understand the lifetime expectations and refcount rules for these objects in C/C++?

@yig
Copy link
Author

yig commented Oct 15, 2024

My understanding is that this is a deviation from the spec.

@hqhs
Copy link

hqhs commented Nov 9, 2024

Just spent a couple of hours debugging this myself. Someday I will learn to check github issues first :)

Yeah in previous versions it was incorrectly allowed to release the RenderPass later, but in latest version that bug is fixed.

@rajveermalviya but surely the crash with an unrelated error message cannot be a desired behavior? I'm not arguing for that to be the only correct way of handling resources, but e.g. dawn doesn't crash and (seems to) release encoder passes by itself.

@yig
Copy link
Author

yig commented Nov 28, 2024

This is a wgpu bug ("deviation from the spec") for render and compute passes. Is there a bug tracker elsewhere that we should use to report this?

MendyBerger added a commit to MendyBerger/wasi-gfx-runtime that referenced this issue Nov 29, 2024
MendyBerger added a commit to MendyBerger/wasi-gfx-runtime that referenced this issue Nov 29, 2024
MendyBerger added a commit to wasi-gfx/wasi-gfx-runtime that referenced this issue Nov 29, 2024
@ErichDonGubler
Copy link
Member

ErichDonGubler commented Dec 17, 2024

@yig:

This is a wgpu bug ("deviation from the spec") for render and compute passes. Is there a bug tracker elsewhere that we should use to report this?

Yep: https://github.com/gfx-rs/wgpu/issues/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

13 participants
@yig @ErichDonGubler @trbabb @hqhs @Vipitis @rajveermalviya and others