Crash Creating Descriptor Pool on ParaVirtualized Device #2373
Comments
Interesting situation. The reason you may not have encountered this in previous versions of MoltenVK is that MoltenVK v1.2.11 (SDK 1.3.296) defaults to using Metal argument buffers, whereas previous versions did not. To revert, you can disable them with an environment variable. If the default setting (argument buffers enabled) works on devices but not in your CI virtualization environment, there might be a mismatch between the capabilities your environment reports and what is actually supported. With Sonoma and a Tier 2 GPU, your environment should be using the Metal3 style of argument buffers, which do not use argument encoders. But somehow, this is using argument buffer encoders, and then Metal seems to be failing on the internal calls. What type of GPU are you running on? |
Based on OP's description, which indicates an ARM64 architecture, I'd assume Apple Silicon. |
The GPU is the integrated device on the host Apple Silicon M1, but as you point out, the virtualization may be producing incorrect results when capabilities are checked. Will try the environment variable workaround - thank you, @billhollings |
The environment variable workaround prevents the crash on the CI node. Updating to the latest MoltenVK is now unblocked for us. For anyone else hitting this issue during iOS simulator integration tests, you'll want to add the same environment variable to the xcscheme for your project. e.g.
|
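To confirm that the workaround actually reached the test process, a minimal Swift sketch follows. It assumes the variable in question is MoltenVK's `MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS` and that a value of `0` turns argument buffers off; neither the name nor the value is spelled out in this thread, and the helper `argumentBuffersDisabled(in:)` is hypothetical.

```swift
import Foundation

// Assumed variable name; this thread does not spell it out.
let assumedVariable = "MVK_CONFIG_USE_METAL_ARGUMENT_BUFFERS"

// Hypothetical helper: true only when the given environment
// explicitly sets the assumed variable to "0".
func argumentBuffersDisabled(in environment: [String: String]) -> Bool {
    return environment[assumedVariable] == "0"
}

// Report what the current process sees, e.g. when launched from a scheme.
let disabled = argumentBuffersDisabled(in: ProcessInfo.processInfo.environment)
print("\(assumedVariable) set to 0: \(disabled)")
```

Logging this once at the start of an integration test run makes it easier to tell a missing scheme setting apart from a genuine regression.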
I am hitting what appears to be the same issue for the vkd3d CI, whose macOS runner indeed runs inside a virtual machine, and therefore with a paravirtualized device. An example of a failing log is https://gitlab.winehq.org/giomasce/vkd3d/-/jobs/115045/artifacts/raw/artifacts/000-c930856/tests/hlsl/abs.log. This was marked as "Question", but it seems there is a real bug here. Not necessarily in MoltenVK; indeed, it looks like the bug might be in Apple's driver. Has that been triaged already, and possibly submitted to Apple? If not, I might try to do that myself. |
Since this issue is tightly bound to a specific environment (CI and virtual machines), it's very hard to triage and debug in general terms. Any further help you could provide in triaging and debugging in your environment would be most helpful. It's also curious to me that the error is in a call to
On an M1 using at least macOS 13 Ventura, MoltenVK should be using Metal3 argument buffers, which do not require argument encoders. I'm wondering if the virtualization environment is somehow interfering with one or more of the In addition, |
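To compare what a virtualized device reports against bare metal, a small Swift sketch like the one below could be run in both environments. It is an assumption that the argument buffers tier and Metal3 family support are the relevant properties; the thread does not say which checks MoltenVK actually performs, and `capabilityReport()` is a hypothetical helper name.

```swift
#if canImport(Metal)
import Metal
#endif

// Hypothetical helper: one line per GPU with the capabilities it reports.
// Returns an empty array on platforms without Metal.
func capabilityReport() -> [String] {
    var lines: [String] = []
    #if canImport(Metal) && os(macOS)
    for gpu in MTLCopyAllDevices() {
        // Argument buffers tier as reported by the device.
        let tier = gpu.argumentBuffersSupport == .tier2 ? "tier2" : "tier1"
        // The .metal3 family can only be queried on macOS 13 or later.
        var metal3 = "not queryable before macOS 13"
        if #available(macOS 13.0, *) {
            metal3 = String(gpu.supportsFamily(.metal3))
        }
        lines.append("\(gpu.name): argumentBuffersSupport=\(tier), supportsFamily(.metal3)=\(metal3)")
    }
    #endif
    return lines
}

for line in capabilityReport() {
    print(line)
}
```

If the output differs between the host and the virtual machine, that would suggest the capability reporting is where the two environments diverge.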
It turns out that's pretty easy to reproduce in a virtualized environment. This code is enough to trigger the crash:
import Metal
for gpu in MTLCopyAllDevices() {
    let desc = MTLArgumentDescriptor()
    desc.dataType = .texture
    desc.index = 0
    let descs = [desc]
    let _ = gpu.makeArgumentEncoder(arguments: descs)
}
You can test it with Tart, using the commands provided on https://tart.run/quick-start/, and then running
So it would seem that the bug was just fixed on Sequoia. Of course, that program doesn't test anything beyond creating an argument buffer encoder, which is very little. Anecdotally, though, I managed to run a few vkd3d tests: there were failures, but failures are also expected on bare metal devices, and I didn't investigate whether there were additional failures which I could attribute to the paravirtualized device. Most of the tests still passed, anyway. |
Vulkan SDK Versions: 1.3.290.0, 1.3.293.0
OS: (uname -a) Darwin vm-osx-sonoma-16-g2-m1.8core-dff65d70-f2cc-4478-9109-1454c98324a3 23.5.0 Darwin Kernel Version 23.5.0: Wed May 1 20:12:39 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_VMAPPLE arm64
Compiler: Xcode 16, Apple clang version 16.0.0
This crash occurs on virtualized macOS (Bitrise CI node) when invoking vkCreateDescriptorPool for either graphics or compute pipelines.
The same code works without issue, and without validation warnings, on iOS, the iOS simulator and non-virtualized macOS.
Issue does not occur in Vulkan SDK 1.3.283.0 using the same compiler and running in the same environment.
Call stack:
Example of crashing invocation: