Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Mobile NeRF Ray Query Sample #1134

Conversation

RodrigoHolztrattner-QuIC
Copy link
Collaborator

@RodrigoHolztrattner-QuIC RodrigoHolztrattner-QuIC commented Aug 21, 2024

Description

Adds a new sample, Mobile NeRF Ray Query, which is an upgraded version of the original Mobile NeRF sample. This one uses rayqueries to accelerate the NeRF processing.

General Checklist:

Please ensure the following points are checked:

  • My code follows the coding style
  • I have reviewed file licenses
  • I have commented any added functions (in line with Doxygen)
  • I have commented any code that could be hard to understand
  • My changes do not add any new compiler warnings
  • My changes do not add any new validation layer errors or warnings
  • I have used existing framework/helper functions where possible
  • My changes do not add any regressions
  • I have tested every sample to ensure everything runs correctly
  • This PR describes the scope and expected impact of the changes I am making

Note: The Samples CI runs a number of checks including:

  • I have updated the header Copyright to reflect the current year (CI build will fail if Copyright is out of date)
  • My changes build on Windows, Linux, macOS and Android. Otherwise I have documented any exceptions

If this PR contains framework changes:

  • I did a full batch run using the batch command line argument to make sure all samples still work properly

Sample Checklist

If your PR contains a new or modified sample, these further checks must be carried out in addition to the General Checklist:

  • I have tested the sample on at least one compliant Vulkan implementation
  • If the sample is vendor-specific, I have tagged it appropriately
  • I have stated on what implementation the sample has been tested so that others can test on different implementations and platforms
  • Any dependent assets have been merged and published in downstream modules
  • For new samples, I have added a paragraph with a summary to the appropriate chapter in the readme of the folder that the sample belongs to e.g. api samples readme
  • For new samples, I have added a tutorial README.md file to guide users through what they need to know to implement code using this feature. For example, see conditional_rendering
  • For new samples, I have added a link to the Antora navigation so that the sample will be listed at the Vulkan documentation site

Signed-off-by: Rodrigo Holztrattner <[email protected]>
Signed-off-by: Rodrigo Holztrattner <[email protected]>
@RodrigoHolztrattner-QuIC
Copy link
Collaborator Author

Moving sample out from draft (for review).

Note: There are some VUID-vkCmdBuildAccelerationStructuresKHR-pInfos-03710 validation errors on AMD devices when using acceleration structures. The ray_tracing_extended sample seems to trigger this same errors as well. This might be device/driver dependent and not really an issue on more up-to-date HW/drivers.

An old and similar issue raytracing_extended sample fails with a scratch buffer alignment problem · Issue #393 · KhronosGroup/Vulkan-Samples · GitHub was supposedly tracking this problem and closed as resolved, but for me at least it, it is still present.

These errors do not appear on NVIDIA's or mobile devices in general.

@RodrigoHolztrattner-QuIC RodrigoHolztrattner-QuIC marked this pull request as ready for review August 23, 2024 00:34
Signed-off-by: Rodrigo Holztrattner <[email protected]>
@SaschaWillems SaschaWillems self-requested a review August 25, 2024 12:23
Copy link
Collaborator

@SaschaWillems SaschaWillems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike the existing mobile NeRF sample, this one does not work for me. I get this error at startup:

[error] Required device extension VK_KHR_get_physical_device_properties2 not available, cannot run
[error] Error Message: Extensions not present : ERROR_EXTENSION_NOT_PRESENT
[error] Failed when running application Mobile NeRF Ray Query

This is caused, by VK_KHR_get_physical_device_properties2 being added as a device extension, while it actually is an instance level extension.

@SaschaWillems
Copy link
Collaborator

Also mouse camera controls feel odd. Like they're inverted, compared to the other mobile NeRF sample.

@SaschaWillems
Copy link
Collaborator

The sample also trigger a lot of warnings about uninitialized variables (~75). Those need to be fixed too.

@SaschaWillems
Copy link
Collaborator

I also get these validation messages with the latest SDK:

[error] 269944751 - VUID-RuntimeSpirv-NonWritable-06340: Validation Error: [ VUID-RuntimeSpirv-NonWritable-06340 ] Object 0: handle = 0x59f7450000000038, type = VK_OBJECT_TYPE_SHADER_MODULE; | MessageID = 0x101707af | vkCreateGraphicsPipelines(): pCreateInfos[0].pStages[1] SPIR-V (VK_SHADER_STAGE_FRAGMENT_BIT) uses descriptor [Set 2, Binding 0, variable "indices_set"] (type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC) which is not marked with NonWritable, but fragmentStoresAndAtomics was not enabled. The Vulkan spec states: If fragmentStoresAndAtomics is not enabled, then all storage image, storage texel buffer, and storage buffer variables in the fragment stage must be decorated with the NonWritable decoration (https://vulkan.lunarg.com/doc/view/1.3.290.0/windows/1.3-extensions/vkspec.html#VUID-RuntimeSpirv-NonWritable-06340)
[error] 269944751 - VUID-RuntimeSpirv-NonWritable-06340: Validation Error: [ VUID-RuntimeSpirv-NonWritable-06340 ] Object 0: handle = 0x59f7450000000038, type = VK_OBJECT_TYPE_SHADER_MODULE; | MessageID = 0x101707af | vkCreateGraphicsPipelines(): pCreateInfos[0].pStages[1] SPIR-V (VK_SHADER_STAGE_FRAGMENT_BIT) uses descriptor [Set 1, Binding 0, variable "vertices_set"] (type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC) which is not marked with NonWritable, but fragmentStoresAndAtomics was not enabled. The Vulkan spec states: If fragmentStoresAndAtomics is not enabled, then all storage image, storage texel buffer, and storage buffer variables in the fragment stage must be decorated with the NonWritable decoration (https://vulkan.lunarg.com/doc/view/1.3.290.0/windows/1.3-extensions/vkspec.html#VUID-RuntimeSpirv-NonWritable-06340)

Signed-off-by: Rodrigo Holztrattner <[email protected]>
@RodrigoHolztrattner-QuIC
Copy link
Collaborator Author

Hi @SaschaWillems , I got the updated code from the team and update the PR, thanks for your feedback!

Copy link
Contributor

@asuessenbach asuessenbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit, I can't follow all that mlp-stuff...
But there are a few minor issues, and the camera movement is not consistent: moving left/right is correct but up/down is inverted.

@RodrigoHolztrattner-QuIC
Copy link
Collaborator Author

Hi @asuessenbach, I just got the updated code from the team and updated the PR, the remaining issues should all have been addressed.

asuessenbach
asuessenbach previously approved these changes Sep 17, 2024
@SaschaWillems SaschaWillems added the sample This is relevant to a sample label Sep 24, 2024
Copy link
Collaborator

@SaschaWillems SaschaWillems left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good looks good (aside from a very minor thing, see my comment), sample runs fine.

But I do get this validation error at startup:

[error] 115483881 - VUID-VkShaderModuleCreateInfo-pCode-08740: Validation Error: [ VUID-VkShaderModuleCreateInfo-pCode-08740 ] | MessageID = 0x6e224e9 | vkCreateShaderModule():  SPIR-V Capability UniformBufferArrayNonUniformIndexing was declared, but one of the following requirements is required (VkPhysicalDeviceVulkan12Features::shaderUniformBufferArrayNonUniformIndexing). The Vulkan spec states: If pCode is a pointer to SPIR-V code, and pCode declares any of the capabilities listed in the SPIR-V Environment appendix, one of the corresponding requirements must be satisfied (https://vulkan.lunarg.com/doc/view/1.3.290.0/windows/1.3-extensions/vkspec.html#VUID-VkShaderModuleCreateInfo-pCode-08740)

Looks like that feature isn't enabled.

@RodrigoHolztrattner-QuIC
Copy link
Collaborator Author

RodrigoHolztrattner-QuIC commented Oct 3, 2024

@SaschaWillems I was trying to resolve the issue below before pushing the changes to fix the validation layer error, at this point I'm starting to think it's a problem with the framework and not actually the sample, but just letting you know the latest version shouldn't have any validation errors anymore.

Gralloc4                com.khronos.vulkan_samples           E  isSupported(1, 1, 59, 1, ...) failed with 7
GraphicBufferAllocator  com.khronos.vulkan_samples           E  Failed to allocate (4 x 4) layerCount 1 format 59 usage b00: 2
AHardwareBuffer         com.khronos.vulkan_samples           E  GraphicBuffer(w=4, h=4, lc=1) failed (Unknown error -2), handle=0x0

@RodrigoHolztrattner-QuIC
Copy link
Collaborator Author

It seems other samples have the errors I mentioned above, it doesn't seem specific to this one.

asuessenbach
asuessenbach previously approved these changes Oct 10, 2024
@SaschaWillems SaschaWillems self-requested a review October 21, 2024 15:33
SaschaWillems
SaschaWillems previously approved these changes Oct 28, 2024
Copy link
Collaborator

@JoseEmilio-ARM JoseEmilio-ARM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on Arm GPUs and there seems to be an improvement (with respect to no-RQ) with the USE_OPAQUE flag, but a regression without it. Could USE_OPAQUE be the default?

@SaschaWillems
Copy link
Collaborator

Can we defer minor remarks/questions? This PR has been around for quite some time now and I think we should make sure that PRs as these get merged as fast as possible. Adding in late minor changes would require everyone to re-approve.

@JoseEmilio-ARM
Copy link
Collaborator

In this case the sample goal is to show the performance advantage of using ray query, but this is not the case (in the devices I tested), so it might require major changes, including a better solution for the device_lost issue mentioned in the comments

@RodrigoHolztrattner-QuIC
Copy link
Collaborator Author

RodrigoHolztrattner-QuIC commented Nov 4, 2024

@JoseEmilio-ARM the performance really depends on the number of objects and which device was used for testing, let me cover each topic and the device issue problem:

One or just a few objects are much faster when using the ray-query approach, we simplified the draw calls on this sample to make it easier to consume by doing as few draw commands as possible. Multiple draws, each with its own model, would significantly lower cache misses and consequentially, increase performance, while complicating the sample code.

While testing on our devices, Snapdragon 8 Gen 2 is almost in pair with the non-ray-query version, while Gen 3 and Gen 4 offers significant increased FPS then the previous generations.

For the USE_OPAQUE mode: The original models from the paper have transparent textures, but the ones we made available to the public doesn't. To ensure compatibility with some AMD devices the non-opaque mode was left there just to avoid the possible crash if one decides to use the original models. We don't mind removing the non-opaque code.

For the items where there could be actionable changes, I would prefer to create a new PR for addressing them instead of adding more changes to this one, as it's been active for a while now. I assume performance is the main point, but not only we do get faster performance with ray-query on more recent devices, it's a modern approach and likely to mature even more in the next generations of devices, providing even greater performance for NeRF processing on mobile.

gpx1000
gpx1000 previously approved these changes Nov 4, 2024
Signed-off-by: Rodrigo Holztrattner <[email protected]>
@marty-johnson59
Copy link
Contributor

Merging per discussion today. Thanks everyone!

@marty-johnson59 marty-johnson59 merged commit 96887d8 into KhronosGroup:main Nov 4, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Expedite sample This is relevant to a sample
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants