
WIP: add high-precision GPU trilinear interpolation for 3D LUTs #1794

Draft
wants to merge 1 commit into main

Conversation


@cessen cessen commented Apr 26, 2023

This addresses #1763.

The new high-precision code path can be enabled by disabling the new default-enabled OPTIMIZATION_NATIVE_GPU_TRILINEAR optimization flag.

The existing code path used the GPU's native trilinear texture interpolation function, which (although faster) quantized the lookup coordinates and could cause color banding. That's still the default, but now full-precision trilinear interpolation can optionally be used instead.

WIP

This PR is a work in progress. It functions, but it's not ready to merge yet:

  • It wasn't clear how best to get the optimization flags to the shader generation code in GetLut3DGPUShaderProgram(). I tried several approaches, but every approach ended up affecting some public API.
  • The approach I finally landed on was to store the optimization flags in GPUProcessor::Impl, and then pass them to Op::extractGpuShaderInfo() via a new argument with a default value. Of all the approaches I tried, this seemed to have the lowest impact on both the code and the API.
  • Unfortunately, this still breaks ABI compatibility (I think; I'm not totally sure how default arguments work at the ABI level in C++). Additionally, default arguments are apparently forbidden in the OCIO code base, so that's probably not the way forward regardless.

I am very much open to suggestions for a better approach to get the optimization flags to GetLut3DGPUShaderProgram().

Additionally, I still need to add unit tests for the new code path.

Performance

Some initial naive performance tests indicate that the high-precision code is notably slower than GPU-native trilinear interpolation, but about on par with OCIO's tetrahedral interpolation. More testing is needed, however: for example, with higher-resolution LUTs and on a variety of GPUs. I'll update with actual data once I've had a better go at this.

This new code path can be enabled by disabling the new default-enabled
OPTIMIZATION_NATIVE_GPU_TRILINEAR optimization flag.

The existing code path used the GPU's native trilinear texture
interpolation function, which, although faster, quantized the lookup
coordinates and could cause banding.  That's still the default,
but full-precision trilinear interpolation can optionally be used
instead.

Signed-off-by: Nathan Vegdahl <[email protected]>

linux-foundation-easycla bot commented Apr 26, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: cessen / name: Nathan Vegdahl (2fbaf75)

@cessen cessen marked this pull request as draft April 26, 2023 17:23