Implement push-constants #574

Merged
merged 34 commits into from
Sep 17, 2024
Changes from 31 commits
efcc55b
Initial implementation of push_constants
fyellin Sep 9, 2024
694410f
Merge branch 'main' into push-constant
fyellin Sep 9, 2024
e0bac8c
Initial implementation of push_constants
fyellin Sep 9, 2024
c131d72
Better handling of limits
fyellin Sep 9, 2024
ca16d0b
One more lint error.
fyellin Sep 9, 2024
5976ccd
And one more typo.
fyellin Sep 9, 2024
b1d07d0
Change limits to use hyphens
fyellin Sep 10, 2024
06adb79
Forgot to uncomment some lines
fyellin Sep 10, 2024
330a6a9
Removed a couple of more comments
fyellin Sep 10, 2024
cdc2d3c
Fix typo in comment.
fyellin Sep 10, 2024
e80d7a4
Move push_constants stuff to extras.py
fyellin Sep 10, 2024
1adb413
Fix flake and codegen
fyellin Sep 10, 2024
d425910
Fix failing test
fyellin Sep 10, 2024
d39ffb7
Linux is failing even though my Mac isn't. I have to figure out what…
fyellin Sep 10, 2024
53bc240
And one last lint problem
fyellin Sep 10, 2024
e57cb9f
First pass at documentation.
fyellin Sep 11, 2024
a87b470
Merge branch 'main' into push-constant
fyellin Sep 11, 2024
01603ba
First pass at documentation.
fyellin Sep 11, 2024
e4975d5
Merge with main
fyellin Sep 11, 2024
f42dc36
Undo accidental modification
fyellin Sep 11, 2024
64dcbeb
See
fyellin Sep 11, 2024
d681be2
Merge branch 'main' into push-constant
fyellin Sep 11, 2024
f31aaae
Found one carryover from move to 22.1 that I forgot to include.
fyellin Sep 12, 2024
6776ae4
Yikes. One more _api change
fyellin Sep 12, 2024
1294efa
Yikes. One more _api change
fyellin Sep 12, 2024
4ed2712
Merge branch 'main' into push-constant
fyellin Sep 12, 2024
3ad2868
Apply suggestions from code review
fyellin Sep 13, 2024
2249a4b
Update comments.
fyellin Sep 13, 2024
7a2b3e8
Merge branch 'main' into push-constant
fyellin Sep 13, 2024
39e52b5
Tiny change to get tests to run again.
fyellin Sep 14, 2024
20d2cec
Merge branch 'main' into push-constant
fyellin Sep 15, 2024
087c4be
Apply suggestions from code review
fyellin Sep 16, 2024
9c728c4
Merge branch 'main' into push-constant
fyellin Sep 16, 2024
6eba317
Merge branch 'main' into push-constant
Korijn Sep 17, 2024
98 changes: 98 additions & 0 deletions docs/backends.rst
@@ -59,6 +59,104 @@ The wgpu_native backend provides a few extra functionalities:
:return: Device
:rtype: wgpu.GPUDevice

The wgpu_native backend provides support for push constants.
Since WebGPU does not support this feature, documentation on its use is hard to find.
A full explanation of push constants and their use in Vulkan can be found
`here <https://vkguide.dev/docs/chapter-3/push_constants/>`_.
Using push constants in WGPU closely follows the Vulkan model.

The advantage of push constants is that they are typically faster to update than uniform buffers.
Modifications to push constants are included in the command encoder; updating a uniform
buffer involves sending a separate command to the GPU.

The disadvantage of push constants is that their size limit is much smaller. The limit
is guaranteed to be at least 128 bytes, and 256 bytes is typical.

Given an adapter, first determine if it supports push constants::

    >>> "push-constants" in adapter.features
    True

If push constants are supported, determine the maximum number of bytes that can
be allocated for push constants::

    >>> adapter.limits["max-push-constant-size"]
    256

You must tell the adapter to create a device that supports push constants,
and you must tell it the number of bytes of push constants that you are using.
Overestimating is okay::

    device = adapter.request_device(
        required_features=["push-constants"],
        required_limits={"max-push-constant-size": 256},
    )
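The feature check, the limit check, and the device request can be folded into one guard. The following is a minimal sketch, not part of wgpu-py: ``push_constant_device_args`` is a hypothetical helper, and the feature set and limits dictionary are stand-ins for ``adapter.features`` and ``adapter.limits`` so that the example runs without a GPU.

```python
def push_constant_device_args(features, limits, needed_bytes):
    """Build request_device keyword arguments for push constants.

    Hypothetical helper, not part of wgpu-py. `features` is any container of
    feature names and `limits` is a mapping of limit names to values.
    """
    if "push-constants" not in features:
        raise RuntimeError("adapter does not support push constants")
    max_size = limits["max-push-constant-size"]
    if needed_bytes > max_size:
        raise ValueError(
            f"need {needed_bytes} bytes of push constants, adapter allows {max_size}"
        )
    return {
        "required_features": ["push-constants"],
        "required_limits": {"max-push-constant-size": needed_bytes},
    }


# Stand-ins for adapter.features and adapter.limits (assumed values).
fake_features = {"push-constants", "timestamp-query"}
fake_limits = {"max-push-constant-size": 256}
device_args = push_constant_device_args(fake_features, fake_limits, 192)
# With a real adapter: device = adapter.request_device(**device_args)
```

Requesting only the bytes you need, rather than the adapter maximum, is a design choice of this sketch; as noted above, overestimating is also acceptable.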

Creating a push constant in your shader code is similar to the way you would create
a uniform buffer.
Fields used only in the ``@vertex`` shader should be kept separate from fields
used only in the ``@fragment`` shader, and both should be kept separate from fields
used in both shaders::

    struct PushConstants {
        // vertex shader
        vertex_transform: mat4x4f,
        // fragment shader
        fragment_transform: mat4x4f,
        // used in both
        generic_transform: mat4x4f,
    }
    var<push_constant> push_constants: PushConstants;

To create the pipeline layout for this shader, use
``wgpu.backends.wgpu_native.create_pipeline_layout`` instead of
``device.create_pipeline_layout``. It takes an additional argument,
``push_constant_layouts``, describing
the layout of the push constants. For the struct above::

    push_constant_layouts = [
        {"visibility": ShaderStage.VERTEX, "start": 0, "end": 64},
        {"visibility": ShaderStage.FRAGMENT, "start": 64, "end": 128},
        {"visibility": ShaderStage.VERTEX | ShaderStage.FRAGMENT, "start": 128, "end": 192},
    ]
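The ``start`` and ``end`` values are running byte offsets, so they can be derived from the size of each range instead of being typed by hand. The sketch below assumes a hypothetical helper, ``build_push_constant_layouts``, that is not part of wgpu-py. It also assumes the 4-byte alignment that Vulkan requires of push constant ranges. Visibility values are passed through untouched, so plain strings stand in for the real flags here.

```python
def build_push_constant_layouts(segments):
    """Turn (visibility, size_in_bytes) pairs into start/end layout dicts.

    Hypothetical helper, not part of wgpu-py. Ranges are laid out back to
    back in the order given, starting at byte 0.
    """
    layouts = []
    offset = 0
    for visibility, size in segments:
        # Assumption: ranges must be a positive multiple of 4 bytes, as in Vulkan.
        if size <= 0 or size % 4 != 0:
            raise ValueError("each range must be a positive multiple of 4 bytes")
        layouts.append(
            {"visibility": visibility, "start": offset, "end": offset + size}
        )
        offset += size
    return layouts


# Three 64-byte ranges, matching the three matrices in the struct above.
layouts = build_push_constant_layouts(
    [("vertex-only", 64), ("fragment-only", 64), ("both", 64)]
)
```

In real code the first element of each pair would be the visibility flag for that range rather than a descriptive string.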

Finally, you set the value of the push constant by using
``wgpu.backends.wgpu_native.set_push_constants``::

    set_push_constants(this_pass, ShaderStage.VERTEX, 0, 64, <64 bytes>)
    set_push_constants(this_pass, ShaderStage.FRAGMENT, 64, 64, <64 bytes>)
    set_push_constants(this_pass, ShaderStage.VERTEX | ShaderStage.FRAGMENT, 128, 64, <64 bytes>)

Bytes must be set separately for each of the three ranges, since each has a different
visibility. If the push constants have already been set, on the next use you only need
to call ``set_push_constants`` on those bytes you wish to change.
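One way to produce the ``<64 bytes>`` placeholder for a 4x4 float matrix is the standard ``struct`` module. This is a sketch under stated assumptions: ``pack_mat4x4f`` is a hypothetical helper, the matrix is given as four columns because WGSL matrices are column-major, and the GPU is assumed to read little-endian 32-bit floats. It also assumes that the ``data`` argument accepts a bytes-like object, which the NumPy array used in the test suggests but does not prove.

```python
import struct


def pack_mat4x4f(columns):
    """Pack four 4-float columns into 64 little-endian bytes.

    Hypothetical helper, not part of wgpu-py.
    """
    flat = [value for column in columns for value in column]
    if len(flat) != 16:
        raise ValueError("expected 4 columns of 4 floats")
    return struct.pack("<16f", *flat)


identity = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
vertex_bytes = pack_mat4x4f(identity)
# With a real render pass (not run here):
# set_push_constants(this_pass, ShaderStage.VERTEX, 0, 64, vertex_bytes)
```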

.. py:function:: wgpu.backends.wgpu_native.create_pipeline_layout(device, *, label="", bind_group_layouts, push_constant_layouts=[])

    This method provides the same functionality as :func:`wgpu.GPUDevice.create_pipeline_layout`,
    but provides an extra ``push_constant_layouts`` argument.
    When using push constants, this argument is a list of dictionaries, where each
    dictionary has three fields: ``visibility``, ``start``, and ``end``.

    :param device: The device on which we are creating the pipeline layout
    :param label: An optional label
    :param bind_group_layouts: The bind group layouts, as for :func:`wgpu.GPUDevice.create_pipeline_layout`
    :param push_constant_layouts: Described above.

.. py:function:: wgpu.backends.wgpu_native.set_push_constants(render_pass_encoder, visibility, offset, size_in_bytes, data, data_offset=0)

    This function requires that the underlying GPU implement ``push_constants``.
    These push constants are a buffer of bytes available to the ``fragment`` and ``vertex``
    shaders. They are similar to a bound buffer, but the buffer is set using this
    function call.

    :param render_pass_encoder: The render pass encoder to which we are pushing constants.
    :param visibility: The stages (vertex, fragment, or both) to which these constants are visible
    :param offset: The offset into the push constants at which the bytes are to be written
    :param size_in_bytes: The number of bytes to copy from the data
    :param data: The data to copy to the buffer
    :param data_offset: The starting offset in the data at which to begin copying.
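The relationship between ``size_in_bytes``, ``data``, and ``data_offset`` can be stated as a bounds check. The sketch below is not the library's implementation; ``check_push_constant_data`` is a hypothetical function that mirrors the two ``ValueError`` cases exercised in ``tests/test_set_constant.py``, assuming the error is raised whenever fewer than ``size_in_bytes`` bytes remain after ``data_offset``.

```python
def check_push_constant_data(data_nbytes, size_in_bytes, data_offset=0):
    """Return the bytes available after data_offset, or raise ValueError.

    Hypothetical check, not taken from wgpu-py.
    """
    if size_in_bytes < 0 or data_offset < 0:
        raise ValueError("size_in_bytes and data_offset must be non-negative")
    available = data_nbytes - data_offset
    if available < size_in_bytes:
        raise ValueError(
            f"data supplies {available} bytes after offset {data_offset}, "
            f"but {size_in_bytes} were requested"
        )
    return available


# 20 uint32 values (80 bytes): the second 40 bytes start at offset 40.
remaining = check_push_constant_data(80, 40, data_offset=40)
```

With ten 32-bit values per stage, a 36-byte buffer fails outright, and a 44-byte buffer fails once an 8-byte offset is applied, which are the two cases the test covers.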


The js_webgpu backend
---------------------
164 changes: 164 additions & 0 deletions tests/test_set_constant.py
@@ -0,0 +1,164 @@
import numpy as np
import pytest

import wgpu.utils
from tests.testutils import can_use_wgpu_lib, run_tests
from wgpu import TextureFormat
from wgpu.backends.wgpu_native.extras import create_pipeline_layout, set_push_constants

if not can_use_wgpu_lib:
    pytest.skip("Skipping tests that need the wgpu lib", allow_module_level=True)


"""
This code is an amazingly slow way of adding together two 10-element arrays of 32-bit
integers defined by push constants and storing the sums in an output buffer.

The first number of the addition is purposely pulled using the vertex stage, and the
second number from the fragment stage, so that we can ensure that we are
using stage-separated push constants correctly.

The source code assumes the topology is POINT-LIST, so that each call to vertexMain
corresponds with one call to fragmentMain.
"""
COUNT = 10

SHADER_SOURCE = (
    f"""
    const COUNT = {COUNT}u;
"""
    """
    // Put the results here
    @group(0) @binding(0) var<storage, read_write> data: array<u32, COUNT>;

    struct PushConstants {
        values1: array<u32, COUNT>, // VERTEX constants
        values2: array<u32, COUNT>, // FRAGMENT constants
    }
    var<push_constant> push_constants: PushConstants;

    struct VertexOutput {
        @location(0) index: u32,
        @location(1) value: u32,
        @builtin(position) position: vec4f,
    }

    @vertex
    fn vertexMain(
        @builtin(vertex_index) index: u32,
    ) -> VertexOutput {
        return VertexOutput(index, push_constants.values1[index], vec4f(0, 0, 0, 1));
    }

    @fragment
    fn fragmentMain(@location(0) index: u32,
                    @location(1) value: u32
    ) -> @location(0) vec4f {
        data[index] = value + push_constants.values2[index];
        return vec4f();
    }
"""
)

BIND_GROUP_ENTRIES = [
    {"binding": 0, "visibility": "FRAGMENT", "buffer": {"type": "storage"}},
]


def setup_pipeline():
    adapter = wgpu.gpu.request_adapter(power_preference="high-performance")
    device = adapter.request_device(
        required_features=["push-constants"],
        required_limits={"max-push-constant-size": 128},
    )
    output_texture = device.create_texture(
        # Actual size is immaterial. Could just be 1x1
        size=[128, 128],
        format=TextureFormat.rgba8unorm,
        usage="RENDER_ATTACHMENT|COPY_SRC",
    )
    shader = device.create_shader_module(code=SHADER_SOURCE)
    bind_group_layout = device.create_bind_group_layout(entries=BIND_GROUP_ENTRIES)
    render_pipeline_layout = create_pipeline_layout(
        device,
        bind_group_layouts=[bind_group_layout],
        push_constant_layouts=[
            {"visibility": "VERTEX", "start": 0, "end": COUNT * 4},
            {"visibility": "FRAGMENT", "start": COUNT * 4, "end": COUNT * 4 * 2},
        ],
    )
    pipeline = device.create_render_pipeline(
        layout=render_pipeline_layout,
        vertex={
            "module": shader,
            "entry_point": "vertexMain",
        },
        fragment={
            "module": shader,
            "entry_point": "fragmentMain",
            "targets": [{"format": output_texture.format}],
        },
        primitive={
            "topology": "point-list",
        },
    )
    render_pass_descriptor = {
        "color_attachments": [
            {
                "clear_value": (0, 0, 0, 0),  # only first value matters
                "load_op": "clear",
                "store_op": "store",
                "view": output_texture.create_view(),
            }
        ],
    }

    return device, pipeline, render_pass_descriptor


def test_normal_push_constants():
    device, pipeline, render_pass_descriptor = setup_pipeline()
    vertex_call_buffer = device.create_buffer(size=COUNT * 4, usage="STORAGE|COPY_SRC")
    bind_group = device.create_bind_group(
        layout=pipeline.get_bind_group_layout(0),
        entries=[
            {"binding": 0, "resource": {"buffer": vertex_call_buffer}},
        ],
    )

    encoder = device.create_command_encoder()
    this_pass = encoder.begin_render_pass(**render_pass_descriptor)
    this_pass.set_pipeline(pipeline)
    this_pass.set_bind_group(0, bind_group)

    buffer = np.random.randint(0, 1_000_000, size=(2 * COUNT), dtype=np.uint32)
    set_push_constants(this_pass, "VERTEX", 0, COUNT * 4, buffer)
    set_push_constants(this_pass, "FRAGMENT", COUNT * 4, COUNT * 4, buffer, COUNT * 4)
    this_pass.draw(COUNT)
    this_pass.end()
    device.queue.submit([encoder.finish()])
    info_view = device.queue.read_buffer(vertex_call_buffer)
    result = np.frombuffer(info_view, dtype=np.uint32)
    expected_result = buffer[0:COUNT] + buffer[COUNT:]
    assert all(result == expected_result)


def test_bad_set_push_constants():
    device, pipeline, render_pass_descriptor = setup_pipeline()
    encoder = device.create_command_encoder()
    this_pass = encoder.begin_render_pass(**render_pass_descriptor)

    def zeros(n):
        return np.zeros(n, dtype=np.uint32)

    with pytest.raises(ValueError):
        # Buffer is too short
        set_push_constants(this_pass, "VERTEX", 0, COUNT * 4, zeros(COUNT - 1))

    with pytest.raises(ValueError):
        # Buffer is too short once the data offset is applied
        set_push_constants(this_pass, "VERTEX", 0, COUNT * 4, zeros(COUNT + 1), 8)


if __name__ == "__main__":
    run_tests(globals())
34 changes: 32 additions & 2 deletions tests/test_wgpu_native_basics.py
@@ -424,18 +424,48 @@ def test_features_are_legal():
    )
    # We can also use underscore
    assert are_features_wgpu_legal(["push_constants", "vertex_writable_storage"])
    # We can also use camel case
    assert are_features_wgpu_legal(["PushConstants", "VertexWritableStorage"])


def test_features_are_illegal():
    # Lower camel case is not accepted
    assert not are_features_wgpu_legal(["pushConstants"])
    # writable is misspelled
    assert not are_features_wgpu_legal(
        ["multi-draw-indirect", "vertex-writeable-storage"]
    )
    assert not are_features_wgpu_legal(["my-made-up-feature"])


def are_limits_wgpu_legal(limits):
    """Returns true if the dictionary of limits is legal. Determining whether a specific
    set of limits is supported on a particular device would make the tests fragile,
    so we only verify that the names are legal limit names."""
    adapter = wgpu.gpu.request_adapter(power_preference="high-performance")
    try:
        adapter.request_device(required_limits=limits)
        return True
    except RuntimeError as e:
        assert "Unsupported features were requested" in str(e)
        return True
    except KeyError:
        return False


def test_limits_are_legal():
    # A standard limit. Probably exists
    assert are_limits_wgpu_legal({"max-bind-groups": 8})
    # A common extension limit
    assert are_limits_wgpu_legal({"max-push-constant-size": 128})
    # We can also use underscore
    assert are_limits_wgpu_legal({"max_bind_groups": 8, "max_push_constant_size": 128})
    # We can also use camel case
    assert are_limits_wgpu_legal({"maxBindGroups": 8, "maxPushConstantSize": 128})


def test_limits_are_not_legal():
    assert not are_limits_wgpu_legal({"max-bind-group": 8})


if __name__ == "__main__":
    run_tests(globals())

32 changes: 31 additions & 1 deletion tests_mem/testutils.py
@@ -145,7 +145,37 @@ def ob_name_from_test_func(func):


def create_and_release(create_objects_func):
    """Decorator."""
    """
    This wrapper goes around a test that takes a single argument n. That test should
    be a generator function that yields a descriptor followed by
    n different objects corresponding to the name of the test function. Hence
    a test named `test_release_foo_bar` would yield a descriptor followed by
    n FooBar objects.

    The descriptor is a dictionary with three fields, each optional.

    The keys "expected_counts_after_create" and "expected_counts_after_release" each have
    as their value a sub-dictionary giving the number of still-alive WGPU objects.
    The key "expected_counts_after_create" gives the expected state after the
    n objects have been created and put into a list; "expected_counts_after_release"
    gives the state after the n objects have been released.

    These sub-dictionaries have as their keys the names of WGPU object types, and
    their value is a tuple of two integers: the first is the number of Python objects
    expected to exist and the second is the number of native objects. Any type not in
    the sub-dictionary has an implied value of (0, 0).

    The key "ignore" has as its value a collection of object types that we should ignore
    in this test. We do not have enough information to determine how many are created
    or deleted.

    If the descriptor doesn't contain an "expected_counts_after_create", then the default
    is {"FooBar": (n, n)}, where "FooBar" is derived from the name of the test.

    If the descriptor doesn't contain an "expected_counts_after_release", then the
    default is {}, indicating that creating and removing the objects should completely
    clean itself up.
    """

    def core_test_func():
        """The core function that does the testing."""