multi_draw_indirect_counter #594

Draft
wants to merge 27 commits into
base: main
Commits
ee28841
multi_draw_indirect_counter
fyellin Sep 20, 2024
53cccea
Add documentation.
fyellin Sep 20, 2024
021241e
Bring back the bug. I accidentally indented a line a code and it ende…
fyellin Sep 20, 2024
2bd5408
Update to show everything reported by bug report. Note that "indexed…
fyellin Sep 20, 2024
7559673
Merge branch 'main' into mdi_count
fyellin Sep 22, 2024
62b1cff
Merge branch 'main' into mdi_count
fyellin Sep 24, 2024
b6565e7
Add blank line where needed.
fyellin Sep 24, 2024
f5f0b60
Update backends.rst
fyellin Sep 25, 2024
55603b8
Merge branch 'main' into mdi_count
fyellin Sep 26, 2024
611ed30
Merge branch 'main' into mdi_count
fyellin Sep 29, 2024
7f2364c
Merge branch 'main' into mdi_count
fyellin Oct 1, 2024
7f3cf38
Merge remote-tracking branch 'fyellin/mdi_count' into mdi_count
fyellin Oct 1, 2024
a91ca7e
Merge branch 'main' into mdi_count
fyellin Oct 1, 2024
6ba3a21
Merge branch 'main' into mdi_count
fyellin Oct 2, 2024
cde77ea
Merge branch 'main' into mdi_count
fyellin Oct 3, 2024
992b183
Merge branch 'main' into mdi_count
fyellin Oct 7, 2024
7713ae1
Merge branch 'main' into mdi_count
fyellin Oct 8, 2024
473c8a1
Merge branch 'main' into mdi_count
fyellin Oct 8, 2024
c9455da
Merge branch 'main' into mdi_count
fyellin Oct 17, 2024
28daa0b
Merge branch 'main' into mdi_count
fyellin Oct 22, 2024
30dde9b
Merge tag 'v0.19.0' into mdi_count
fyellin Nov 4, 2024
8c14145
Merge branch 'main' into mdi_count
fyellin Nov 5, 2024
2843620
Merge branch 'main' into mdi_count
fyellin Nov 8, 2024
719d32a
Fix codegen
fyellin Nov 8, 2024
a464d6b
Merge branch 'main' into mdi_count
fyellin Nov 23, 2024
f32a6d7
Merge branch 'main' into mdi_count
fyellin Dec 11, 2024
b8e1a78
Update codegen
fyellin Dec 11, 2024
53 changes: 49 additions & 4 deletions docs/backends.rst
@@ -159,21 +159,25 @@ bytes you wish to change.
:param data_offset: The starting offset in the data at which to begin copying.


There are two functions that allow you to perform multiple draw calls at once.
Both require that you enable the feature "multi-draw-indirect".
There are four functions that allow you to perform multiple draw calls at once.
Two take the number of draws to perform as an argument; two have this value in a buffer.

Typically, these calls do not reduce work or increase parallelism on the GPU. Rather
they reduce driver overhead on the CPU.

The first two require that you enable the feature ``"multi-draw-indirect"``.

.. py:function:: wgpu.backends.wgpu_native.multi_draw_indirect(render_pass_encoder, buffer, *, offset=0, count):

Equivalent to::
for i in range(count):
render_pass_encoder.draw_indirect(buffer, offset + i * 16)

:param render_pass_encoder: The current render pass encoder.
:param buffer: The indirect buffer containing the arguments.
:param buffer: The indirect buffer containing the arguments. Must have length
at least offset + 16 * count.
:param offset: The byte offset in the indirect buffer containing the first argument.
Must be a multiple of 4.
:param count: The number of draw operations to perform.

.. py:function:: wgpu.backends.wgpu_native.multi_draw_indexed_indirect(render_pass_encoder, buffer, *, offset=0, count):
@@ -184,10 +188,51 @@ they reduce driver overhead on the CPU.


:param render_pass_encoder: The current render pass encoder.
:param buffer: The indirect buffer containing the arguments.
:param buffer: The indirect buffer containing the arguments. Must have length
at least offset + 20 * count.
:param offset: The byte offset in the indirect buffer containing the first argument.
Must be a multiple of 4.
:param count: The number of draw operations to perform.
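The 16- and 20-byte strides above correspond to the packed argument records read by ``draw_indirect`` and ``draw_indexed_indirect``. As a rough sketch only (the helpers ``pack_draws`` and ``draw_offsets`` are hypothetical and not part of wgpu-py, and the field order assumes the standard WebGPU indirect-argument layout), the records and the byte offset of each draw could be computed like this:

```python
import struct

# Assumed little-endian layouts, following the WebGPU indirect-argument order.
# draw_indirect: vertex_count, instance_count, first_vertex, first_instance
DRAW_FORMAT = "<4I"
# draw_indexed_indirect: index_count, instance_count, first_index,
# base_vertex (signed), first_instance
DRAW_INDEXED_FORMAT = "<3IiI"


def pack_draws(draws, indexed=False):
    """Pack a list of argument tuples into contiguous bytes (hypothetical helper)."""
    fmt = DRAW_INDEXED_FORMAT if indexed else DRAW_FORMAT
    return b"".join(struct.pack(fmt, *args) for args in draws)


def draw_offsets(offset, count, indexed=False):
    """Byte offset of each draw's arguments, mirroring the loops shown above."""
    stride = struct.calcsize(DRAW_INDEXED_FORMAT if indexed else DRAW_FORMAT)
    return [offset + i * stride for i in range(count)]


data = pack_draws([(3, 1, 0, 0), (6, 2, 3, 1)])
print(len(data), draw_offsets(8, 2), draw_offsets(8, 2, indexed=True))
```

The strides come out as 16 and 20 bytes, which is why the minimum buffer lengths are stated as ``offset + 16 * count`` and ``offset + 20 * count``.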

The second two require that you enable the feature ``"multi-draw-indirect-count"``.
They are identical to the previous two, except that the ``count`` argument is replaced by
three arguments. The value at ``count_buffer_offset`` in ``count_buffer`` is treated as
an unsigned 32-bit integer. The ``count`` is the minimum of this value and ``max_count``.

.. py:function:: wgpu.backends.wgpu_native.multi_draw_indirect_count(render_pass_encoder, buffer, *, offset=0, count_buffer, count_buffer_offset=0, max_count):

Equivalent to::
count = min(<u32 at count_buffer_offset in count_buffer>, max_count)
for i in range(count):
render_pass_encoder.draw_indirect(buffer, offset + i * 16)

:param render_pass_encoder: The current render pass encoder.
:param buffer: The indirect buffer containing the arguments. Must have length
at least offset + 16 * max_count.
:param offset: The byte offset in the indirect buffer containing the first argument.
Must be a multiple of 4.
:param count_buffer: The indirect buffer containing the count.
:param count_buffer_offset: The offset into count_buffer.
Must be a multiple of 4.
:param max_count: The maximum number of draw operations to perform.

.. py:function:: wgpu.backends.wgpu_native.multi_draw_indexed_indirect_count(render_pass_encoder, buffer, *, offset=0, count_buffer, count_buffer_offset=0, max_count):

Equivalent to::
count = min(<u32 at count_buffer_offset in count_buffer>, max_count)
for i in range(count):
render_pass_encoder.draw_indexed_indirect(buffer, offset + i * 20)

:param render_pass_encoder: The current render pass encoder.
:param buffer: The indirect buffer containing the arguments. Must have length
at least offset + 20 * max_count.
:param offset: The byte offset in the indirect buffer containing the first argument.
Must be a multiple of 4.
:param count_buffer: The indirect buffer containing the count.
:param count_buffer_offset: The offset into count_buffer.
Must be a multiple of 4.
:param max_count: The maximum number of draw operations to perform.
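To make the ``count`` rule concrete, here is a CPU-side sketch of how the effective draw count is derived. It only simulates the arithmetic on plain bytes; ``resolve_count`` is a hypothetical helper, not a wgpu-py function, and the little-endian encoding is an assumption. The count values 10 and 2 mirror the count buffer used in this pull request's test.

```python
import struct


def resolve_count(count_bytes, count_buffer_offset, max_count):
    """Return min(<u32 at count_buffer_offset>, max_count) (hypothetical helper)."""
    if count_buffer_offset % 4 != 0:
        raise ValueError("count_buffer_offset must be a multiple of 4")
    # Read one unsigned 32-bit integer at the given byte offset.
    (stored,) = struct.unpack_from("<I", count_bytes, count_buffer_offset)
    return min(stored, max_count)


# A count buffer holding two u32 values: 10 at byte offset 0, 2 at byte offset 4.
count_bytes = struct.pack("<2I", 10, 2)

print(resolve_count(count_bytes, 0, max_count=2))   # stored value 10, capped by max_count
print(resolve_count(count_bytes, 4, max_count=10))  # stored value 2, below max_count
```

Both calls resolve to two draws: in the first the limit comes from ``max_count``, in the second from the value stored in the buffer.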


The js_webgpu backend
---------------------
91 changes: 84 additions & 7 deletions tests/test_wgpu_vertex_instance.py
@@ -8,16 +8,18 @@
from wgpu.backends.wgpu_native.extras import (
multi_draw_indexed_indirect,
multi_draw_indirect,
multi_draw_indirect_count,
multi_draw_indexed_indirect_count,
)

MAX_INFO = 100
MAX_INFO = 1000

if not can_use_wgpu_lib:
pytest.skip("Skipping tests that need the wgpu lib", allow_module_level=True)


"""
The fundamental informartion about any of the many draw commands is the
The fundamental information about any of the many draw commands is the
<vertex_instance, instance_index> pair that is passed to the vertex shader. By using
point-list topology, each call to the vertex shader turns into a single call to the
fragment shader, where the pair is recorded.
@@ -68,7 +70,7 @@

class Runner:
REQUIRED_FEATURES = ["indirect-first-instance"]
OPTIONAL_FEATURES = ["multi-draw-indirect"] # we'll be adding more
OPTIONAL_FEATURES = ["multi-draw-indirect", "multi-draw-indirect-count"]

@classmethod
def is_usable(cls):
@@ -82,6 +84,7 @@ def __init__(self):
*[x for x in self.OPTIONAL_FEATURES if x in adapter.features],
]
self.device = adapter.request_device(required_features=features)

self.output_texture = self.device.create_texture(
# Actual size is immaterial. Could just be 1x1
size=[128, 128],
@@ -163,11 +166,39 @@ def __init__(self):
# We're going to want to try calling these draw functions from a buffer, and it
# would be nice to test that these buffers have an offset
self.draw_data_buffer = self.device.create_buffer_with_data(
data=np.uint32([0, 0, *self.draw_args1, *self.draw_args2]),
usage="INDIRECT",
# The zeros at the beginning are to test "offset".
# The zeros at the end are because the _count methods require the buffer to
# be at least byte_offset + 16 * max_count bytes long
data=np.uint32([0, 0, *self.draw_args1, *self.draw_args2, *([0] * 50)]),
usage="INDIRECT", # copy dst for patching
)
self.draw_data_buffer_indexed = self.device.create_buffer_with_data(
data=np.uint32([0, 0, *self.draw_indexed_args1, *self.draw_indexed_args2]),
# The zeros at the beginning are to test "offset".
# The zeros at the end are because the _count methods require the buffer to
# be at least byte_offset + 20 * max_count bytes long
data=np.uint32(
[0, 0, *self.draw_indexed_args1, *self.draw_indexed_args2, *([0] * 50)]
),
usage="INDIRECT",
)

self.count_buffer = self.device.create_buffer_with_data(
data=(np.int32([10, 2])), usage="INDIRECT"
)
self.draw_data_buffer_patched = self.device.create_buffer_with_data(
# Unlike the buffers above, this one starts with [10, 2] instead of zeros; those
# two values fill the 8 bytes skipped by "offset".
# The zeros at the end are because the _count methods require the buffer to
# be at least byte_offset + 16 * max_count bytes long
data=np.uint32([10, 2, *self.draw_args1, *self.draw_args2, *([0] * 50)]),
usage="INDIRECT", # copy dst for patching
)
self.draw_data_buffer_indexed_patched = self.device.create_buffer_with_data(
# Unlike the buffers above, this one starts with [10, 2] instead of zeros; those
# two values fill the 8 bytes skipped by "offset".
# The zeros at the end are because the _count methods require the buffer to
# be at least byte_offset + 20 * max_count bytes long
data=np.uint32(
[10, 2, *self.draw_indexed_args1, *self.draw_indexed_args2, *([0] * 50)]
),
usage="INDIRECT",
)
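The 50 trailing zeros are what make these buffers large enough for the `max_count=10` case exercised by the test further down. A quick arithmetic check, assuming each non-indexed argument list holds 4 `uint32` values and each indexed one holds 5 (those lists are not shown in this excerpt), with hypothetical helper names:

```python
def min_indirect_buffer_size(offset, max_count, indexed=False):
    """Smallest buffer size in bytes implied by the documented requirement."""
    stride = 20 if indexed else 16
    return offset + stride * max_count


def padded_buffer_size(words_per_draw, num_draws=2, leading_words=2, trailing_words=50):
    """Size in bytes of a uint32 buffer laid out like the test buffers (assumed layout)."""
    return 4 * (leading_words + num_draws * words_per_draw + trailing_words)


for indexed, words in ((False, 4), (True, 5)):
    have = padded_buffer_size(words)
    need = min_indirect_buffer_size(offset=8, max_count=10, indexed=indexed)
    print(f"indexed={indexed}: have {have} bytes, need {need} bytes")
```

Under these assumptions the padded buffers are 240 and 248 bytes against requirements of 168 and 208 bytes. Without the padding, the non-indexed buffer would be only 40 bytes.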

@@ -211,7 +242,8 @@ def run_draw_test(self, draw_function, indexed, *, expected_result=None):
expected_result = self.expected_result_draw_indexed
else:
expected_result = self.expected_result_draw
assert info_set == expected_result
if info_set != expected_result:
pytest.fail(f"Expected {sorted(expected_result)}\nGot {sorted(info_set)}")


if not Runner.is_usable():
@@ -337,5 +369,50 @@ def draw(encoder):
)


@pytest.mark.parametrize("bug_patch", [False, True])
@pytest.mark.parametrize("indexed", [False, True])
@pytest.mark.parametrize("test_max_count", [False, True])
def test_multi_draw_indirect_count(runner, test_max_count, indexed, bug_patch):
if "multi-draw-indirect-count" not in runner.device.features:
pytest.skip("Must have 'multi-draw-indirect-count' to run")

print(f"{bug_patch=}, {indexed=}, {test_max_count=} \n")

if indexed:
function = multi_draw_indexed_indirect_count
if not bug_patch:
buffer = runner.draw_data_buffer_indexed
else:
buffer = runner.draw_data_buffer_indexed_patched
else:
function = multi_draw_indirect_count
if not bug_patch:
buffer = runner.draw_data_buffer
else:
buffer = runner.draw_data_buffer_patched

# Either way, we're going to do 2 draws. But one via the max_count and one via the
# information in the buffer.
if test_max_count:
# We pull a count of 10, but we're limiting it to 2 via max_count
count_buffer_offset, max_count = 0, 2
else:
# We pull a count of 2, and set the max_count to something bigger. Buffer
# is required to be big enough to handle max_count.
count_buffer_offset, max_count = 4, 10

def draw(encoder):
function(
encoder,
buffer,
offset=8,
count_buffer=runner.count_buffer,
count_buffer_offset=count_buffer_offset,
max_count=max_count,
)

runner.run_draw_test(draw, indexed)


if __name__ == "__main__":
run_tests(globals())
26 changes: 26 additions & 0 deletions wgpu/backends/wgpu_native/_api.py
@@ -3031,6 +3031,32 @@ def _multi_draw_indexed_indirect(self, buffer, offset, count):
self._internal, buffer._internal, int(offset), int(count)
)

def _multi_draw_indirect_count(
self, buffer, offset, count_buffer, count_buffer_offset, max_count
):
# H: void f(WGPURenderPassEncoder encoder, WGPUBuffer buffer, uint64_t offset, WGPUBuffer count_buffer, uint64_t count_buffer_offset, uint32_t max_count)
libf.wgpuRenderPassEncoderMultiDrawIndirectCount(
self._internal,
buffer._internal,
int(offset),
count_buffer._internal,
int(count_buffer_offset),
int(max_count),
)

def _multi_draw_indexed_indirect_count(
self, buffer, offset, count_buffer, count_buffer_offset, max_count
):
# H: void f(WGPURenderPassEncoder encoder, WGPUBuffer buffer, uint64_t offset, WGPUBuffer count_buffer, uint64_t count_buffer_offset, uint32_t max_count)
libf.wgpuRenderPassEncoderMultiDrawIndexedIndirectCount(
self._internal,
buffer._internal,
int(offset),
count_buffer._internal,
int(count_buffer_offset),
int(max_count),
)


class GPURenderBundleEncoder(
classes.GPURenderBundleEncoder,
54 changes: 50 additions & 4 deletions wgpu/backends/wgpu_native/extras.py
@@ -66,22 +66,68 @@ def set_push_constants(

def multi_draw_indirect(render_pass_encoder, buffer, *, offset=0, count):
"""
This is equvalent to
This is equivalent to
for i in range(count):
render_pass_encoder.draw_indirect(buffer, offset + i * 16)

You must enable the featue "multi-draw-indirect" to use this function.
You must enable the feature "multi-draw-indirect" to use this function.
"""
render_pass_encoder._multi_draw_indirect(buffer, offset, count)


def multi_draw_indexed_indirect(render_pass_encoder, buffer, *, offset=0, count):
"""
This is equvalent to
This is equivalent to

for i in range(count):
render_pass_encoder.draw_indexed_indirect(buffer, offset + i * 20)

You must enable the featue "multi-draw-indirect" to use this function.
You must enable the feature "multi-draw-indirect" to use this function.
"""
render_pass_encoder._multi_draw_indexed_indirect(buffer, offset, count)


def multi_draw_indirect_count(
render_pass_encoder,
buffer,
*,
offset=0,
count_buffer,
count_buffer_offset=0,
max_count,
):
"""
This is equivalent to:

count = min(<u32 at offset count_buffer_offset of count_buffer>, max_count)
for i in range(count):
render_pass_encoder.draw_indirect(buffer, offset + i * 16)

You must enable the feature "multi-draw-indirect-count" to use this function.
"""
render_pass_encoder._multi_draw_indirect_count(
buffer, offset, count_buffer, count_buffer_offset, max_count
)


def multi_draw_indexed_indirect_count(
render_pass_encoder,
buffer,
*,
offset=0,
count_buffer,
count_buffer_offset=0,
max_count,
):
"""
This is equivalent to:

count = min(<u32 at offset count_buffer_offset of count_buffer>, max_count)
for i in range(count):
render_pass_encoder.draw_indexed_indirect(buffer, offset + i * 20)

You must enable the feature "multi-draw-indirect-count" to use this function.
"""
render_pass_encoder._multi_draw_indexed_indirect_count(
buffer, offset, count_buffer, count_buffer_offset, max_count
)
6 changes: 3 additions & 3 deletions wgpu/resources/codegen_report.md
@@ -20,7 +20,7 @@
* Diffs for GPUQueue: add read_buffer, add read_texture, hide copy_external_image_to_texture
* Validated 37 classes, 112 methods, 45 properties
### Patching API for backends/wgpu_native/_api.py
* Validated 37 classes, 100 methods, 0 properties
* Validated 37 classes, 102 methods, 0 properties
## Validating backends/wgpu_native/_api.py
* Enum field FeatureName.texture-compression-bc-sliced-3d missing in wgpu.h
* Enum field FeatureName.clip-distances missing in wgpu.h
@@ -35,6 +35,6 @@
* Enum CanvasAlphaMode missing in wgpu.h
* Enum CanvasToneMappingMode missing in wgpu.h
* Wrote 236 enum mappings and 47 struct-field mappings to wgpu_native/_mappings.py
* Validated 133 C function calls
* Not using 70 C functions
* Validated 135 C function calls
* Not using 68 C functions
* Validated 81 C structs