Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default descriptor initialization (and discussion of C vs C++ callbacks) #158

Closed
1 task
mrshannon opened this issue Jun 2, 2022 · 34 comments · Fixed by #422
Closed
1 task

Default descriptor initialization (and discussion of C vs C++ callbacks) #158

mrshannon opened this issue Jun 2, 2022 · 34 comments · Fixed by #422
Labels
has resolution Issue is resolved, just needs to be done

Comments

@mrshannon
Copy link

mrshannon commented Jun 2, 2022

Many of the descriptors in WebGPU have default values, and more importantly default values which are not 0 and so a simple memset or zero initialization is not appropriate.

An example of what does not work

WGPUTextureDataLayout TextureDataLayout = {0};
TextureDataLayout.bytesPerRow = 512;

In this case rowsPerImage = 0 instead of WGPU_COPY_STRIDE_UNDEFINED.

Therefore, I propose adding default initialization defines to webgpu.h. While it is possible for each user to create their own, there are some benefits to implementing this in webgpu.h:

  • Adding fields later with default values to descriptors does not introduce undefined behavior in existing code.
  • It is possible to change default values later without breaking existing code (while WGPU_COPY_STRIDE_UNDEFINED fixes this, many defaults do not have a define as they are simply 0).
  • One source of truth

One way to implement this is to add defines for each descriptor:

#define WGPUTextureDataLayout_DEFAULT { NULL, 0, WGPU_COPY_STRIDE_UNDEFINED, WGPU_COPY_STRIDE_UNDEFINED }

This would allow users to write:

WGPUTextureDataLayout TextureDataLayout = WGPUTextureDataLayout_DEFAULT;
TextureDataLayout.bytesPerRow = 512;

TODO list:

@Kangz
Copy link
Collaborator

Kangz commented Jun 3, 2022

This would have helped us a lot in Chromium at various time when removing / deprecating members :) I could change the generator to produce these defines at the bottom of the webgpu.h (guarded with a #ifndef WGPU_SKIP_STRUCT_DEFAULT_MACROS or something).

Other folks WDYT? In particular @cwfitzgerald @kvark @jimb?

Also @mrshannon, you could use the C++ headers generated by Dawn, they have default initialization for the various structs.

@grovesNL
Copy link
Member

grovesNL commented Jun 3, 2022

This sounds reasonable to me. We use defaults for a lot of types in wgpu and find it really useful, e.g. PrimitiveState https://github.com/gfx-rs/wgpu/blob/77b2c25684f971daa977596403f01c71d32ea685/wgpu/examples/msaa-line/main.rs#L65-L69

@mrshannon
Copy link
Author

Also @mrshannon, you could use the C++ headers generated by Dawn, they have default initialization for the various structs.

Sort of off topic, but we were initially using the C++ headers but dropped them for the C headers as we kept having to cast everywhere because the callbacks always use the C types and not the C++ types. But that came with it's own issues (WGPUBufferUsage vs WGPUBufferUsageFlags), maybe it's time to consider switching back.

@Kangz
Copy link
Collaborator

Kangz commented Jun 3, 2022

Any idea on how to improve WGPUBufferUsage vs. WGPUBufferUsageFlags?

For the C++ header typedefs I'm not sure how to fix it because technically the signatures of the functions are different so if we want UBSan to be happy we would need to make an indirection (and allocation).

@mrshannon
Copy link
Author

Any idea on how to improve WGPUBufferUsage vs. WGPUBufferUsageFlags?

You technically don't need WGPUBufferUsageFlags as WGPUBufferUsage will always be wide enough for the bitwise combination because we have WGPUBufferUsage_Force32 = 0x7FFFFFFF which is effectively doing what is done for the C++ headers with:

enum class BufferUsage : uint32_t {

For the C++ header typedefs I'm not sure how to fix it because technically the signatures of the functions are different

Correct, I don't think there is either, not without allocating a std::function on the heap and destroying it later (not a good idea). But what could be done is to make it clear (without looking at implementation files) that the C and C++ enums are safe to cast by providing a casting function. In our engine we do all kinds of seemingly "unsafe" casting but it is behind a type-trait like function which always performs the "safe" cast.

I am also not sure it makes sense to have:

using BufferMapCallback = WGPUBufferMapCallback;

Because without that it's at least clear the callbacks will always be using the C API.

But again, this is way off topic (with relation to the initial issue), so perhaps we should continue this somewhere else?

@kainino0x
Copy link
Collaborator

kainino0x commented Jun 3, 2022

You technically don't need WGPUBufferUsageFlags as WGPUBufferUsage will always be wide enough for the bitwise combination because we have WGPUBufferUsage_Force32 = 0x7FFFFFFF which is effectively doing what is done for the C++ headers with:

EDIT: Actually let me split that out into a separate issue. #159

@kainino0x
Copy link
Collaborator

I split the flags discussion out to #159, since it's mostly off-topic.

@Kangz
Copy link
Collaborator

Kangz commented Jun 7, 2022

@mrshannon we could have a wgpu::FromC(...) and wgpu::ToC(...) serie of functions that convert between the C and C++ datatypes. WDYT?

@mrshannon
Copy link
Author

@mrshannon we could have a wgpu::FromC(...) and wgpu::ToC(...) serie of functions that convert between the C and C++ datatypes. WDYT?

I think that's a good idea. Makes it clear without looking at the implementation that those casts are safe.

@kainino0x
Copy link
Collaborator

For the C++ header typedefs I'm not sure how to fix it because technically the signatures of the functions are different so if we want UBSan to be happy we would need to make an indirection (and allocation).

This is kind of a dumb idea, but what if the C API had two userdata pointers, and the C++ wrapper used one for the C++ callback? I can imagine this being useful for other language bindings, as well. Would look vaguely like:

namespace wgpu {
    typedef void (*RequestDeviceCallback)(RequestDeviceStatus status, Device device,
            char const * message, void * userdata);

    void RequestDeviceWrapper(WGPURequestDeviceStatus status, WGPUDevice device,
            char const * message, void * userdata, void * binding_userdata) {
        RequestDeviceCallback callback = static_cast<RequestDeviceCallback>(binding_userdata);
        callback(static_cast<RequestDeviceStatus>(status), Device{device}, message, userdata);
    }
    void Adapter::RequestDevice(DeviceDescriptor const * descriptor, RequestDeviceCallback callback,
            void * userdata) const {
        wgpuAdapterRequestDevice(Get(), reinterpret_cast<WGPUDeviceDescriptor const * >(descriptor),
                RequestDeviceWrapper, reinterpret_cast<void * >(userdata), callback);
    }
}

@kainino0x kainino0x changed the title Default descriptor initialization Default descriptor initialization, and C vs C++ callbacks Jun 7, 2022
@mrshannon
Copy link
Author

mrshannon commented Jun 9, 2022

Here is an example of WebGPU INIT/NULL defines we use for our project, pulled from our WebGPU utility library. It's not quite the same as the defaults in the C++ API provided by Dawn/Emscripten as it's defaults are based on the JS API and includes NULL initialization (which is required for some of the C structs). But it shows some the reasons why upstreaming this is preferred, differences between implementations which we had to paper over with #ifdef.

@kainino0x
Copy link
Collaborator

I'm ever so slightly sad that our default structs aren't just all zeroes. I'm just spitballing - apparently this is the thread where I come up with silly C API ideas - but how awful would it be if instead of sentinel 0xFFFF_FFFF values we had a separate field to specify whether you've set the thing or not? (Basically a "type" field for whether it's undefined or int.)

This would also mean implementations must implement the same defaults that are specified in WebIDL, i.e. turning _Undefined values into their defaults, but I kinda think that is a good idea anyway.

@mrshannon
Copy link
Author

apparently this is the thread where I come up with silly C API ideas

I am honored to have started it.

separate field to specify whether you've set the thing or not?

One option for this is for optional numbers the type could be changed to something like:

strict OptionalUint32 {
    bool set;
    uint32_t value;
}

This way you could do:

WGPUTextureViewDescriptor desc = {0};
desc.baseArrayLayer = {true, 4};
desc.arrayLayerCount = {true, 1};

It's interesting, and much more self documenting. But it does add quite a bit of complexity when setting optional numbers.

I think the bigger issue is with the nullable structs which are not NULL at the top but are NULL because something inside them is _Undefined. To me that is the bigger issue as it makes it very hard to understand what is going on. I think I would rather see something similar to blend in WGPUColorTargetState where the sub structure is given by pointer, as that is far more self documenting.

@kainino0x
Copy link
Collaborator

kainino0x commented Jun 24, 2022

Something just occurred to me, which is that if we wanted zero initialization to give the default value then we'd need this {set, value} pattern not just for all of the number-or-undefined members, but also for all of the non-enum-type members that have non-zero default values in WebIDL (listed below). That would be kind of painful. The INIT structs are likely a better idea.

  • mipLevelCount. don't want to make 0 mean "default value" unless the JS spec changes to make 0 an exception
  • GPUTextureDescriptor.sampleCount, GPURenderPassLayout.sampleCount, GPUMultisampleState.count. same
  • maxDrawCount. same
  • GPUCanvasConfiguration.usage. same
  • maxAnisotropy. same, or change JS spec to allow 0 (by clamping it to 1)
  • lodMaxClamp. 0 is a valid value so it can't mean "default value"
  • GPUMultisampleState.mask, GPUColorTargetState.writeMask, stencilReadMask, stencilWriteMask. same
  • WGPUExtent3D.height/depthOrArrayLayers: just make these required in the C API

@beaufortfrancois
Copy link
Contributor

This is causing some confusion/issues amongst developers. Shall we resume investigation work?

@kainino0x kainino0x added the !discuss Needs discussion (at meeting or online) label Jun 30, 2023
@kainino0x
Copy link
Collaborator

I assume you're asking about the C vs C++ callbacks specifically. I'm still working on a rework of the async APIs but the callback signature should be an orthogonal decision. I'll add it to the agenda but we won't meet until 2 weeks from now, so if folks on this thread + @rajveermalviya + @cwfitzgerald can chime in maybe we can agree on either:

  • We keep it the way it is, C++ and other bindings will need an extra memory allocation somewhere to implement all of the callbacks (either one malloc/free per callback, or something cleverer)
  • We add a second userdata pointer which should be enough for most cases (and when it isn't, bindings implementation can use an extra memory allocation)

@kainino0x
Copy link
Collaborator

kainino0x commented Jun 30, 2023

I'm in favor of the second userdata pointer, described in #158 (comment)

@kainino0x
Copy link
Collaborator

The malloc/free (new/delete) version would look like this:

namespace wgpu {
namespace {

struct RequestDeviceParameters {
    RequestDeviceCallback callback;
    void * userdata;
};

void RequestDeviceWrapper(WGPURequestDeviceStatus status, WGPUDevice device,
        char const * message, void * userdata) {
    RequestDeviceParameters* p = static_cast<RequestDeviceParameters*>(userdata);
    p->callback(static_cast<RequestDeviceStatus>(status), Device{device}, message, p->userdata);
    delete p;
}

}  // namespace

void Adapter::RequestDevice(DeviceDescriptor const * descriptor, RequestDeviceCallback callback,
        void * userdata) const {
    RequestDeviceParameters* p = new RequestDeviceParameters{callback, userdata};
    wgpuAdapterRequestDevice(Get(), reinterpret_cast<WGPUDeviceDescriptor const * >(descriptor),
            RequestDeviceWrapper, p);
}

}  // namespace wgpu

@rajveermalviya
Copy link
Collaborator

Other language bindings would definitely benefit from second userdata pointer, so +1 for that.

@kainino0x
Copy link
Collaborator

today's meeting: Dawn to give this some more thought, in particular if we have an API with that accepts a C++ lambda (as a std::function) instead of a userdata pointer, the second userdata most likely doesn't even help. We should prototype that if it's something we want.

@kainino0x
Copy link
Collaborator

but no one was particularly opposed to adding the second userdata so if we think it's useful we should just add it.

In particular I'm thinking the C++ API might want to keep both the lambda and raw-function interfaces, since raw-function will be faster.

@kainino0x
Copy link
Collaborator

We may also be able to make the raw-function version nicer to use by templating the userdata so it doesn't have to always be void*.

@beaufortfrancois
Copy link
Contributor

It's great to see some activity in this space!

FYI Here's the Dawn bug.

@kainino0x
Copy link
Collaborator

kainino0x commented Jul 18, 2023

I've been sketching out code in this gist:
https://gist.github.com/kainino0x/4f04609d91111b360cb4be93822b2f68

and have a new idea:

    typedef void(*WGPUCallbackV)(WGPUSomething, size_t userdataSize, void* userdataPtr);
    void wgpuCallCallbackV(WGPUSomething, WGPUCallbackV, size_t userdataSize, void* userdataPtr);

The "V" in this prototype is for "variable sized userdata". Instead of a single pointer, we take a whole array of raw data of variable size and we'll pass it back to you. This way the allocation is handled by the webgpu.h implementation and it may be able to do it more intelligently than just new/delete.

@kainino0x
Copy link
Collaborator

@Kangz on the dawn issue:

What about alignment constraints of the copied userdata in this case? It seems a bit overkill to have a variable sized userdata when using the API in C but it could be useful for other bindings and is a more general solution. So IDK :/

Probably we can just provide a large alignment guarantee. alignof(std::max_align_t) (or naive malloc) seems to be the way.

@kainino0x
Copy link
Collaborator

meeting: discussed this and it would be best to provide support for larger alignments than max_align_t. It's possible for the C++ bindings to detect an underaligned allocation and reallocate with logic like:

if constexpr (alignof(T) > alignof(max_align_t)) {
    if (reinterpret_cast<uintptr_t>(userdataPtr) % alignof(T) != 0) {
        reallocate
    }
}

And it's also possible to static_assert(alignof(T) <= alignof(max_align_t)).
But better just make the C API do it for us. I've updated the gist.

Still, didn't come to a conclusion yet on whether this is better than just making the bindings responsible for the allocation.

@kainino0x
Copy link
Collaborator

@cwfitzgerald also raised a question about whether it's even valid to be memcpying a lambda around. I tried to research this and found that it's probably fine if it's is_trivially_copyable, but otherwise I can't find an answer. We'd need to move the object into C's raw memory and then move it back out later, which I don't know how to do in C++. And on top of that I'm not sure whether it's valid to memcpy it around so its address changes between the move-in and move-out.

It's complicated enough to know whether this is even possible that we're leaning toward just abandoning the idea for now.

The double userdata still has some value, for no-allocation entrypoints (see CallCallback2 and CallCallback2_T in the gist).

@Kangz
Copy link
Collaborator

Kangz commented Jul 22, 2023

It seems it could be possible with placement new and friends. Maybe @amaiorano would know? Here we're trying to memcpy a non-trivially copyable lambda to an allocation managed by the wgpu runtime. But maybe it could be moved?

@kainino0x
Copy link
Collaborator

It would only be possible for us to move it in C++ if the C API gave us the space to write into. If we just pass a size and pointer to the C++ object, then C is the one actually moving it. So the question becomes whether you can move the data for a non-trivially-copyable object around without calling its move constructor. It seems likely no.

@kainino0x
Copy link
Collaborator

kainino0x commented Jul 27, 2023

(EDIT: migrated to #201)

@kainino0x kainino0x added has resolution Issue is resolved, just needs to be done and removed has resolution Issue is resolved, just needs to be done labels Jul 28, 2023
@kainino0x
Copy link
Collaborator

kainino0x commented Jul 28, 2023

Moved the callback thing to a new issue #201 so I can label it correctly. Better late than never :)

@kainino0x kainino0x changed the title Default descriptor initialization, and C vs C++ callbacks Default descriptor initialization (and discussion of C vs C++ callbacks) Jul 28, 2023
@kainino0x kainino0x removed the !discuss Needs discussion (at meeting or online) label Aug 3, 2023
@kainino0x
Copy link
Collaborator

kainino0x commented Oct 12, 2023

To finally return to the original topic...

webgpu.h meeting: Yes, we should add these macros. They seem better than any other options (like trying to define a static, or calling a function like POSIX).

We would like them to be well-typed to provide decent error messages to users. @rajveermalviya suggested the macro could be like (WGPUAdapterProperties) { 0 }. However I tested this and it turns out not to be allowed in standard C++ 🙁:

<source>:5:13: warning: compound literals are a C99-specific feature [-Wc99-extensions]
    5 |     A a2 = ((A) { 4 }); // Only allowed in non-standard C++

WGPUAdapterProperties { 0 } is invalid in C and only becomes valid in C++ with C++11.

So I think to do this we'd need something like this (full sample):

#if defined(__cplusplus)
#  if __cplusplus >= 201103L
#    define WGPU_STRUCT_INIT_PREFIX(T) T
#  else
#    define WGPU_STRUCT_INIT_PREFIX(T)
#  endif
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define WGPU_STRUCT_INIT_PREFIX(T) (T)
#else
#  define WGPU_STRUCT_INIT_PREFIX(T)
#endif

typedef struct A { int x; } A;
#define A_INIT WGPU_INIT_PREFIX(A) { /* .x = */ 0 }

@kainino0x
Copy link
Collaborator

kainino0x commented Oct 24, 2023

webgpu.h meeting: we'll do the macro thing, but tweak it so we can wrap parentheses around the typed versions (Just In Case):

#if defined(__cplusplus)
#  if __cplusplus >= 201103L
#    define WGPU_STRUCT_INIT_INCANTATION(type, value) (type value)
#  else
#    define WGPU_STRUCT_INIT_INCANTATION(type, value) value
#  endif
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#  define WGPU_STRUCT_INIT_INCANTATION(type, value) ((type) value)
#else
#  define WGPU_STRUCT_INIT_INCANTATION(type, value) value
#endif

typedef struct A { int x; } A;
#define A_INIT WGPU_STRUCT_INIT_INCANTATION(A, { /* .x = */ 0 })

Strawperson names. Full sample

@kainino0x kainino0x added the has resolution Issue is resolved, just needs to be done label Oct 24, 2023
@kainino0x
Copy link
Collaborator

Dec 21 meeting

  • INIT structs with designated initializers (it doesn’t seem to be possible). Should we have “constructors” for common cases or anything..? (could be post-v1)
  • Only applies to the C API. C++ designated initializers get defaults (we’re pretty sure)
  • (Can’t think of any good solutions. Just init first then set the fields afterward)
  • This is fine, leave as is. Only a problem with the C API.

@kainino0x kainino0x linked a pull request Nov 21, 2024 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has resolution Issue is resolved, just needs to be done
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants