Major windows-bindgen update #3359

Merged
merged 25 commits into from
Dec 9, 2024

Conversation

kennykerr
Collaborator

@kennykerr kennykerr commented Dec 2, 2024

This update introduces the next major version of the windows-bindgen crate, which has been overhauled to provide first-class support for custom code generation. Historically, the windows-bindgen crate was written to generate the windows crate and then the windows-sys crate. It was later extended to support other scenarios, primarily standalone code generation. This was somewhat awkward, as the crate's design could not easily accommodate those scenarios and required some gymnastics to pull it off. This has made it difficult to fix various issues and support new scenarios.

Well this update aims to address these issues with a rewrite of the crate with an emphasis on general-purpose code generation that supports standalone code generation just as well or better than the generation of the windows and windows-sys crates. Indeed, the generation of these crates is really just a special application of the general-purpose code generator. Let's walk through a few examples to illustrate what this looks like. Consider the following build script:

fn main() {
    windows_bindgen::bindgen([
        "--in",
        "default",
        "--out",
        "src/bindings.rs",
        "--filter",
        "CoCreateGuid",
    ]);
}

This represents the general usage of the bindgen function. You'll notice that it no longer returns a Result and will simply panic with a clear message describing any issues with the arguments or environment. This highlights the fact that this is meant to be used as a build tool and so a build error makes the most sense.

The bindgen function supports various options, but three matter most.

--in can indicate a .winmd file or a directory containing .winmd files. Alternatively, the special "default" input can be used to select the particular .winmd files that ship with the windows-bindgen crate. Previously this "default" metadata was included using a crate feature, but this new approach should be far less error-prone and more predictable. If no input is provided then the "default" input is assumed.

--out indicates where the bindgen function will write the bindings. This is typically a source file like src/bindings.rs but can also point to a directory when generating a crate like windows or windows-sys.

--filter indicates what bindings to include or exclude as it's unusual to generate bindings for everything described by the input .winmd files.
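As a rough sketch of how these three options fit together, the helper below assembles the argument list for a build script. The `bindgen_args` function is hypothetical and not part of windows-bindgen. It also assumes that the bindgen function accepts any sequence of string-like arguments, as the array literal in the build script suggests.

```rust
// Hypothetical helper, not part of windows-bindgen: builds the argument list
// described above. `input` is optional because the text says that the
// "default" input is assumed when no `--in` is given.
fn bindgen_args(input: Option<&str>, output: &str, filters: &[&str]) -> Vec<String> {
    let mut args = Vec::new();

    if let Some(input) = input {
        args.push("--in".to_string());
        args.push(input.to_string());
    }

    args.push("--out".to_string());
    args.push(output.to_string());

    // A single `--filter` flag is followed by one or more names.
    args.push("--filter".to_string());
    args.extend(filters.iter().map(|f| f.to_string()));

    args
}

fn main() {
    let args = bindgen_args(None, "src/bindings.rs", &["CoCreateGuid"]);
    // In a real build script the list would be handed to the generator, e.g.
    // `windows_bindgen::bindgen(args);` (assumed to accept such a list).
    println!("{:?}", args);
}
```

Keeping the arguments in one place like this is only a convenience. Passing the array literal directly works just as well.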

The next thing you'll notice, at least if you've used previous versions of the windows-bindgen crate, is that the filter for the CoCreateGuid function simply includes the function name and nothing more. In previous versions you had to spell out the full path in order for the bindgen function to find the metadata.

bindgen([
    "--in",
    "default",
    "--out",
    "src/bindings.rs",
    "--filter",
    "Windows.Win32.System.Com.CoCreateGuid", // <==
]);

The full path continues to work, but the namespace hierarchy used by the .winmd files is meaningless to many projects. Accepting the bare name saves developers from having to figure out what these paths might be. If you happen to know the name of the function or type that you need then you can usually just type it in and see if the bindings can be generated correctly. Either way, the bindings will be generated the same way. There are however cases where a name might be ambiguous and then you may have to spell things out to resolve the ambiguity.
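To make the idea of short names and ambiguity concrete, here is a small model of how a filter string could be matched against full metadata paths. It is only an illustration and not how windows-bindgen is implemented. The `resolve` function, the `Resolution` enum, and the `Contoso.Geometry.Point` path are all hypothetical.

```rust
// Illustrative model only, not the windows-bindgen implementation.
#[derive(Debug, PartialEq)]
enum Resolution {
    Found(String),
    Ambiguous(Vec<String>),
    NotFound,
}

// Matches `filter` against the known full paths. A filter containing a '.'
// is treated as a full path; otherwise only the last path segment is compared.
fn resolve(filter: &str, known: &[&str]) -> Resolution {
    let matches: Vec<String> = known
        .iter()
        .filter(|path| {
            if filter.contains('.') {
                **path == filter
            } else {
                path.rsplit('.').next() == Some(filter)
            }
        })
        .map(|path| path.to_string())
        .collect();

    match matches.len() {
        0 => Resolution::NotFound,
        1 => Resolution::Found(matches[0].clone()),
        _ => Resolution::Ambiguous(matches),
    }
}

fn main() {
    let known = [
        "Windows.Win32.System.Com.CoCreateGuid",
        "Windows.Foundation.Point",
        "Contoso.Geometry.Point", // hypothetical third-party type
    ];

    println!("{:?}", resolve("CoCreateGuid", &known));
    println!("{:?}", resolve("Point", &known));
    println!("{:?}", resolve("Windows.Foundation.Point", &known));
}
```

In this model the bare name `Point` is ambiguous, and spelling out `Windows.Foundation.Point` resolves it.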

Here's what the generated bindings might look like for the CoCreateGuid filter:

// Bindings generated by `windows-bindgen`

#![allow(
    non_snake_case,
    non_upper_case_globals,
    non_camel_case_types,
    dead_code,
    clippy::all
)]

pub mod Windows {
    pub mod Win32 {
        pub mod System {
            pub mod Com {
                #[inline]
                pub unsafe fn CoCreateGuid() -> windows_core::Result<windows_core::GUID> {
                    windows_targets::link!("ole32.dll" "system" fn CoCreateGuid(pguid : *mut windows_core::GUID) -> windows_core::HRESULT);
                    let mut result__ = core::mem::zeroed();
                    CoCreateGuid(&mut result__).map(|| core::mem::transmute(result__))
                }
            }
        }
    }
}

As you can see, the bindings still default to using the namespace-to-module mapping but it's easy to turn that off as well simply by using the --flat option:

bindgen([
    "--in",
    "default",
    "--out",
    "src/bindings.rs",
    "--filter",
    "CoCreateGuid",
    "--flat" // <==
]);

Previously, this flat output required both of the wordier "--config", "flatten" arguments and didn't work too well in more complex scenarios. And here's the resulting flat output:

// Bindings generated by `windows-bindgen`

#![allow(
    non_snake_case,
    non_upper_case_globals,
    non_camel_case_types,
    dead_code,
    clippy::all
)]

#[inline]
pub unsafe fn CoCreateGuid() -> windows_core::Result<windows_core::GUID> {
    windows_targets::link!("ole32.dll" "system" fn CoCreateGuid(pguid : *mut windows_core::GUID) -> windows_core::HRESULT);
    let mut result__ = core::mem::zeroed();
    CoCreateGuid(&mut result__).map(|| core::mem::transmute(result__))
}

You can also control whether the bindings include the comment and allow attribute with the --no-comment and --no-allow options.

bindgen([
    ...
    "--no-comment",
    "--no-allow",
]);

The other interesting option is --sys which instructs the bindgen function to generate raw, commonly known as sys-style, Rust bindings. Previously, this output required both of the wordier "--config", "sys" arguments.

bindgen([
    ...
    "--filter",
    "CoCreateGuid",
    "--flat",
    "--sys", // <==
]);

Here's how the CoCreateGuid function is generated in this case:

windows_targets::link!("ole32.dll" "system" fn CoCreateGuid(pguid : *mut windows_sys::core::GUID) -> windows_sys::core::HRESULT);

You'll notice that the bindings are simpler as there's no wrapper function and "core" types like GUID and HRESULT are now provided by the windows_sys::core module rather than the richer windows_core module that is used by the windows crate. This illustrates another difference that was difficult to reconcile in the previous version of the windows-bindgen crate. Previously, you could have references to the windows crate but not to the windows-sys crate. This caused all kinds of problems for sys-style bindings of custom or private APIs. Of course, there may still be cases where a dependency on either the windows or windows-sys crate is undesirable. For that you have two options. The first is to use the new --no-core option as follows:

bindgen([
    ...
    "--filter",
    "CoCreateGuid",
    "--flat",
    "--sys",
    "--no-core", // <==
]);

This instructs the bindgen function to avoid dependencies for these "core" types and instead include them directly in the generated bindings. Here's what the resulting output looks like:

windows_targets::link!("ole32.dll" "system" fn CoCreateGuid(pguid : *mut GUID) -> HRESULT);
#[repr(C)]
#[derive(Clone, Copy)]
pub struct GUID {
    pub data1: u32,
    pub data2: u16,
    pub data3: u16,
    pub data4: [u8; 8],
}
impl GUID {
    pub const fn from_u128(uuid: u128) -> Self {
        Self {
            data1: (uuid >> 96) as u32,
            data2: (uuid >> 80 & 0xffff) as u16,
            data3: (uuid >> 64 & 0xffff) as u16,
            data4: (uuid as u64).to_be_bytes(),
        }
    }
}
pub type HRESULT = i32;

As you can see, the bindgen function detects that the CoCreateGuid function depends on GUID and HRESULT and includes their definitions directly. At this point, the --no-core option requires the --sys option. I may revisit this in future but the richer non-sys core type definitions are not very easy to inline.
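The inlined GUID definition is self-contained, so it can be exercised without any other crate. The sketch below copies the struct and its from_u128 constructor from the output above and shows how a 128-bit value is split across the four fields. The Debug and PartialEq derives and the sample value are additions for the sake of the demonstration.

```rust
// Copied from the generated output above. `Debug` and `PartialEq` are added
// here only to make the demonstration easier to inspect.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct GUID {
    pub data1: u32,
    pub data2: u16,
    pub data3: u16,
    pub data4: [u8; 8],
}

impl GUID {
    pub const fn from_u128(uuid: u128) -> Self {
        Self {
            // Top 32 bits.
            data1: (uuid >> 96) as u32,
            // Next two 16-bit groups.
            data2: (uuid >> 80 & 0xffff) as u16,
            data3: (uuid >> 64 & 0xffff) as u16,
            // Low 64 bits, stored in big-endian byte order.
            data4: (uuid as u64).to_be_bytes(),
        }
    }
}

fn main() {
    // Arbitrary sample value chosen so each field is easy to recognize.
    let guid = GUID::from_u128(0x00112233_4455_6677_8899_aabbccddeeff);
    println!("{:08x}-{:04x}-{:04x}", guid.data1, guid.data2, guid.data3);
    println!("{:02x?}", guid.data4);
}
```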

The other option, which applies equally well to sys and non-sys style bindings is to leverage the Rust language to inject your own definitions for such core types. Let's remove the --no-core option as follows:

bindgen([
    ...
    "--filter",
    "CoCreateGuid",
    "--flat",
    "--sys",
]);

The resulting bindings thus once again look like this:

windows_targets::link!("ole32.dll" "system" fn CoCreateGuid(pguid : *mut windows_sys::core::GUID) -> windows_sys::core::HRESULT);

The simplest thing to do here is to add a dependency for the windows-targets crate as well as the windows-sys crate and then these bindings will compile just fine. Alternatively, you can provide a local definition of windows_sys that redirects to a local definition for these core types. You can do this in various ways but one simple approach is to create an alias for windows_sys in your lib.rs file as follows:

mod bindings; // <== generated bindings

mod core {
    #[repr(C)]
    #[derive(Clone, Copy)]
    pub struct GUID {
        pub data1: u32,
        pub data2: u16,
        pub data3: u16,
        pub data4: [u8; 8],
    }

    pub type HRESULT = i32;
}

extern crate self as windows_sys;

Here I've simply defined a mini core module with the necessary type definitions and an alias for windows_sys that points right back to the current crate. You can do something similar to avoid a dependency on the windows_targets crate if so desired.

Previous versions of the windows-bindgen crate were not able to track down dependencies reliably under specific conditions. In the mode that generated crates like windows and windows-sys it was assumed that dependencies would be discovered at compile time in other modules depending on the inclusion of Cargo features. In the mode that generated standalone bindings it would generate dependencies but only if certain other options lined up just right. The new version of the windows-bindgen crate should now reliably track down all dependencies in all cases and regardless of the kind of output it produces.
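One way to picture "tracking down all dependencies" is as a transitive closure over the types that each requested type refers to. The sketch below is a simplified model under that assumption and not the crate's actual implementation. The `closure` function and the shape of the `deps` map are hypothetical. The type names are borrowed from the struct example that follows.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Illustrative model: `deps` maps a type name to the types its fields refer
// to. Starting from the filtered names, everything reachable is collected.
fn closure<'a>(roots: &[&'a str], deps: &BTreeMap<&'a str, Vec<&'a str>>) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<&str> = roots.to_vec();

    while let Some(name) = stack.pop() {
        // `insert` returns false when the name was already visited.
        if seen.insert(name.to_string()) {
            if let Some(children) = deps.get(name) {
                stack.extend(children.iter().cloned());
            }
        }
    }

    seen
}

fn main() {
    let mut deps = BTreeMap::new();
    deps.insert("InkTrailPoint", vec!["Point"]);
    deps.insert("Point", vec![]);

    // Only "InkTrailPoint" is requested, yet "Point" is pulled in as well.
    println!("{:?}", closure(&["InkTrailPoint"], &deps));
}
```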

Let's consider a struct example first and generate bindings for the InkTrailPoint struct.

bindgen([
    ...
    "--sys",
    "--filter",
    "InkTrailPoint", // <==
]);

I didn't include the --flat option so the output looks like this:

pub mod Windows {
    pub mod Foundation {
        #[repr(C)]
        #[derive(Clone, Copy)]
        pub struct Point {
            pub X: f32,
            pub Y: f32,
        }
    }
    pub mod UI {
        pub mod Composition {
            #[repr(C)]
            #[derive(Clone, Copy)]
            pub struct InkTrailPoint {
                pub Point: super::super::Foundation::Point,
                pub Radius: f32,
            }
        }
    }
}

You can observe that the InkTrailPoint struct was defined in the "Windows.UI.Composition" namespace in the input .winmd file and that it has a Point field whose type is defined in the "Windows.Foundation" namespace. The bindgen function is careful to generate both module hierarchies and the super::super:: path to refer from the one to the other. All this happened despite the fact that the bindgen filter only mentioned "InkTrailPoint". We can of course use the --flat option to ignore the hierarchy:

bindgen([
    ...
    "--sys",
    "--filter",
    "InkTrailPoint",
    "--flat", // <==
]);

Now the output looks like this:

#[repr(C)]
#[derive(Clone, Copy)]
pub struct Point {
    pub X: f32,
    pub Y: f32,
}
#[repr(C)]
#[derive(Clone, Copy)]
pub struct InkTrailPoint {
    pub Point: Point,
    pub Radius: f32,
}
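The super::super:: path in the nested output can be derived mechanically from the two namespaces. The following sketch shows one way to do that, assuming namespaces map one-to-one onto nested modules. The `relative_path` function is hypothetical and is not taken from windows-bindgen.

```rust
// Illustrative only: computes the relative Rust path from the module for
// `from_namespace` to a type that lives in the module for `to_namespace`.
fn relative_path(from_namespace: &str, to_namespace: &str, type_name: &str) -> String {
    let from: Vec<&str> = from_namespace.split('.').collect();
    let to: Vec<&str> = to_namespace.split('.').collect();

    // Number of leading namespace segments that both sides share.
    let common = from
        .iter()
        .zip(to.iter())
        .take_while(|(a, b)| a == b)
        .count();

    let mut parts: Vec<&str> = Vec::new();

    // Climb out of every module that is not shared...
    for _ in common..from.len() {
        parts.push("super");
    }

    // ...then descend into the remaining target modules.
    parts.extend(to[common..].iter().cloned());
    parts.push(type_name);
    parts.join("::")
}

fn main() {
    let path = relative_path("Windows.UI.Composition", "Windows.Foundation", "Point");
    println!("{}", path); // prints "super::super::Foundation::Point"
}
```

With the --flat option there is no hierarchy to climb, which is why the flat output refers to the field type simply as Point.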

Dependencies get a lot more interesting when it comes to COM or WinRT style interfaces and classes and particularly class or interface hierarchies as those dependencies can quickly add up. Often, however, those dependencies can be unrelated or at least uninteresting to a given project and the added burden of that cascade of dependencies can be undesirable. As such, the bindgen function will only include interface methods whose signatures refer to types that are themselves included in the --filter option.

Consider the following example using the IAsyncInfo interface:

bindgen([
    ...
    "--flat",
    "--filter",
    "IAsyncInfo", // <==
]);

Now the IAsyncInfo interface actually has five methods, namely Id, Status, ErrorCode, Cancel, and Close, but the resulting output omits the Status method. Here's a simplified version of the output for clarity:

impl IAsyncInfo {
    pub fn Id(&self) -> windows_core::Result<u32> {
        ...
    }
    pub fn ErrorCode(&self) -> windows_core::Result<windows_core::HRESULT> {
        ...
    }
    pub fn Cancel(&self) -> windows_core::Result<()> {
        ...
    }
    pub fn Close(&self) -> windows_core::Result<()> {
        ...
    }
}

The methods that are included are those that don't depend on anything other than intrinsics or core types. Other methods are simply omitted. But what if you need the Status method? Simply include the types needed by that method in the --filter option as follows:

bindgen([
    ...
    "--flat",
    "--filter",
    "IAsyncInfo",
    "AsyncStatus", // <==
]);

And just like that the extra Status method appears as its dependency on AsyncStatus is now available.

impl IAsyncInfo {
    pub fn Id(&self) -> windows_core::Result<u32> {
        ...
    }
    pub fn Status(&self) -> windows_core::Result<AsyncStatus> { // <==
        ...
    }
    pub fn ErrorCode(&self) -> windows_core::Result<windows_core::HRESULT> {
        ...
    }
    pub fn Cancel(&self) -> windows_core::Result<()> {
        ...
    }
    pub fn Close(&self) -> windows_core::Result<()> {
        ...
    }
}

#[repr(transparent)]
#[derive(Clone, Copy, Debug, Default, Eq, PartialEq)]
pub struct AsyncStatus(pub i32);
impl AsyncStatus {
    pub const Canceled: Self = Self(2i32);
    pub const Completed: Self = Self(1i32);
    pub const Error: Self = Self(3i32);
    pub const Started: Self = Self(0i32);
}
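The inclusion rule for interface methods can be modeled in a few lines. The sketch below is a simplified illustration of the rule as described in this PR and not the generator's real data model. The `Method` struct, its `requires` field, and the `included_methods` function are hypothetical.

```rust
// Illustrative model of the rule described above.
struct Method {
    name: &'static str,
    // Types in the signature other than intrinsics and core types.
    requires: &'static [&'static str],
}

// A method is kept only when every non-core type it mentions is in the filter.
fn included_methods(methods: &[Method], filter: &[&str]) -> Vec<&'static str> {
    methods
        .iter()
        .filter(|m| m.requires.iter().all(|t| filter.contains(t)))
        .map(|m| m.name)
        .collect()
}

// The five IAsyncInfo methods; only Status needs a non-core type.
fn iasyncinfo() -> Vec<Method> {
    vec![
        Method { name: "Id", requires: &[] },
        Method { name: "Status", requires: &["AsyncStatus"] },
        Method { name: "ErrorCode", requires: &[] },
        Method { name: "Cancel", requires: &[] },
        Method { name: "Close", requires: &[] },
    ]
}

fn main() {
    let methods = iasyncinfo();
    println!("{:?}", included_methods(&methods, &["IAsyncInfo"]));
    println!("{:?}", included_methods(&methods, &["IAsyncInfo", "AsyncStatus"]));
}
```

The first call leaves out Status, and the second includes it once AsyncStatus is part of the filter.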

In some cases you may not want to generate the dependencies directly but instead defer to some other crate (or module) to provide those definitions. That's where the new --reference option comes in handy. Previously, the windows-bindgen crate would only allow dependencies on the windows crate. Now this is completely controlled by the --reference option. Let's imagine that AsyncStatus is super complicated and you'd rather get that dependency from the windows crate. This is a silly example but you can easily imagine how this can be applied to larger dependencies like Direct3D or third-party dependencies. Instead of simply including AsyncStatus in the --filter option, you can instead use the --reference option to indicate where and how to resolve the type. Here's an example:

bindgen([
    ...
    "--flat",
    "--filter",
    "IAsyncInfo",
    "--reference", "windows,skip-root,AsyncStatus", // <==
]);

This is a rather powerful feature with a great deal of flexibility so it may seem a little confusing at first but it's actually quite simple. Here we're saying that the windows crate should resolve any references to the AsyncStatus type using the "skip-root" path style. The reason for this should quickly become apparent. Here's what the output looks like now:

impl IAsyncInfo {
    pub fn Id(&self) -> windows_core::Result<u32> {
        ...
    }
    pub fn Status(&self) -> windows_core::Result<windows::Foundation::AsyncStatus> {
        ...
    }
    pub fn ErrorCode(&self) -> windows_core::Result<windows_core::HRESULT> {
        ...
    }
    pub fn Cancel(&self) -> windows_core::Result<()> {
        ...
    }
    pub fn Close(&self) -> windows_core::Result<()> {
        ...
    }
}

As you can see, the Status method now refers to AsyncStatus as defined in the Foundation module within the windows crate. The skip-root path style is in reference to the fact that both the windows and windows-sys crates skip the root "Windows" namespace or module when generating their type hierarchies. Depending on how the target crate was generated you may also use the "full" or "flat" path styles instead. Obviously, you can use the --reference option to generate dependencies to crates of your own. Conversely, you can use the new windows-bindgen crate to generate those crates!
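The effect of the three path styles can be sketched as a small function. Only the skip-root behavior is spelled out in this PR. The results shown for "full" and "flat" are assumptions based on their names, and the `PathStyle` enum and `reference_path` function are hypothetical.

```rust
// Assumed meanings: Full keeps every namespace segment, SkipRoot drops the
// first one (as described above), and Flat drops all of them.
#[derive(Clone, Copy)]
enum PathStyle {
    Full,
    SkipRoot,
    Flat,
}

// Builds the Rust path used to refer to `type_name` in the crate `krate`.
fn reference_path(krate: &str, style: PathStyle, namespace: &str, type_name: &str) -> String {
    let segments: Vec<&str> = namespace.split('.').collect();

    let kept: &[&str] = match style {
        PathStyle::Full => &segments[..],
        // `split` always yields at least one segment, so this cannot panic.
        PathStyle::SkipRoot => &segments[1..],
        PathStyle::Flat => &[],
    };

    let mut parts = vec![krate];
    parts.extend(kept.iter().cloned());
    parts.push(type_name);
    parts.join("::")
}

fn main() {
    let styles = [PathStyle::Full, PathStyle::SkipRoot, PathStyle::Flat];
    for style in styles.iter() {
        println!(
            "{}",
            reference_path("windows", *style, "Windows.Foundation", "AsyncStatus")
        );
    }
}
```

Under these assumptions, skip-root yields windows::Foundation::AsyncStatus, which matches the Status signature in the output above.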

Naturally, you can use the --reference option to generate references to entire namespaces and not just specific types. The following option will work just as well and also resolve references to any other types coming from the given "Windows.Foundation" namespace.

bindgen([
    ...
    "--filter",
    "IAsyncInfo",
    "--reference",
    "windows,skip-root,Windows.Foundation", // <==
]);

Now let's look at a few more new options and features supported by the windows-bindgen crate.

The --derive option allows you to indicate extra traits that you would like specific types to derive. Imagine you need the RectF and RectInt32 structs so you generate them as follows:

bindgen([
    ...
    "--flat",
    "--sys",
    "--filter",
    "RectF",
    "RectInt32",
]);

The resulting output looks like this:

#[repr(C)]
#[derive(Clone, Copy)]
pub struct RectInt32 {
    pub X: i32,
    pub Y: i32,
    pub Width: i32,
    pub Height: i32,
}
#[repr(C)]
#[derive(Clone, Copy)]
pub struct RectF {
    pub X: f32,
    pub Y: f32,
    pub Width: f32,
    pub Height: f32,
}

Now you'd like to have the RectInt32 struct implement the Eq and PartialEq traits while the RectF struct should only implement the PartialEq trait, since it contains floating-point fields and Eq can't be derived for those. Here's how you might do that with the new --derive option:

bindgen([
    ...
    "--filter",
    "RectF",
    "RectInt32",
    "--derive",
    "RectF=PartialEq",
    "RectInt32=PartialEq,Eq",
]);

And the resulting output looks like this:

#[repr(C)]
#[derive(Clone, Copy, Eq, PartialEq)]
pub struct RectInt32 {
    pub X: i32,
    pub Y: i32,
    pub Width: i32,
    pub Height: i32,
}
#[repr(C)]
#[derive(Clone, Copy, PartialEq)]
pub struct RectF {
    pub X: f32,
    pub Y: f32,
    pub Width: f32,
    pub Height: f32,
}
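The reason RectF stops at PartialEq is that f32 itself does not implement Eq, because NaN is not equal to itself. The sketch below reuses the two generated structs to show the difference. The `is_reflexive` helper and the `#[allow(non_snake_case)]` attributes are additions for this demonstration.

```rust
#[allow(non_snake_case)]
#[repr(C)]
#[derive(Clone, Copy, Eq, PartialEq)]
pub struct RectInt32 {
    pub X: i32,
    pub Y: i32,
    pub Width: i32,
    pub Height: i32,
}

#[allow(non_snake_case)]
#[repr(C)]
#[derive(Clone, Copy, PartialEq)]
pub struct RectF {
    pub X: f32,
    pub Y: f32,
    pub Width: f32,
    pub Height: f32,
}

// Hypothetical helper: only accepts types that promise total equality (`Eq`).
fn is_reflexive<T: Eq + Copy>(value: T) -> bool {
    value == value
}

fn main() {
    let int_rect = RectInt32 { X: 1, Y: 2, Width: 3, Height: 4 };
    println!("{}", is_reflexive(int_rect));

    // `is_reflexive(float_rect)` would not compile because RectF is not `Eq`.
    let float_rect = RectF { X: f32::NAN, Y: 0.0, Width: 1.0, Height: 1.0 };
    println!("{}", float_rect == float_rect); // prints "false" because of NaN
}
```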

The --rustfmt option allows you to override the default Rust formatting as follows:

bindgen([
    ...
    "--rustfmt",
    "newline_style=Unix"
]);

The formatting options can be found here: https://rust-lang.github.io/rustfmt

Well there's a lot more I could say about this update but I think I'll stop before this becomes too long to read. 😉

@kennykerr
Collaborator Author

kennykerr commented Dec 2, 2024

@riverar @ChrisDenton @wravery - would love your feedback.

A few notes on housekeeping and changes.

The undocumented and largely internal windows-metadata crate has been rolled into the windows-bindgen crate. dcfbae4

The MSRV for most crates has now moved to 1.74 to avoid some compiler bugs in previous versions. 94e1288

The BSTR type was mistakenly considered nullable/optional by previous versions of windows-bindgen. That has been fixed and mainly means that you can't use None in place of an empty BSTR. 8b10d63

The lib generator now detects import functions that aren't available on x86. e5c2492

The "bindgen" tool/test crates now provide a much simpler way to test code generation. The older "standalone" tool/test still exists but will be phased out. e9772d6

@riverar
Collaborator

riverar commented Dec 2, 2024

[...] the bindgen function will only include interface methods whose signatures refer to types that are themselves included in the --filter option.

Hmm, am concerned about this being the default behavior. It isn't easy to discover what's needed for a complete interface projection. (I haven't run the new bindgen yet, perhaps there's already a warning that is output?)

@kennykerr
Collaborator Author

kennykerr commented Dec 2, 2024

bindgen function will only include interface methods whose signatures refer to types that are themselves included

@riverar if it's any consolation this never worked before - it would blindly add all methods and then produce invalid code gen unless you manually chased down all the dependencies. We can add such an option but my attempts showed that it would be too expensive to do by default e.g. any method that mentions Xaml in any way can easily add a million lines of code... 😉 As it stands, it's pretty easy to see what methods are omitted since the vtable slots are named and padded so you can tell the "Status" method is missing.

@ChrisDenton
Collaborator

I've been trying it out and so far it works great, though I've not yet tried all the new features. Only minor issue is that the output sort order appears to have changed. E.g. here's the diff when updating the standard library to the new bindgen: rust-lang/rust@1d83a16. That said, it's not a big deal if it's a one time thing with this update.

Also, if --in is not provided, could it just fall back to working as though --in default was specified rather than erroring? I figure that's going to be the most common scenario.

@kennykerr
Collaborator Author

Yes, I've been fiddling with the sort order. It's a little awkward as it's primarily based on the PartialOrd that is derived for the Type enum so I may just write a custom implementation to make it a bit more natural.

Yes, falling back to --in default seems reasonable.

@kennykerr
Collaborator Author

kennykerr commented Dec 3, 2024

"default" input is now the default only if no other input is provided. This is implemented and tested here: fca2260

The output is now sorted as follows:

  1. functions are placed first
  2. then type name
  3. then type namespace
  4. then architecture overloads
  5. then type overloads

This is implemented here: 01763ef
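The ordering listed above can be expressed as a comparison on a tuple key. The sketch below is a simplified model and not the code from the referenced commit. The `Item` struct, its fields, and the `sort_items` function are hypothetical stand-ins for whatever the generator tracks internally.

```rust
// Illustrative model of the sort order described above.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Item {
    is_function: bool,
    name: &'static str,
    namespace: &'static str,
    arch: u8,      // hypothetical architecture identifier
    overload: u32, // hypothetical overload index
}

impl Item {
    // Tuples compare field by field, so the key mirrors the listed order.
    // `!is_function` is false for functions, and false sorts before true.
    fn sort_key(&self) -> (bool, &'static str, &'static str, u8, u32) {
        (!self.is_function, self.name, self.namespace, self.arch, self.overload)
    }
}

fn sort_items(items: &mut Vec<Item>) {
    items.sort_by(|a, b| a.sort_key().cmp(&b.sort_key()));
}

fn main() {
    let mut items = vec![
        Item { is_function: false, name: "Point", namespace: "Windows.Foundation", arch: 0, overload: 0 },
        Item { is_function: true, name: "CoCreateGuid", namespace: "Windows.Win32.System.Com", arch: 0, overload: 0 },
        Item { is_function: false, name: "AsyncStatus", namespace: "Windows.Foundation", arch: 0, overload: 0 },
    ];

    sort_items(&mut items);

    for item in items.iter() {
        println!("{}", item.name);
    }
}
```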

@kennykerr
Collaborator Author

@ChrisDenton you should now find that it is closer to what you had before in terms of sort order. The functions are no longer sorted by library so that might be noticeable but expected. If you have types with different target_arch you may find their order changes but that's because that order is now stable whereas before it was arbitrary.

@ChrisDenton
Collaborator

I was just about to say those are the exact two differences I see: rust-lang/rust@097e213. Altogether a much more manageable diff, thanks!

@kennykerr
Collaborator Author

Sweet - thanks for testing!

@robmikh
Member

robmikh commented Dec 3, 2024

I have a WinRT component that I've been moving to the new bindgen crate, and there are some differences from what was published before that might be worth calling out (even if by design). Additionally, there's a snag I haven't been able to figure out.
The way I used it before was to generate bindings for my interfaces/types but I still depended on the Windows crate for everything else. That doesn't seem to be the intended path anymore, as even with the "filter" option specified with my namespace, structs were still generated from the Foundation namespace and collided with the Windows crate (e.g. Point and Size).

Second, some methods weren't generated in my "*_Impl" traits. I added "Windows.Foundation.Collections.IVectorView" to my filter and that works now, but now I can't seem to leverage the helpers from the Windows crate to create collections. I haven't figured out the best way around this.

The repo and branch is here if you're curious.

@kennykerr
Collaborator Author

kennykerr commented Dec 3, 2024

Hey Rob, briefly the --filter option is meant to include/exclude bindings from being generated. If you want to reference types in another crate then you can use the --reference option. So for example you might use something like this to reference types in the Windows.Foundation namespace from within the windows crate without necessarily generating those bindings directly:

--reference windows,skip-root,Windows.Foundation

There are some tests for this here which might help explain until I get more complete docs:

https://github.com/microsoft/windows-rs/blob/bindgen-vnext/crates/tools/bindgen/src/main.rs#L124-L135

@kennykerr
Collaborator Author

I'll work on some samples too - that would probably be more helpful. 😊

@robmikh
Member

robmikh commented Dec 3, 2024

Aha! Thanks, that seems to be exactly what I want. Although it seems I can't provide a namespace for the last param. Bindgen tells me it must be a type:

(input) --reference windows,skip-root,Windows.Foundation

type not found: `Windows.foundation`

I'll see if I can just list the types I need and if it would work.

Nevermind, it was a typo on my end :)

@kennykerr
Collaborator Author

kennykerr commented Dec 3, 2024

Sorry, that was a typo - it's "Foundation" not "foundation".

But then I just started working on a sample and found a little bug. 🫢 Stay tuned...

Note to self: when a type dependency is required (e.g. a required interface vs an optional method signature dependency) then the type is generated as part of the bindings even when the type can be resolved via a reference.

@kennykerr
Collaborator Author

kennykerr commented Dec 4, 2024

OK, I've addressed the --reference issue I spotted and added a test/sample that illustrates how this can work. 8115db5

Check out the "reference" and "reference_client" test crates.

I'll do some more testing and probably write a few more samples but let me know if you have any more issues in the meantime.
