Passing std::string by value when calling Rust -> C++ #250

Closed
adetaylor opened this issue Aug 9, 2020 · 5 comments

@adetaylor
Collaborator

cxx cannot currently cope with:

#[cxx::bridge]
mod ffi {
    extern "C" {
       fn HandleString(foo: CxxString) -> bool;
    }
}

It says:

error[cxxbridge]: passing C++ string by value is not supported

What do you think about adding support for this? We need this if we're to call many of our existing C++ APIs without having to write wrappers.

To me the best option would seem to be to implicitly turn a CxxString into a &CxxString when calling from Rust to C++, then construct a new std::string from it within the generated .cc code.

That seems straightforward (unless I'm missing some complexity, which I might be!). It does introduce a slightly greater level of marshalling/unmarshalling than we currently do. Perhaps you consider this a role for the higher-level code generator discussed in #228 and #239, but I'd probably claim that this is sufficiently useful it would be handy for all cxx consumers?

Further enhancements:

  • Do the same in reverse, i.e. allow passing std::string by value into Rust APIs.
  • (Hard but in our case necessary) Support passing structs by value, where the struct incorporates std::strings. This will require marshalling (say)
struct Origin {
   struct HostPortSchemeTuple {
      uint16_t port;
      std::string host;
      std::string scheme;
   } host_port_scheme_tuple;
};

into

struct Origin_$forTransfer {
   struct HostPortSchemeTuple_$forTransfer {
      uint16_t port;
      const std::string& host;
      const std::string& scheme;
   } host_port_scheme_tuple;
};

then unmarshalling at the other end by reconstructing real std::strings. That sounds absolutely horrid, but it's definitely something we're going to need to do if we wish to allow a significant fraction of our existing C++ APIs to be called fluidly. I must admit it's quite hard for me to make the case that this should be in core cxx rather than a higher-level wrapper.

@dtolnay
Owner

dtolnay commented Aug 9, 2020

> To me the best option would seem to be to implicitly turn a CxxString into a &CxxString when calling from Rust to C++, then construct a new std::string from it within the generated .cc code.
>
> That seems straightforward (unless I'm missing some complexity, which I might be!).

The non-straightforward part is whether a std::string by value can even legally exist in Rust. If it can, it's straightforward to support, which would make string arguments and structs containing strings both work.

The problem is with hypothetical std::string implementations that are laid out as follows, so that dereferencing the data pointer never needs a branch:

long strings             short strings
   +---+                     +---+
   |ptr to heap              |ptr to data
   +---+                     +---+
   |length                   |data
   +---+                     |   |
   |capacity                 |   |
   +---+                     +---+

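A std::string by value in Rust could only be allowed in environments that either do not do SSO, or do SSO some other way. Rust moves a value with a plain memcpy and never runs a move constructor, so in the short-string layout above the data pointer would still point into the old location after a move.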
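
To make the hazard concrete, here is a minimal pure-Rust sketch. `SelfRefString` is a hypothetical stand-in invented for illustration, not the layout of any real std::string; it only mimics the "pointer to own inline data" shape from the diagram and checks whether that pointer survives a Rust move.

```rust
use std::ptr;

/// Hypothetical stand-in for a string whose short form stores a pointer
/// to its own inline buffer. NOT the layout of any real std::string.
struct SelfRefString {
    data: *const u8, // points at `inline` when the string is short
    inline: [u8; 16],
}

impl SelfRefString {
    /// Copy the bytes in and fix up the internal pointer, as a C++
    /// constructor would once the object sits at its final address.
    fn init_in_place(&mut self, bytes: &[u8]) {
        let n = bytes.len().min(self.inline.len());
        self.inline[..n].copy_from_slice(&bytes[..n]);
        self.data = self.inline.as_ptr();
    }

    /// True while `data` still points at this object's own buffer.
    fn pointer_is_valid(&self) -> bool {
        ptr::eq(self.data, self.inline.as_ptr())
    }
}

/// Returns (valid before the move, valid after the move).
fn move_breaks_internal_pointer() -> (bool, bool) {
    let mut boxed = Box::new(SelfRefString {
        data: ptr::null(),
        inline: [0; 16],
    });
    boxed.init_in_place(b"short");
    let before = boxed.pointer_is_valid();

    // Moving out of the box is a bitwise copy from the heap allocation to
    // a new location; no user code runs to repair `data`.
    let moved = *boxed;
    let after = moved.pointer_is_valid();
    (before, after)
}
```

The sketch only compares addresses and never dereferences the stale pointer. A C++ move constructor would repair `data` at this point; Rust has no hook to do so.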

Do you know if:

  1. A std::string implementation that has an internal pointer is allowed by the standard?
  2. This is ever a thing that real standard library implementations do?
  3. Your specific standard library does it?

@adetaylor
Collaborator Author

Yes. So far as I know that is indeed something that our standard library does (and even if it didn't, I wouldn't want to guarantee that it wouldn't in future).

I suppose, however, I was thinking of a CxxString as being little more than a handle to a C++-side object which is allocated, freed and manipulated solely from C++ code (and every operation on the Rust type actually simply calls through to some C++ code). So, the Rust-side CxxString would secretly just be a pointer to a C++-side std::string. I can't see any other safe way to do it without becoming dependent on implementation details of the C++ standard library.

(Thinking out loud as you can tell!)

@dtolnay
Copy link
Owner

dtolnay commented Aug 9, 2020

All the binding types are exactly what exists in the other language; they are not handles.

We used to use handles in my work codebase and it was a bad experience. It falls apart when you want to call a method that takes &'a HANDLE in one language but all you have is &'a ACTUAL in the other language; you end up not being able to come up with a handle with the appropriate lifetime, and need to start inventing mutually incompatible Foo and FooRef and FooMut versions of all types.
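
A rough pure-Rust sketch of that failure mode, under stated assumptions: `Actual`, `Handle`, and `HandleRef` are hypothetical names made up for illustration and are not cxx types. It shows why a caller holding only `&'a Actual` ends up needing a second handle type and a second copy of the API.

```rust
/// The "actual" value, standing in for an object owned by the other language.
struct Actual {
    text: String,
}

/// Hypothetical owning handle: the only form the binding exposes.
struct Handle {
    inner: Box<Actual>,
}

/// An API written against the owning handle.
fn text_via_handle<'a>(h: &'a Handle) -> &'a str {
    &h.inner.text
}

/// A caller holding only `&'a Actual` cannot call the function above:
/// building a `Handle` needs ownership of the `Actual`, and a temporary
/// `Handle` would not live for 'a. The workaround is a borrowed handle...
struct HandleRef<'a> {
    inner: &'a Actual,
}

/// ...plus a duplicate of every API for that second, incompatible type.
fn text_via_handle_ref<'a>(h: HandleRef<'a>) -> &'a str {
    &h.inner.text
}

/// Exercises both paths and returns the two string lengths.
fn handle_demo() -> (usize, usize) {
    let owned = Handle {
        inner: Box::new(Actual { text: "hello".to_owned() }),
    };
    let actual = Actual { text: "hi".to_owned() };
    let borrowed: &Actual = &actual;
    (
        text_via_handle(&owned).len(),
        text_via_handle_ref(HandleRef { inner: borrowed }).len(),
    )
}
```

A mutable variant would need a third type again (the `FooMut` case), which is where the mutually incompatible versions start to multiply.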

In cxx the equivalent of a handle to a std::string is UniquePtr<CxxString>, which is allowed to be passed by value.

@adetaylor
Collaborator Author

OK. Yes, I know they're not handles right now, but if we assume it's impossible to represent std::string in Rust then I figured that a handle might be better than nothing.

But yes. Your point on lifetimes and experience using them is appreciated, and I was worried it might head in that direction too. I'll do some thinking. It's very desirable for us to be able to call existing C++ APIs from Rust even if they take a std::string by value.

(One solution I already discounted is to use Pin, since although it's designed for self-referential structs, the whole point is that I want to be able to pass std::strings by value.)

> In cxx the equivalent of a handle to a std::string is UniquePtr<CxxString>, which is allowed to be passed by value.

Maybe the hypothetical higher-level code generator always generates a C++ wrapper function which takes a UniquePtr<CxxString> from Rust if the original C++ function took a std::string. This is roughly what I was originally driving at, but more explicit.

@dtolnay
Owner

dtolnay commented Aug 9, 2020

> Maybe the hypothetical higher-level code generator always generates a C++ wrapper function which takes a UniquePtr<CxxString> from Rust if the original C++ function took a std::string.

That could work! A step further in that direction to make it seamless would be:

extern "C" {
    fn TakeString(s: CxxString);
}

// becomes callable as:
fn TakeString(s: impl IntoCxxString);

where we have impls to make it work for &str, String, &CxxString, UniquePtr<CxxString>, etc. This would sort of imitate the implicit conversions or implicit copy construction you would get in C++ callers while still being able to move in UniquePtr<CxxString> if the caller has one available.

TakeString("...");
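
As a sketch of how such a conversion trait could look, here is a self-contained pure-Rust model. Everything in it is an assumption for illustration: `OwnedCxxString` is a made-up stand-in for an owned C++ string (the role UniquePtr<CxxString> plays), and `IntoCxxString` is only modelled after the name used above, not an existing cxx API.

```rust
/// Hypothetical stand-in for an owned C++ string; not a cxx type.
#[derive(Clone, Debug, PartialEq)]
struct OwnedCxxString(Vec<u8>);

/// Hypothetical conversion trait, named after the idea in this thread.
trait IntoCxxString {
    fn into_cxx_string(self) -> OwnedCxxString;
}

impl IntoCxxString for &str {
    // Borrowed Rust text has to be copied into a new owned string.
    fn into_cxx_string(self) -> OwnedCxxString {
        OwnedCxxString(self.as_bytes().to_vec())
    }
}

impl IntoCxxString for String {
    // This model reuses the buffer; a real binding would presumably still
    // copy into memory owned by the C++ side.
    fn into_cxx_string(self) -> OwnedCxxString {
        OwnedCxxString(self.into_bytes())
    }
}

impl IntoCxxString for &OwnedCxxString {
    // Mirrors the implicit copy construction a C++ caller would get.
    fn into_cxx_string(self) -> OwnedCxxString {
        self.clone()
    }
}

impl IntoCxxString for OwnedCxxString {
    // Already owned: moved in without copying.
    fn into_cxx_string(self) -> OwnedCxxString {
        self
    }
}

/// Stand-in for the generated wrapper; returns the byte length it received.
fn take_string(s: impl IntoCxxString) -> usize {
    let owned = s.into_cxx_string();
    owned.0.len()
}
```

With this in place, `take_string("...")`, `take_string(String::from("..."))`, `take_string(&owned)` and `take_string(owned)` all compile, and only the last one avoids a copy.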
