Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide builtin handling of exposing a C++ constructor #280

Open
dtolnay opened this issue Sep 4, 2020 · 5 comments
Open

Provide builtin handling of exposing a C++ constructor #280

dtolnay opened this issue Sep 4, 2020 · 5 comments

Comments

@dtolnay
Copy link
Owner

dtolnay commented Sep 4, 2020

To the extent that constructors "return" a C++ type by value, they're not translatable to Rust because Rust moves (memcpy) are incompatible with C++ moves (which can require a move constructor to be called). Translating an arbitrary constructor to fn new() -> Self would not be correct.

The current workarounds are:

  • You can bind them unsafely with bindgen which assumes moving without a constructor call is okay;
  • You can use the "shared struct" approach when possible, which is safely movable in either language; or
  • You can include! a shim which does the construction behind a unique_ptr or similar.

That last approach would look something like:

// suppose we have a struct with constructor `ZeusClient(std::string)`

// in a C++ header:
std::unique_ptr<ZeusClient> zeus_client_new(rust::Str arg);

// in the corresponding C++ source file:
std::unique_ptr<ZeusClient> zeus_client_new(rust::Str arg) {
  return make_unique<ZeusClient>(std::string(arg));
}
// in the Rust cxx bridge:
extern "C++" {
    include!("path/to/zeus/client.h");
    include!("path/to/constructorshim.h");

    type ZeusClient;

    fn zeus_client_new(arg: &str) -> UniquePtr<ZeusClient>;
}

However, it would be very possible for us to handle at least the special case of types without a move constructor. In the following snippet, CXX would need to: recognize the new function name since that isn't otherwise a legal C++ function name, identify the Self type based on the extern type in return position of the function signature, statically verify there is no move constructor, and emit the constructor as an associated function ZeusClient::new to Rust.

extern "C++" {
    type ZeusClient = path::to::ZeusClient;  // the local def with a cxx::ExternType impl, via bindgen or handwritten

    fn new(arg: &str) -> ZeusClient;  // will verify there is no move constructor
}

Additionally we can consider supporting implicit UniquePtr construction of types which may or may not have a move constructor, with or without an ExternType impl.

extern "C++" {
    type ZeusClient;

    fn new(arg: &str) -> UniquePtr<ZeusClient>;  // okay even with move constructor
}
@adetaylor
Copy link
Collaborator

I would totally love it cxx did this directly. I know you've previously hinted that this should be in the higher-level code generator, but I always hoped you might change your mind and see a home for constructors. It's possible I'll get around to doing something here, though it's unlikely to be for a week or two, and I suspect you'll get there first.

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 4, 2020

The prospect that I am interested in exploring with this is whether we can let writing a cxx::ExternType impl give you complete free reign with a type, as if it were defined natively in Rust all along. It wasn't satisfying to say we just can't let you call the constructor because "your type might have a move constructor" and "we don't parse your header so we don't know the size of your type" when the situation is "but it doesn't" and "but I can promise (unsafely if necessary) that it's N bytes".

On the UniquePtr one, I am not fully sold at this point but I can see that it's extremely convenient, so I've mostly come around and would accept a PR on it -- but I'd like it to be after I finally get around to reviewing the #[rust_name = "..."] PR (#218) so that one can rename the Rust constructor to MyType::make_unique or whatever as appropriate. That has been next on my priorities after #276 and next to #179.

@adetaylor
Copy link
Collaborator

This issue states that it's safe to make a constructor, which returns a type by value, for a type without a move constructor.

i.e.

fn new(arg: &str) -> ZeusClient;  // will verify there is no move constructor

I don't think that's exactly right - counterexample:

#include <string>

class Contained {
public:
    std::string a;
    uint16_t b;
};

class ZeusClient {
public:
    Contained a;
};

Anything containing a std::string is not safe to represent by value within Rust, because std::string contains a potentially self-referential pointer.

But, I assume your plans for static assertions here involve something like std::is_trivially_move_assignable (TIL!) which seems to do the trick:

#include <iostream>
#include <string>
#include <type_traits>

class SafeStruct {
public:
    uint16_t a, b;
};

class SafeContainer {
public:
    SafeStruct a;
};

class Contained {
public:
    std::string a;
    uint16_t b;
};

class ZeusClient {
public:
    Contained a;
};

int main(void) {
    std::cout << "uint16_t " << std::is_trivially_move_assignable<uint16_t>::value << std::endl;
    std::cout << "string " << std::is_trivially_move_assignable<std::string>::value << std::endl;
    std::cout << "SafeStruct " << std::is_trivially_move_assignable<SafeStruct>::value << std::endl;
    std::cout << "SafeContainer " << std::is_trivially_move_assignable<SafeContainer>::value << std::endl;
    std::cout << "Contained " << std::is_trivially_move_assignable<Contained>::value << std::endl;
    std::cout << "ZeusClient " << std::is_trivially_move_assignable<ZeusClient>::value << std::endl;
    return 0;
}

which prints:

uint16_t 1
string 0
SafeStruct 1
SafeContainer 1
Contained 0
ZeusClient 0

I have not looked up the C++ spec to find out if "trivially move assignable" means what it sounds like it means.

I assume this is what you meant all along anyway, but as I dug into it a little, I thought I ought to post my notes here.

@dtolnay
Copy link
Owner Author

dtolnay commented Sep 22, 2020

Thanks for clarifying -- yeah that's what I had in mind, "no move constructor" ≈≈ "no user-defined move constructor or nontrivial implicitly defined move constructor".

Rather than is_trivially_move_assignable I think what we want is is_trivially_move_constructible && is_trivially_destructible because every Rust move will conceptually trivially move construct the value in the new place and trivially destroy the old place, from C++'s perspective.

Here is a counterexample which is trivially_move_assignable but would be unsound to move in Rust:

struct Struct {
  Struct();
  ~Struct();
};

static Struct Default;
static Struct *Ptr = &Default; // invariant: always points to valid struct!!

Struct::Struct() { Ptr = this; }
Struct::~Struct() { Ptr = &Default; }

@ahayzen-kdab
Copy link
Contributor

People may have already found this but note that C++ templates can be used so that methods don't need to be created for every constructor of a C++ type.

Eg

template<typename T, typename... Args>
T
construct(Args... args)
{
  return T(args...);
}

Could be used then with fn construct() -> T; or fn construct(x: i32, y: i32) -> T; (then use the #[rust_name = "construct_t"] to rename the methods for Rust so they don't collide).

Obviously you also need to ensure that the T has the right trivial vs opaque correctness with asserts, you could also have another template that returns eg a UniquePtr<T>.

But this greatly reduces the C++ code if you need to define many constructors or similar methods for types and can sortof be used as a workaround until CXX does have proper a way of describing constructors/deconstructors without the C++ glue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants