Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation Suggestion (or bug?): operator locations matter #220

Closed
zwimer opened this issue Sep 8, 2022 · 6 comments
Closed

Documentation Suggestion (or bug?): operator locations matter #220

zwimer opened this issue Sep 8, 2022 · 6 comments

Comments

@zwimer
Copy link
Contributor

zwimer commented Sep 8, 2022

The confusion:
I've noticed that these two structs of code lead to different bindings, specifically the second struct generates no __eq__ method.

namespace Test {
	struct A {
		int a;
		int foo() { return 1; }
		inline bool operator==(const A& o) { return this->a == o.a; }
	};

	struct B {
		int a;
		int foo() { return 1; }
	};
	inline bool operator==(const B& a, const B& b) { return a.a == b.a; }
}

With my conf file saying only +namespace Test; which should include both equals operators above. I ran binder with the settings --config=/in/a.conf -v --root-module foo --prefix /out /in/include.hpp where include.hpp had only #include "a.hpp" in it. The relevant snippets of output are:

#include <sstream> // __str__

#include <pybind11/pybind11.h>
#include <functional>
#include <string>

#ifndef BINDER_PYBIND11_TYPE_CASTER
	#define BINDER_PYBIND11_TYPE_CASTER
	PYBIND11_DECLARE_HOLDER_TYPE(T, std::shared_ptr<T>)
	PYBIND11_DECLARE_HOLDER_TYPE(T, T*)
	PYBIND11_MAKE_OPAQUE(std::shared_ptr<void>)
#endif

void bind_unknown_unknown(std::function< pybind11::module &(std::string const &namespace_) > &M)
{
	{ // Test::A file: line:2
		pybind11::class_<Test::A, std::shared_ptr<Test::A>> cl(M("Test"), "A", "");
		cl.def( pybind11::init( [](){ return new Test::A(); } ) );
		cl.def_readwrite("a", &Test::A::a);
		cl.def("foo", (int (Test::A::*)()) &Test::A::foo, "C++: Test::A::foo() --> int");
		cl.def("__eq__", (bool (Test::A::*)(const struct Test::A &)) &Test::A::operator==, "C++: Test::A::operator==(const struct Test::A &) --> bool", pybind11::arg("o"));
	}
	{ // Test::B file: line:8
		pybind11::class_<Test::B, std::shared_ptr<Test::B>> cl(M("Test"), "B", "");
		cl.def( pybind11::init( [](){ return new Test::B(); } ) );
		cl.def_readwrite("a", &Test::B::a);
		cl.def("foo", (int (Test::B::*)()) &Test::B::foo, "C++: Test::B::foo() --> int");
	}
}

Why this is extra bad:
This would be bad in and of itself, but libraries generated using binder have operator== defined for everything by default as reference equality. That is, if the user does not scrutinize the output code but rather just verifies (in python) that == exists by checking if two As equal each other, no exception will be raised but rather the program will call a different == operator than what the user was likely expecting.

My guess as to why this happens:
The reasoning for this is sound, as C++ allows, for example, multiple different operator==(const A&a, const A&b) definitions at different namespace scopes, where as python does not. However, this behavior was not intuitive not obvious (especially as good C++ code typically moves operators outside of the class). Actually, this reasoning is just the justification I assume was used, since I couldn't find any information about this in the documentation.

The solution:
In the readthedocs, a warning should be added that operators must be defined within a class (or whatever the correct behavior is, I'm just guessing it must be within the class).

@zwimer zwimer changed the title Documentation Suggestion: operator locations matter Documentation Suggestion (or bug?): operator locations matter Sep 8, 2022
@zwimer
Copy link
Contributor Author

zwimer commented Sep 8, 2022

This is related but a bit different: It should also note that if your code defines operator==, then the python code will not only prefer that to the default == but it will also define __ne__ to be the inverse of the C++ operator==. I'm not sure if it is binder that does this or clang or what, but it is something worth noting I think.

@zwimer
Copy link
Contributor Author

zwimer commented Sep 8, 2022

On another inspection, the default __eq__ might not exactly be reference equality; it might be hash equality, where the default hash might derive from the reference but also maybe not? In any case, the point above remains the same since a default __eq__ might be defined.

@lyskov
Copy link
Member

lyskov commented Sep 19, 2022

Thank you for bringing this up @zwimer ! The issue here is that on Python level we do not have global comparison operators similar to global operators in C++ (ie global operator== in your example). So Binder is only generating code for member operator== defined in classes. Lack of mentioning this in docs is indeed an oversight so PR fixing it would be welcome!

@zwimer
Copy link
Contributor Author

zwimer commented Sep 19, 2022

@lyskov Do you happen to have a comprehensive list of operators this is true for?

For example, the << operator may be defined externally. I'm not sure how binder picks which << operator to use since there can still be multiple defined in different namespaces but minimally I've noticed that if it is defined in the same namespace as the class binder might grab it. Actually, upon closer inspection, it seems that << is not bound but rather detecting a << operator leads to generating a str function, so I suppose the selection of which << operator is still left to the compiler.

Another key difference here is likely that the << operator cannot be defined within the class though.

Is it perhaps that all operators whose first argument is *this that can be defined as member functions must be defined as member functions for binder to pick them up?

I could also add a note that users who wish to follow best practice of defining floating operator functions rather than internal operators can just have the former invoke the latter, which would still maintain all the benefits of free floating functions like implicit conversions.

@lyskov
Copy link
Member

lyskov commented Sep 19, 2022

Do you happen to have a comprehensive list of operators this is true for?

-- Binder will look only for member-function-operators for basically all operator bindings that Binder currently supports. You can find full list here: https://github.com/RosettaCommons/binder/blob/master/source/function.cpp#L43

Is it perhaps that all operators whose first argument is *this that can be defined as member functions must be defined as member functions for binder to pick them up?

-- yes.

re binding of Python __str__ operator: This indeed handled a bit differently. As you mentioned we have to look at globally defined stream operator. Relevant code could be seen here:
https://github.com/RosettaCommons/binder/blob/master/source/class.cpp#L1080 - we basically looking at the very concrete version of insertion operator and if we found it then we generate __str__ function for given class.

I could also add a note that users who wish to follow best practice of defining floating operator functions rather than internal operators can just have the former invoke the latter, which would still maintain all the benefits of free floating functions like implicit conversions.

-- This sounds like reasonable approach to me!

Thanks,

@zwimer
Copy link
Contributor Author

zwimer commented Oct 12, 2022

Documented via #229

@zwimer zwimer closed this as completed Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants