-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#[no_mangle] is unsafe in the presence of name collisions #28179
Comments
I think this even leads to a linker error on some platforms (depending on the situation), so in that sense I'd also be fine just having a set of all |
You can
This makes the playpen timeout on stable 1.2: http://is.gd/sxdUl3 |
Yeah, I'd argue that having multiply-defined |
I doubt there’s anything sensible (other than writing our own linker) that can be done here to truly fix the issue. I’m not against the idea of making less of these cases possible, though. e.g. #[no_mangle]
fn main(){} doesn’t work already. But most of the proposed solutions (exception being keeping more symbols in memory) sound like a non-option to me. |
|
@arielb1 i didn’t say anything about safety; only about our capability to do anything about this issue. Either way, if you have to be wary of code you control unknowingly hijacking other code or doing evil things, something went very wrong where Rust (or any language, really) can’t help. |
JS is designed for that use-case. Rust is theoretically supposed to fill that role too, but both rustc and the type-system are way not sufficiently trustworthy for that. |
I agree that unmangled symbol collisions should be a compilation error. Breakage here is extraordinarily unlikely, and the only code that it could break is code that won't be working as intended anyway, which is exactly the sort of the code that we ought to break. |
Reassigning to lang team. What a pain. |
triage: P-low Seems like a real problem, if not that concerning. @rust-lang/lang feels that the fix proposed by @alexcrichton in this comment is the correct one. Check at link time for duplicate |
Why is calling Also |
They can't do anything that normal functions can't do. And there isn't in theory any way to use them to violate safety (this issue is a bug; it would be totally fine in theory for
As far as I can tell, this is basically because naked functions have no guaranteed That is, the reason |
It's not the function itself that is unsafe but the call. The call to something unmangled is FFI-ish, therefore looks (and is) unsafe.
|
There are some other examples of abusing For example, this hack overrides #![no_main]
#[link_section=".text"]
#[no_mangle]
pub static main: [u32; 9] = [
3237986353,
3355442993,
120950088,
822083584,
252621522,
1699267333,
745499756,
1919899424,
169960556,
]; Sigh. |
|
In this comment, @dtolnay brings up the idea of "unsafety at the item level". I feel like that is also relevant for this issue here, which basically concerns an unsafe function-level attribute. |
|
This issue will be resolved by RFC 3325 if it is accepted and stabilized. |
I think this is just a compiler bug. Am I missing something? On ELF for example, if we linked Speculating a bit more, there's a chance that weak binding was a convenient way to allow multiple codegen units to instantiate the same function, but I think we could prevent that pretty easily for I would be surprised if the equivalent linker behavior did not exist in the executable formats of every major platform we support. |
A linker will trawl through static libraries looking for missing symbols. Once it finds a relevant symbol it essentially extracts that object file from the static library then stops looking. Similarly, if a symbol is already resolved it won't go looking in static libraries for others of the same name. In this scenario we can always use |
I believe most linkers have options that can force them to link an object file. But yes, there are still opportunities for interactions when there's a global symbol namespace. The important thing for Rust is that the behavior is predictable and consistent. |
Sure but we can't make it a safe attribute? Even if we can guard against specific issues. |
Overriding Here's a trivial example of that: fn main() {
println!("Hello!");
}
#[no_mangle]
pub fn malloc() -> usize { 1 } i.e. safe code that crashes with a segfault. |
I agree that the attribute still needs to be unsafe because of its potential interactions with native code, the ability to override the same symbol in shared libraries, and other potential weird linking behavior the compiler can't check. That said, I think we could remove a lot of the fangs with thoughtful consideration of better linking defaults. That would need to come with ways of overriding them via |
Super exciting to see this issue from 2015 / Rust 1.2 resolved - thanks to everyone who made it happen! I agree that the unsafety check is the best that can be done here. Regarding the comments about static linking - it appears to me that there's a distinction between linking an object file and linking an archive file containing that exact same object file. The linker complains about collisions in $ cat main.c
#include <stdio.h>
extern int collision(void);
int main(void) {
printf("%d\n", collision());
}
$ cat collide1.c
int collision(void) {
return 1;
}
$ cat collide2.c
int collision(void) {
return 2;
}
$ cc -c collide1.c
$ cc -c collide2.c
$ cc -o main main.c collide1.o collide2.o
/usr/bin/ld: collide2.o: in function `collision':
collide2.c:(.text+0x0): multiple definition of `collision'; collide1.o:collide1.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
$ ar rcs libcollide1.a collide1.o
$ ar rcs libcollide2.a collide2.o
$ cc -o main main.c -lcollide1 -lcollide2 -L.
$ ./main
1
$ cc -o main main.c -lcollide2 -lcollide1 -L.
$ ./main
2
$ readelf -s collide1.o | grep collision
9: 0000000000000000 15 FUNC GLOBAL DEFAULT 1 collision
$ readelf -s libcollide1.a | grep collision
9: 0000000000000000 15 FUNC GLOBAL DEFAULT 1 collision (GCC 9.4 and ld 2.34 as shipped in Ubuntu 20.04) But you do get the error if you use $ cc -o main main.c -Wl,--whole-archive -lcollide2 -lcollide1 -Wl,--no-whole-archive -L.
/usr/bin/ld: ./libcollide1.a(collide1.o): in function `collision':
collide1.c:(.text+0x0): multiple definition of `collision'; ./libcollide2.a(collide2.o):collide2.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status GNU ld's documentation of that option says that it makes the linker "include every object file in the archive in the link, rather than searching the archive for the required object files." I'm interpreting that to mean that with the two static archives, it decides that it's found This seems like relatively intentional behavior on the part of the linker, and I'm not sure it would work / be a good idea for rustc to use |
On some platforms (at least GNU/Linux, but I hear Windows and several others too), if you link together two static libraries that both export a symbol of the same name, it's undefined which symbol actually gets linked. In practice on my machine, the first library seems to win. This lets you defeat type/memory safety without the
unsafe
keyword, by having two crates export a#[no_mangle] pub fn
with different signatures but compatible calling conventions:Despite the stated call to
two::convert
, it's actuallyone::convert
that gets called, which interprets the argument as a&'static i32
. (It may be clearer to understand with this simpler example, which doesn't break type safety.)On at least GNU/Linux but not other platforms like Windows or Darwin, dynamically-linked symbols have the same ambiguity.
I don't know what the right response is here. The following options all seem pretty reasonable:
unsafe
keyword.#[no_mangle]
export both a mangled and un-mangled name, and have Rust crates call each other via mangled names only, on the grounds that#[no_mangle]
is for external interfaces, not for crates linking to crates. ("External interfaces" includes other Rust code using FFI, but FFI is unsafe.) This is analogous to howextern "C" fn
s export both a Rust-ABI symbol as well as a C-ABI shim, and a direct, safe Rust call to those function happens through the Rust ABI, not through the C ABI. I'm pretty sure that all production uses of#[no_mangle]
areextern "C"
, anyway (see e.g. Can't define unsafe extern "C" fn #10025).#[no_mangle]
on safe functions and data, and introduce a new#[unsafe_no_mangle]
, so it substring-matchesunsafe
. (#[no_mangle]
on unsafe functions or mutable statics is fine, since you need theunsafe
keyword to get at them.)All of these are, I think, backwards-compatible.
The text was updated successfully, but these errors were encountered: