Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic runtime and C-unwind documentation #1226

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
2 changes: 2 additions & 0 deletions src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,8 @@
- [Memory allocation and lifetime](memory-allocation-and-lifetime.md)
- [Variables](variables.md)

- [Panic](panic.md)

- [Linkage](linkage.md)

- [Inline assembly](inline-assembly.md)
Expand Down
16 changes: 15 additions & 1 deletion src/behavior-considered-undefined.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,9 @@ r[undefined.target-feature]
does not support (see [`target_feature`]), *except* if the platform explicitly documents this to be safe.

r[undefined.call]
* Calling a function with the wrong call ABI or unwinding from a function with the wrong unwind ABI.
* Calling a function with the wrong [call ABI][abi], or unwinding past a stack
frame that does not allow unwinding (e.g. by calling a `"C-unwind"` function
imported or transmuted as a `"C"` function or function pointer).

r[undefined.invalid]
* Producing an [invalid value][invalid-values]. "Producing" a
Expand All @@ -97,6 +99,16 @@ r[undefined.const-transmute-ptr2int]
'Reinterpreting' refers to loading the pointer value at integer type without a
cast, e.g. by doing raw pointer casts or using a union.

r[undefined.runtime]
* Violating assumptions of the Rust runtime. This is only possible using
mechanisms outside Rust. Most assumptions of the Rust runtime are currently
not explicitly documented.
* For assumptions specifically related to unwinding, see the [panic
documentation][unwinding-ffi].
* The runtime assumes that a Rust stack frame is not deallocated without
executing destructors for local variables owned by the stack frame. This assumption
can be violated by C functions like `longjmp`.

> **Note**: Undefined behavior affects the entire program. For example, calling
> a function in C that exhibits undefined behavior of C means your entire
> program contains undefined behaviour that can also affect the Rust code. And
Expand Down Expand Up @@ -246,6 +258,7 @@ reading uninitialized memory is permitted are inside `union`s and in "padding"
[`const`]: items/constant-items.md
[noalias]: http://llvm.org/docs/LangRef.html#noalias
[pointer aliasing rules]: http://llvm.org/docs/LangRef.html#pointer-aliasing-rules
[abi]: abi.md
[undef]: http://llvm.org/docs/LangRef.html#undefined-values
[`target_feature`]: attributes/codegen.md#the-target_feature-attribute
[`UnsafeCell<U>`]: std::cell::UnsafeCell
Expand All @@ -259,5 +272,6 @@ reading uninitialized memory is permitted are inside `union`s and in "padding"
[project-field]: expressions/field-expr.md
[project-tuple]: expressions/tuple-expr.md#tuple-indexing-expressions
[project-slice]: expressions/array-expr.md#array-and-slice-indexing-expressions
[unwinding-ffi]: panic.md#unwinding-across-ffi-boundaries
[const-promoted]: destructors.md#constant-promotion
[lifetime-extended]: destructors.md#temporary-lifetime-extension
14 changes: 14 additions & 0 deletions src/crates-and-source-files.md
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,19 @@ use foo::bar as main;
<!-- If the previous section needs updating (from "must take no arguments"
onwards, also update it in the testing.md file -->

### Uncaught foreign unwinding

r[crate.uncaught-foreign-unwinding]

When a "foreign" unwind (e.g. an exception thrown from C++ code, or a `panic!`
in Rust code compiled or linked with a different runtime) is not caught before
reaching the `main` function, the process will be safely terminated. This may
take the form of an abort, in which case it is not guaranteed that any `Drop`
calls will be executed, and the error output may be less informative than if the
runtime had been terminated by a "native" Rust `panic`.

For more information, see the [panic documentation][panic-docs].

### The `no_main` attribute

r[crate.no_main]
Expand Down Expand Up @@ -169,6 +182,7 @@ or `_` (U+005F) characters.
[function]: items/functions.md
[module]: items/modules.md
[module path]: paths.md
[panic-docs]: panic.md#unwinding-across-ffi-boundaries
[shebang]: input-format.md#shebang-removal
[trait or lifetime bounds]: trait-bounds.md
[where clauses]: items/generics.md#where-clauses
Expand Down
25 changes: 24 additions & 1 deletion src/destructors.md
Original file line number Diff line number Diff line change
Expand Up @@ -268,7 +268,8 @@ Temporaries are also created to hold the result of operands to an expression
while the other operands are evaluated. The temporaries are associated to the
scope of the expression with that operand. Since the temporaries are moved from
once the expression is evaluated, dropping them has no effect unless one of the
operands to an expression breaks out of the expression, returns, or panics.
operands to an expression breaks out of the expression, returns, or
[panics][panic].

```rust
# struct PrintOnDrop(&'static str);
Expand Down Expand Up @@ -419,6 +420,8 @@ let x = (&temp()).use_temp(); // ERROR

## Not running destructors

### `forget`

r[destructors.forget]

[`std::mem::forget`] can be used to prevent the destructor of a variable from being run,
Expand All @@ -428,6 +431,22 @@ variable or field from being dropped automatically.
> Note: Preventing a destructor from being run via [`std::mem::forget`] or other means is safe even if it has a type that isn't `'static`.
> Besides the places where destructors are guaranteed to run as defined by this document, types may *not* safely rely on a destructor being run for soundness.

### Process termination without unwinding

r[destructors.process-termination]

There are some ways to terminate the process without [unwinding], in which case
destructors will not be run.

The standard library provides [`std::process::exit`] and
[`std::process::abort`] to do this explicitly. Additionally, if the
[panic-mode] is set to `abort`, panicking will always terminate the process
without destructors being run.

There is one additional case to be aware of: when a panic reaches a
[non-unwinding ABI boundary], either no destructors will run, or all
destructors up until the ABI boundary will run.

[Assignment]: expressions/operator-expr.md#assignment-expressions
[binding modes]: patterns.md#binding-modes
[closure]: types/closure.md
Expand All @@ -437,11 +456,15 @@ variable or field from being dropped automatically.
[initialized]: glossary.md#initialized
[interior mutability]: interior-mutability.md
[lazy boolean expression]: expressions/operator-expr.md#lazy-boolean-operators
[non-unwinding ABI boundary]: items/functions.md#unwinding
[panic]: panic.md
[panic-mode]: panic.md#panic-runtimes
[place context]: expressions.md#place-expressions-and-value-expressions
[promoted]: destructors.md#constant-promotion
[scrutinee]: glossary.md#scrutinee
[statement]: statements.md
[temporary]: expressions.md#temporaries
[unwinding]: panic.md#unwinding
[variable]: variables.md

[array]: types/array.md
Expand Down
3 changes: 2 additions & 1 deletion src/expressions/array-expr.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,7 @@ Just as with methods, Rust will also insert dereference operations on `a` repeat

Indices are zero-based for arrays and slices.
Array access is a [constant expression], so bounds can be checked at compile-time with a constant index value.
Otherwise a check will be performed at run-time that will put the thread in a _panicked state_ if it fails.
Otherwise a check will be performed at run-time that will put the thread in a [_panicked state_][panic] if it fails.

```rust,should_panic
// lint is deny by default.
Expand Down Expand Up @@ -84,5 +84,6 @@ The array index expression can be implemented for types other than arrays and sl
[constant item]: ../items/constant-items.md
[literal]: ../tokens.md#literals
[memory location]: ../expressions.md#place-expressions-and-value-expressions
[panic]: ../panic.md
[path]: path-expr.md
[slice]: ../types/slice.md
34 changes: 26 additions & 8 deletions src/items/external-blocks.md
Original file line number Diff line number Diff line change
Expand Up @@ -102,7 +102,7 @@ unsafe extern "stdcall" { }
```

r[items.extern.abi.standard]
There are three ABI strings which are cross-platform, and which all compilers
There are five ABI strings which are cross-platform, and which all compilers
are guaranteed to support:

r[items.extern.abi.rust]
Expand All @@ -118,6 +118,11 @@ r[items.extern.abi.system]
which case it's `"stdcall"`, or what you should use to link to the Windows
API itself

r[items.extern.abi.unwind]
* `extern "C-unwind"` and `extern "system-unwind"` -- identical to `"C"` and
`"system"`, respectively, but with [different behavior][unwind-behavior] when
the callee unwinds (by panicking or throwing a C++ style exception).

r[items.extern.abi.platform]
There are also some platform-specific ABI strings:

Expand Down Expand Up @@ -151,6 +156,18 @@ r[items.extern.abi.thiscall]
r[items.extern.abi.efiapi]
* `unsafe extern "efiapi"` -- The ABI used for [UEFI] functions.

Like `"C"` and `"system"`, most platform-specific ABI strings also have a
[corresponding `-unwind` variant][unwind-behavior]; specifically, these are:

* `"cdecl-unwind"`
* `"stdcall-unwind"`
* `"fastcall-unwind"`
* `"vectorcall-unwind"`
* `"thiscall-unwind"`
* `"aapcs-unwind"`
* `"win64-unwind"`
* `"sysv64-unwind"`

## Variadic functions

r[items.extern.variadic]
Expand Down Expand Up @@ -439,10 +456,9 @@ Attributes on extern function parameters follow the same rules and
restrictions as [regular function parameters].

[IDENTIFIER]: ../identifiers.md
[PE Format]: https://learn.microsoft.com/windows/win32/debug/pe-format#import-name-type
[UEFI]: https://uefi.org/specifications
[WebAssembly module]: https://webassembly.github.io/spec/core/syntax/modules.html
[functions]: functions.md
[statics]: static-items.md
[_Abi_]: functions.md
[_Function_]: functions.md
[_InnerAttribute_]: ../attributes.md
Expand All @@ -452,11 +468,13 @@ restrictions as [regular function parameters].
[_OuterAttribute_]: ../attributes.md
[_StaticItem_]: static-items.md
[_Visibility_]: ../visibility-and-privacy.md
[attributes]: ../attributes.md
[regular function parameters]: functions.md#attributes-on-function-parameters
[`bundle` documentation for rustc]: ../../rustc/command-line-arguments.html#linking-modifiers-bundle
[`whole-archive` documentation for rustc]: ../../rustc/command-line-arguments.html#linking-modifiers-whole-archive
[`verbatim` documentation for rustc]: ../../rustc/command-line-arguments.html#linking-modifiers-verbatim
[`dylib` versus `raw-dylib`]: #dylib-versus-raw-dylib
[PE Format]: https://learn.microsoft.com/windows/win32/debug/pe-format#import-name-type
[`verbatim` documentation for rustc]: ../../rustc/command-line-arguments.html#linking-modifiers-verbatim
[`whole-archive` documentation for rustc]: ../../rustc/command-line-arguments.html#linking-modifiers-whole-archive
[attributes]: ../attributes.md
[functions]: functions.md
[regular function parameters]: functions.md#attributes-on-function-parameters
[statics]: static-items.md
[unwind-behavior]: functions.md#unwinding
[value namespace]: ../names/namespaces.md
56 changes: 51 additions & 5 deletions src/items/functions.md
Original file line number Diff line number Diff line change
Expand Up @@ -254,13 +254,59 @@ extern "C" fn new_i32() -> i32 { 0 }
let fptr: extern "C" fn() -> i32 = new_i32;
```

### Unwinding

r[items.fn.extern.unwind]
Functions with an ABI that differs from `"Rust"` do not support unwinding in the
exact same way that Rust does. Therefore, unwinding past the end of functions
with such ABIs causes the process to abort.

> **Note**: The LLVM backend of the `rustc` implementation
aborts the process by executing an illegal instruction.
r[items.fn.extern.unwind.into]
Most ABI strings come in two variants, one with an `-unwind` suffix and one without.
The `Rust` ABI always permits unwinding, so there is no `Rust-unwind` ABI. The
choice of ABI, together with the runtime [panic mode][panic-modes], determines
the behavior when unwinding out of a function.

The table below indicates the behavior of an unwinding operation reaching each
type of ABI boundary (function declaration or definition using the
corresponding ABI string). Note that the Rust runtime is not affected by, and
cannot have an effect on, any unwinding that occurs entirely within another
language's runtime, that is, unwinds that are thrown and caught without
reaching a Rust ABI boundary.

The `panic`-unwind column refers to [panicking] via the `panic!` macro and
similar standard library mechanisms, as well as to any other Rust operations
that cause a panic, such as out-of-bounds array indexing or integer overflow.

The "unwinding" ABI category refers to `"Rust"` (the implicit ABI of Rust
functions not marked `extern`), `"C-unwind"`, and any other ABI with `-unwind`
in its name. The "non-unwinding" ABI category refers to all other ABI strings,
including `"C"` and `"stdcall"`.

Native unwinding is defined per-target. On targets that support throwing and
catching C++ exceptions, it refers to the mechanism used to implement this
feature. Some platforms implement a form of unwinding referred to as ["forced
unwinding"][forced-unwinding]; `longjmp` on Windows and `pthread_exit` in
`glibc` are implemented this way. Forced unwinding is explicitly excluded
from the "Native unwind" column in the table.

| panic runtime | ABI | `panic`-unwind | Native unwind (unforced) |
| -------------- | ------------ | ------------------------------------- | ----------------------- |
| `panic=unwind` | unwinding | unwind | unwind |
| `panic=unwind` | non-unwinding | abort (see notes below) | [undefined behavior] |
| `panic=abort` | unwinding | `panic` aborts without unwinding | abort |
| `panic=abort` | non-unwinding | `panic` aborts without unwinding | [undefined behavior] |

> **Note**: The following guarantee applies from Rust 1.82 onward: with
> `panic=unwind`, when a `panic` is turned into an abort by a non-unwinding
> ABI boundary, either no destructors (`Drop` calls) will run, or all destructors
> up until the ABI boundary will run.
chorman0773 marked this conversation as resolved.
Show resolved Hide resolved

For other considerations and limitations regarding unwinding across FFI
boundaries, see the [relevant section in the Panic documentation][panic-ffi].

[forced-unwinding]: https://rust-lang.github.io/rfcs/2945-c-unwind-abi.html#forced-unwinding
[panic-modes]: ../panic.md#panic-runtimes
[panic-ffi]: ../panic.md#unwinding-across-ffi-boundaries
[panicking]: ../panic.md
[undefined behavior]: ../behavior-considered-undefined.md

BatmanAoD marked this conversation as resolved.
Show resolved Hide resolved
## Const functions

Expand Down
39 changes: 39 additions & 0 deletions src/linkage.md
Original file line number Diff line number Diff line change
Expand Up @@ -254,6 +254,9 @@ RUSTFLAGS='-C target-feature=+crt-static' cargo build --target x86_64-pc-windows

## Mixed Rust and foreign codebases

r[link.foreign-code]

r[link.foreign-code.foreign-linkers]
If you are mixing Rust with foreign code (e.g. C, C++) and wish to make a single
binary containing both types of code, you have two approaches for the final
binary link:
Expand All @@ -269,6 +272,42 @@ binary link:

Passing `rlib`s directly into your foreign linker is currently unsupported.

### Prohibited linkage and foreign unwinding

r[link.foreign-code.prohibited]

Undfined behavior may be caused by foreign code unwinding into a Rust crate
with these characteristics:

* The foreign unwind enters Rust via a function or function pointer declared
with an ABI that permits unwinding, that is, `"Rust"` (the default ABI) or
any `-unwind` ABI
* The Rust crate containing the `-unwind` ABI declaration was compiled with
`panic=unwind`
* The final binary is linked with [the `panic=abort` runtime][panic-runtime]
Copy link
Member

@RalfJung RalfJung Nov 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You changed the meaning of the text by making this about the final binary. I am not sure whether this is correct. I'd prefer if you could just restore the old wording, which was AFAIK reviewed by a bunch of people, and leave further improvements to a later PR so that the diff over the previous consensus can be more easily reviewed.

Copy link
Member

@RalfJung RalfJung Nov 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact I am fairly sure your new wording is wrong. If the unwinding comes from an external crate, enters through a -Cpanic=unwind crate, and then propagates into a -Cpanic=abort crate, that is UB, even if the binary is -Cpanic=unwind.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We certainly talked about the original text a lot, but my impression was that you still considered it confusing. I started the change because it seemed related to the entirely-new "foreign linking" section, but seemed even more confusing in its original form when juxtaposed with the new section.

I believe the rule has always been about whether or not the "abort" runtime is linked, and since only one panic runtime can be linked in a final Rust binary, I don't understand how the introduction of the word "final" changes this. I can remove the word "final", though.

A panic=abort crate will have shims for aborting the process in the presence of unwinding at extern boundaries. If there's no extern boundary, then the two crates are compiled as source code so they can't have different panic modes in the first place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's no extern boundary, then the two crates are compiled as source code so they can't have different panic modes in the first place.

That's not right. You can link together different crates with different flags.

Copy link
Member

@RalfJung RalfJung Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old text was:

No crate may be linked with [the `panic=abort` runtime][panic-runtime] if it has
both of the following characteristics:
* It contains a call to an `-unwind` foreign function or function pointer
* It was compiled with `panic=unwind`

Actually looking at this I think there is a mistake here -- the problematic case is linking a crate with the "unwind" runtime, but compiling it with panic=abort, isn't it?

panic=abort sets a bunch of "nounwind" flags, and ultimately what this here is all about is preventing those from causing UB. So for a concrete example:

  • crate A is panic=unwind, calls some C-unwind function from a "Rust" ABI function
  • crate B is panic=abort, calls crate A and marks that call nounwind
  • now we have UB if that C-unwind call ever unwinds

It doesn't even matter which panic runtime is linked in, I think? This is about panics from the outside world after all.

The panic runtime would come in in a situation like this:

  • crate A is panic=abort, calls e.g. unwrap and marks that call with nounwind since it has the "Rust" ABI
  • the panic_unwind runtime is linked in
  • now we have UB if that unwrap ever fails

Not sure where we say that that is UB.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* The final binary is linked with [the `panic=abort` runtime][panic-runtime]
* The program unwinds into a rust crate compiled with `panic=abort`

Would this be correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that sounds right. Cc @nbdd0121 to be sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But also, don't we have to say somewhere that if a crate is built with panic=abort, and later linked with the panic=unwind runtime, that's UB too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my view this is resolved by my later commits, but someone should take a look.


> **Note**: To protect against this undefined behavior, `rustc` does not permit
> linking the `panic=abort` runtime against any crate that was compiled with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment also likely needs updating, since it's not about the panic=abort runtime, it's about mixing panic=abort and panic=unwind crates irrespective of which runtime is ultimately linked in.

> `panic=unwind` if that crate also contains a call to a foreign function or
> function pointer declared with an `-unwind` ABI. Note that this prohibition
> applies even when linking a static or dynamic library that only includes Rust
> code, since the resulting library may be subsequently linked against another
> library that may unwind. However, use of the `Rust` (default) ABI does not
> cause a link-error, since that ABI is not expected to be used as an
> entrypoint into a static or shared library.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the last sentence.

Also, this should explain the gap in this rustc check. Clearly there's a gap, otherwise we wouldn't have to document this UB.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UB is only possible when using a linker other than rustc (a "foreign linker" as described in the section above), so AFAIK there is no gap in rustc itself.

This sentence explains why the Rust ABI is not included in the lint, which is what you were originally asking about before I changed the verbiage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's existing text so I missed it. It should probably mention dlopen-like situations? Even if not mixing Rust an non-Rust code, one may be using a non-rustc linker.

The entire section should then probably say:

If you are using a foreign linker, the following requirements must be upheld: [...]

And then the note can just say that when using rustc as a linker, those requirements are enforced automatically.

This sentence explains why the Rust ABI is not included in the lint, which is what you were originally asking about before I changed the verbiage.

This section isn't about the lint though? It's about a hard error on panic option mismatches. That one can't rely on what is "expected" to happen, it should be sound. Did the sentence end up in the wrong paragraph?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will have to wait for @BatmanAoD then.

Copy link
Member

@RalfJung RalfJung Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have pushed proposed wording to https://github.com/RalfJung/reference/commits/c-unwind-documentation/. Unfortunately I can't push to this PR.
(EDIT: I deleted that branch and added the commits in this PR as @BatmanAoD gave me access, thanks a lot. :)

I sure hope what that text says is correct, but it would be good if @BatmanAoD or @nbdd0121 could take a look. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have time right now to look, but I've added both you and Gary Guo as collaborators on my fork. I do hope to get confirmation from him because he understands this edge condition better than I do, I think.

r[link.foreign-code.prohibited.lint.ffi_unwind_calls]
To guarantee that a library will be sound (and linkable with `rustc`)
regardless of the panic mode used at link-time, the [`ffi_unwind_calls` lint]
may be used. The lint flags any calls to `-unwind` foreign functions or
function pointers.

> **Note**: the restriction can only be violated when mixing code with different
> `-C panic` flags. This is not possible in normal use of cargo, so most users
> need not be concerned about this.
chorman0773 marked this conversation as resolved.
Show resolved Hide resolved
[`cfg` attribute `target_feature` option]: conditional-compilation.md#target_feature
[`ffi_unwind_calls` lint]: ../rustc/lints/listing/allowed-by-default.html#ffi-unwind-calls
[configuration option]: conditional-compilation.md
[panic-runtime]: panic.md#panic-runtimes
[procedural macros]: procedural-macros.md
Loading