Compilation error when matching reference to empty enum #131452
Comments
For more background (but partially outdated), also see this blog post by @nikomatsakis. Cc @rust-lang/opsem @rust-lang/types @Nadrieril My comment referred to the fact that the safety invariant is uncontroversial in your example, and sufficient to ensure that without unsafe code, nothing weird can happen -- in the specific case considered here, where the match is on a reference. There are related, more subtle cases though. For instance, what if Also for references, we may allow code to temporarily break the So my position on this heavily depends on whether the place we are matching on is "safe" or not. A place is "unsafe" if it involves For safe places, I am much less concerned. Maybe we want an opt-in lint that recommends never patterns to make this explicit. Maybe we want the lint to be warn-by-default when there is unsafe code nearby? But that seems hard and we don't have any other lint like this. There's also the question of irrefutable patterns exploiting empty types, such as |
Can't we make a distinction between unsafe and safe places and auto-scrutinise the discriminant of safe places? IOW declare that |
Yes, that is basically what I am suggesting. It is a new distinction between two kinds of places that I don't think we already have, so it's not an entirely trivial extension of the status quo.
|
Well, since |
+1 to most of what Ralf just said. I'm in favor of making the
As far as I remember we hadn't yet committed to that, at least in more complex cases like
We can suggest |
Note that from a user perspective, this issue happens quickly when one wants to implement the "Trees that Grow" compiler AST pattern in Rust:
enum Never {}
trait Phase {
    type PhaseData;
}
struct PhaseWithInhabited;
impl Phase for PhaseWithInhabited {
    type PhaseData = Never;
}
enum Ast<P: Phase> {
    Identifier(&'static str),
    MaybeSomething(P::PhaseData)
}
fn main() {
    let ast = Ast::<PhaseWithInhabited>::Identifier("hi");
    let s = match &ast {
        Ast::Identifier(s) => s
    };
}
Tada:
Note, as a user, I'll be happy if the compiler would fill the inhabited pattern with |
With rust-lang/rfcs#3719, you could write this as:
let s = match &ast {
    Ast::Identifier(s) => s,
    ! // explicitly mark other cases as impossible
}; |
That would of course be better than the status quo! It would still be a bit unexpected for new users that match by value and match by reference have different semantics regarding inhabited cases, but if there is a good reason, I guess everyone can live with that! |
For matching on references I'd say this is deep in the undecided territory. For matching on |
Yes, my concern is only about references, as they are very common in safe code (as in my AST example). For raw pointers and unsafe code, I have no opinion, as people writing unsafe code are (hopefully) typically way more advanced into Rust than people writing safe code, so they could certainly bear the increased cognitive load of a more complex mental model. |
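For the reference case discussed above, one workaround that should compile on stable today can be sketched as follows. This is only an illustration built on the AST example from this thread: the helper name `identifier` is hypothetical, and the explicit arm stands in for what a never pattern or an automatic rule would otherwise express.

```rust
enum Never {}

trait Phase {
    type PhaseData;
}

struct PhaseWithInhabited;

impl Phase for PhaseWithInhabited {
    type PhaseData = Never;
}

#[allow(dead_code)]
enum Ast<P: Phase> {
    Identifier(&'static str),
    MaybeSomething(P::PhaseData),
}

// Hypothetical helper: matches through a reference, so the variant holding
// `Never` still needs an arm. The arm body is an empty match on the place
// `*never`, which the compiler accepts because `Never` has no variants.
fn identifier(ast: &Ast<PhaseWithInhabited>) -> &'static str {
    match ast {
        Ast::Identifier(s) => *s,
        Ast::MaybeSomething(never) => match *never {},
    }
}

fn main() {
    let ast = Ast::<PhaseWithInhabited>::Identifier("hi");
    println!("{}", identifier(&ast));
}
```

The cost is one extra arm per vanishing variant, which is exactly the maintenance burden raised for the "Trees that Grow" pattern.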
The explicit never patterns RFC is exciting 👍 I just wanted to quickly explore (below) how far the discussed safe-Rust use cases would benefit from stabilisation of explicit never patterns, without an auto-never sugaring (or something equivalent such as exhaustive-patterns). To summarize in the introduction, never patterns are mostly intended for use in unsafe code, but they won't by themselves allow "safe land" to feel seamless without something like the auto-never rules or the
Latest example
In this latest example (playground link), according to the incomplete
let _s = match &ast {
    Ast::Identifier(s) => s,
    Ast::MaybeSomething(!),
};
Although I think there is discussion on the current RFC about possibly loosening that? I've posted a comment on the RFC clarifying.
Macro-generated nested enum example
Looking at the nested enum example from the
Side note - newcomer introduction to the complexities
I wanted to share my own (ineloquent, and hopefully not too inaccurate) explanation of some of the complexities on display here, inspired by this great one by Nadrieril, to help future me and other newcomers come up to speed: In "safe land" dealing with references, the verbosity of never patterns feels unergonomic because references to uninhabited variants cannot be constructed in safe code (they don't satisfy the "safety" invariant of a type which can be assumed in safe code). So, in safe code, the match patterns feel like boilerplate to satisfy the compiler. The code compiling with a lack of patterns would communicate the same meaning as explicit never patterns more succinctly. But this isn't quite true in "unsafe land":
|
That sounds pretty good. :) Personally I am undecided at how far we want to go with "automatic never patterns" in safe code. But I can see how having to write |
In the case you mentioned, and in interaction with unsafe code, yes. But when using never types to make some enum variants vanish, which is the whole idea of the "Tree that Grow" mentioned above (in that pattern one could have several compiler phases and more than one branch that vanish in each phase), that is a maintenance hindrance. So I guess there are tradeoffs between various use cases. |
In my mind, there's a compelling argument in terms of user impact - I imagine ~2 orders of magnitude more people are writing matches or destructurings with partially uninhabited types / references (e.g. infallible results) than matching on pointers to partially valid objects. I generally like explicitness/awkwardness where it prompts the user with a necessary choice they need to handle / consider; but in safe-land code, never patterns just give the awkwardness, and don't have any pay-off in terms of improved handling or understanding. From a rust project organisation perspective, do you think this is the right place to discuss what auto-never could look like? Or is this best left to a different channel / medium? |
Musings I have around a potential auto-never feature include:
|
I think it makes more sense to either make this part of the never pattern RFC, or a dedicated companion RFC, depending on what @Nadrieril prefers. |
Yes, sorry my post was unclear. The clarity benefit was supposed to just be a benefit in For what it's worth I agree with you - my personal view is that in safe code, requiring explicit never patterns would result in slightly lower clarity, because the boilerplate is irrelevant, and is therefore a distraction from actual code/logic that the reader might need to care about. |
Quite so. Additionally, it sets macro authors up for failure. Every macro-generated |
I agree with all of this. I would prefer for this to be a separate discussion from never patterns: never patterns are useful on their own, and once we have them we can choose how much to elide them. I think this here issue is a good place to discuss this, with the option of going to the rust-lang Zulip to discuss finer points. |
I'm removing this comment under the acknowledgement that it's not helpful, but also not using the built-in hide feature because I feel like it would be confusing for understanding the existing discussion. It's preserved here for posterity.
Forgive me for not fully understanding the circumstances, but what's confusing me is this:
Even inside a Like, I get that this may not be how the compiler works in reality (still treating references as pointers, and thus having real pointers to nonexistent types even if there's no way to actually form them) but that just feels like the proper takeaway from this. I also checked to make sure I'm not being incorrect, and the following code requires
enum Void {}
const VOID: *const Void = core::ptr::dangling();
fn main() { match *VOID {} } |
There's no reference being created here, so I can't follow what you are saying, sorry.
|
I think there's possibly a bit of confusion in your message between e.g. pointers and references; and the difference between unsafe and UB. If the deep complexities of these topics are new to you (as they were to me a month or two ago - and I'm still coming up the curve), I'd suggest the following reading:
In terms of how it relates to this thread - a lot of this detail is thankfully irrelevant in safe-land code dealing with references (assuming unsafe-land code maintains its safety invariants on its interfaces!) so an ideal solution would have safe-land code not care about this complexity, and unsafe-land code have intuitive tools to handle this. Roughly speaking, the current direction looks to be:
The right order of tackling these appears a bit unclear, as any separate discussion ends up becoming a bit tangled up. But I'm sure Nadrieril will update us when they have a plan. |
I'm removing this comment under the acknowledgement that it's not helpful, but also not using the built-in hide feature because I feel like it would be confusing for understanding the existing discussion. It's preserved here for posterity.
I guess that my source of confusion here is still that I was under the impression that coercing a pointer into a reference comes with a few contracts that must be upheld, on pain of UB. Even though For example, I was under the assumption that simply creating a reference from an unaligned pointer is UB whether you read that reference or not. I figured that creating a reference from an uninhabited type, which literally cannot exist, also fell into this situation. But I guess my confusion here is that there's a case where unsafe code can both create an uninhabited reference and would want to avoid treating this immediately as UB, which feels confusing. Reading the 2018 post, it makes a little more sense, but I'm still not entirely convinced. The specific case concerns unions, particularly Like, I'm fine with the idea of creating references not being the source of UB, and instead accessing them. All this is fine. What's particularly bothersome to me is the implication that Like, don't get me wrong, I love never patterns as an explanation tool, but I'm still not convinced they should be added as a language syntax, because the positives don't really outweigh the negatives. Like, you already need to consider several non-obvious aspects of types in unsafe code, like ZSTs for example, and I'm not sure that uninhabitedness assertions are weird enough that they deserve gating reference patterns behind an entirely new, custom syntax for it. Also rereading this, it's clear I flipped between being against and for access-based UB in my explanation, which admittedly was one of the sources of confusion before.
But I'm not convinced that this deserves blocking basic ergonomics behind a special syntax that will complicate the fundamentals of match blocks just because it might be unintuitive that a match accesses a place and can thus invoke UB if it's invalid. |
That is correct. It is also entirely irrelevant in your example since that example does not involve any references.
No, that is not the case. Unsafe code can create a place to an uninhabited type, or an unaligned place. This is already the case today. The following code is well-defined:
let ptr = &raw const *(23 as *const i32);
let ptr = &raw const *(23 as *const !);
But places and references are very different things. Please carefully read the resources that were recommended to you above, in particular my recent blog post about places. You keep talking about references. Please explain why, since as I already said above, it really doesn't make sense in the context of this example, where there are no references involved:
enum Void {}
const VOID: *const Void = core::ptr::dangling();
fn main() { match *VOID {} } |
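To make the place-versus-reference distinction concrete, here is a small sketch under a few assumptions: it uses `std::ptr::addr_of!` (the macro form of `&raw const`), an empty enum `Void` in place of `!` so that it builds on stable, and a hypothetical helper named `place_address`. It only computes the address of the place `*p`; no reference is created and nothing is read.

```rust
use std::ptr::addr_of;

#[allow(dead_code)]
enum Void {}

// Hypothetical helper: names the place `*p` and takes its address again.
// No value of type `Void` is read and no `&Void` reference is formed.
fn place_address(p: *const Void) -> usize {
    // `unsafe` is written because a raw pointer is dereferenced syntactically;
    // newer compilers may report it as unnecessary in this position.
    #[allow(unused_unsafe)]
    let q: *const Void = unsafe { addr_of!(*p) };
    q as usize
}

fn main() {
    // An arbitrary non-null address, mirroring the `23 as *const _` example.
    let p = 23usize as *const Void;
    println!("{}", place_address(p));
}
```

By contrast, `match *p {}` on the same pointer would assert that the place holds a value of an uninhabited type, which is why it is treated differently from merely naming the place.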
I'm confused: the whole point of
There's nothing to be for or against, at least in the context of this issue: the kind of "access-based" that we're talking about is a core fact of rust unsafe semantics, it's not up for debate here. It seems to me that you're not familiar with the specific operational semantics context that's assumed when talking about this issue. This is not the place to explain this context, would you mind continuing this conversation on Zulip? |
I have a particular aversion to using Zulip for discussion for a number of reasons, although I will stop discussing this here since you're right that this isn't the place. However, I will add that I think that providing the operational semantics context specifically for this issue is an important piece of justifying why this particular change is blocked on some kind of never patterns, and I don't think that this issue nor the proposed RFC do a good enough job of explaining that. A collection of blog posts, while very helpful, does not constitute a sufficient motivation either: it should be combined together in a single explanation and I would like to see that. Also, FWIW, in hindsight I don't think that either of my explanations was particularly good, so, I'm editing them to make that clear while preserving them since they're otherwise important for understanding the discussion that happened here. |
Note
Useful background links which have been shared in comments:
min_exhaustive_patterns #119612 (comment) was stabilized a couple of months ago.
exhaustive_patterns feature #51085, particularly this post by Nadrieril summarising the subtleties.
Summary
I tried this code:
I expected to see this happen: This compiles successfully.
Instead, this happened: (E0004 compiler error, expandable below)
E0004 Compiler Error
Searching for "Rust E0004" links to docs that don't explain why references to uninhabited types need to be matched: https://doc.rust-lang.org/error_codes/E0004.html - you have to search for "references are always considered inhabited" which takes you to this issue from 2020 - more on this history in the Background section below.
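As an illustration of the error described in the title (not the snippet from the original report), the following sketch shows the shape of code that triggers E0004 and two forms that are accepted on stable. The names `Empty`, `via_deref` and `via_wildcard` are hypothetical.

```rust
#[allow(dead_code)]
enum Empty {}

// Rejected with E0004, because the scrutinee has type `&Empty` and
// references are currently always considered inhabited:
//
//     fn rejected(e: &Empty) -> &'static str { match e {} }

// Accepted: the scrutinee is the place `*e` of type `Empty`,
// so an empty match is exhaustive.
fn via_deref(e: &Empty) -> &'static str {
    match *e {}
}

// Accepted: a wildcard arm that can never run in safe code.
fn via_wildcard(e: &Empty) -> &'static str {
    match e {
        _ => unreachable!("references to uninhabited types are considered inhabited"),
    }
}

fn main() {
    // No value of `Empty` exists, so the functions can only be reached
    // through an `Option` that is always `None`.
    let nothing: Option<Empty> = None;
    println!("{:?}", nothing.as_ref().map(via_deref));
    println!("{:?}", nothing.as_ref().map(via_wildcard));
}
```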
Motivating example
This comes up commonly when creating macros which generate enums, e.g. a very simplified example (playground link):
Compiler Error
The fact that this doesn't work for empty enums is quite a gotcha, and I've seen this issue arise a few times as an edge case. In most cases, it wasn't caught until a few months after, when someone uses the macro to create an enum with no variants.
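A rough sketch of how a macro author might guard against this today, assuming a hypothetical macro `name_enum!` that generates an enum plus a `name` method: matching on `*self` rather than `self` keeps the generated code compiling even when the variant list is empty.

```rust
// Hypothetical macro for illustration: generates an enum of unit variants
// and a method returning the variant name.
macro_rules! name_enum {
    ($name:ident { $($variant:ident),* $(,)? }) => {
        #[allow(dead_code)]
        enum $name {
            $($variant),*
        }

        impl $name {
            #[allow(dead_code)]
            fn name(&self) -> &'static str {
                // `match *self` instead of `match self`: with zero variants
                // this expands to `match *self {}`, which is accepted, whereas
                // `match self {}` on `&Self` is rejected with E0004.
                match *self {
                    $( $name::$variant => stringify!($variant), )*
                }
            }
        }
    };
}

name_enum!(Color { Red, Green });
name_enum!(Nothing {});

fn main() {
    println!("{}", Color::Red.name());
    let none: Option<Nothing> = None;
    println!("{:?}", none.as_ref().map(Nothing::name));
}
```

This only works directly for variants that can be matched without moving out of `*self`; variants with non-`Copy` fields would need `ref` bindings in the generated patterns.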
But why would we ever use such a macro to generate empty enums? Well this can fall out naturally when generating a hierarchy of enums, where some inner enums are empty, e.g.
Workarounds
Various workarounds include:
_ => unreachable!("Workaround for empty enums: references to uninhabited types are considered inhabited at present")
match *self
match *self {}
match self { ! }
in the body as suggested by Ralf which is non-stable.
Background
This was previously raised in this issue: #78123 but was closed as "expected behaviour" - due to the fact that:
However, when I raised this as a motivating example in rust-lang/unsafe-code-guidelines#413 (comment), @RalfJung suggested I raise a new rustc issue for this, and that actually the match behaviour is independent of the UB semantics decision:
Meta
rustc --version --verbose:
Also works on nightly.