From 76da7731c8f24618843f30ef23e54cb72c2b3a39 Mon Sep 17 00:00:00 2001 From: Connor Horman Date: Wed, 28 Aug 2024 22:47:32 -0400 Subject: [PATCH 1/4] Add identifier syntax to keywords, lifetime-elision, macros, memory-allocation-and-lifetimes, names, and paths --- src/keywords.md | 27 ++++++- src/lifetime-elision.md | 48 ++++++++++-- src/macros.md | 22 ++++++ src/memory-allocation-and-lifetime.md | 4 + src/names.md | 53 ++++++++++++- src/paths.md | 106 ++++++++++++++++++++++---- 6 files changed, 237 insertions(+), 23 deletions(-) diff --git a/src/keywords.md b/src/keywords.md index 300922caf..e5bb2e76a 100644 --- a/src/keywords.md +++ b/src/keywords.md @@ -1,5 +1,7 @@ # Keywords +r[lex.keywords] + Rust divides keywords into three categories: * [strict](#strict-keywords) @@ -8,6 +10,9 @@ Rust divides keywords into three categories: ## Strict keywords +r[lex.keywords.strict] + +r[lex.keywords.strict.intro] These keywords can only be used in their correct contexts. They cannot be used as the names of: @@ -20,7 +25,8 @@ be used as the names of: * [Macro placeholders] * [Crates] -> **Lexer:**\ +r[lex.keywords.strict.list] +> **Lexer:**\ > KW_AS : `as`\ > KW_BREAK : `break`\ > KW_CONST : `const`\ @@ -57,6 +63,7 @@ be used as the names of: > KW_WHERE : `where`\ > KW_WHILE : `while` +r[lex.keywords.strict.edition2018] The following keywords were added beginning in the 2018 edition. > **Lexer 2018+**\ @@ -66,11 +73,15 @@ The following keywords were added beginning in the 2018 edition. ## Reserved keywords +r[lex.keywords.reserved] + +r[lex.keywords.reserved.intro] These keywords aren't used yet, but they are reserved for future use. They have the same restrictions as strict keywords. The reasoning behind this is to make current programs forward compatible with future versions of Rust by forbidding them to use these keywords. +r[lex.keywords.reserved.list] > **Lexer**\ > KW_ABSTRACT : `abstract`\ > KW_BECOME : `become`\ @@ -85,6 +96,7 @@ them to use these keywords. 
> KW_VIRTUAL : `virtual`\ > KW_YIELD : `yield` +r[lex.keywords.reserved.edition2018] The following keywords are reserved beginning in the 2018 edition. > **Lexer 2018+**\ @@ -92,12 +104,20 @@ The following keywords are reserved beginning in the 2018 edition. ## Weak keywords +r[lex.keywords.weak] + +r[lex.keywords.weak.intro] These keywords have special meaning only in certain contexts. For example, it is possible to declare a variable or method with the name `union`. +r[lex.keywords.weak.macro_rules] * `macro_rules` is used to create custom [macros]. + +r[lex.keywords.weak.union] * `union` is used to declare a [union] and is only a keyword when used in a union declaration. + +r[lex.keywords.weak.lifetime-static] * `'static` is used for the static lifetime and cannot be used as a [generic lifetime parameter] or [loop label] @@ -105,12 +125,15 @@ is possible to declare a variable or method with the name `union`. // error[E0262]: invalid lifetime parameter name: `'static` fn invalid_lifetime_parameter<'static>(s: &'static str) -> &'static str { s } ``` + +r[lex.keywords.weak.dyn] * In the 2015 edition, [`dyn`] is a keyword when used in a type position followed by a path that does not start with `::` or `<`, a lifetime, a question mark, a `for` keyword or an opening parenthesis. Beginning in the 2018 edition, `dyn` has been promoted to a strict keyword. +r[lex.keywords.weak.list] > **Lexer**\ > KW_MACRO_RULES : `macro_rules`\ > KW_UNION : `union`\ @@ -118,6 +141,8 @@ is possible to declare a variable or method with the name `union`. > > **Lexer 2015**\ > KW_DYN : `dyn` + +r[lex.keywords.weak.safe] * `safe` is used for functions and statics, which has meaning in [external blocks]. 
[items]: items.md diff --git a/src/lifetime-elision.md b/src/lifetime-elision.md index f509f3fee..70916c4e4 100644 --- a/src/lifetime-elision.md +++ b/src/lifetime-elision.md @@ -1,23 +1,38 @@ # Lifetime elision +r[lifetime-elision] + Rust has rules that allow lifetimes to be elided in various places where the compiler can infer a sensible default choice. ## Lifetime elision in functions +r[lifetime-elision.function] + +r[lifetime-elision.function.intro] In order to make common patterns more ergonomic, lifetime arguments can be *elided* in [function item], [function pointer], and [closure trait] signatures. The following rules are used to infer lifetime parameters for elided lifetimes. -It is an error to elide lifetime parameters that cannot be inferred. The -placeholder lifetime, `'_`, can also be used to have a lifetime inferred in the -same way. For lifetimes in paths, using `'_` is preferred. Trait object -lifetimes follow different rules discussed + +r[lifetime-elision.function.constraint] +It is an error to elide lifetime parameters that cannot be inferred. + +r[lifetime-elision.function.explicit-placeholder] +The placeholder lifetime, `'_`, can also be used to have a lifetime inferred in the +same way. For lifetimes in paths, using `'_` is preferred. + +r[lifetime-elision.function.only-functions] +Trait object lifetimes follow different rules discussed [below](#default-trait-object-lifetimes). +r[lifetime-elision.function.implicit-lifetime-parameters] * Each elided lifetime in the parameters becomes a distinct lifetime parameter. + +r[lifetime-elision.function.output-lifetime] * If there is exactly one lifetime used in the parameters (elided or not), that lifetime is assigned to *all* elided output lifetimes. 
+r[lifetime-elision.function.receiver-lifetime] In method signatures there is another rule * If the receiver has type `&Self` or `&mut Self`, then the lifetime of that @@ -75,27 +90,43 @@ fn frob(s: &str, t: &str) -> &str; // ILLEGAL ## Default trait object lifetimes +r[lifetime-elision.trait-object] + +r[lifetime-elision.trait-object.intro] The assumed lifetime of references held by a [trait object] is called its _default object lifetime bound_. These were defined in [RFC 599] and amended in [RFC 1156]. +r[lifetime-elision.trait-object.explicit-bound] These default object lifetime bounds are used instead of the lifetime parameter -elision rules defined above when the lifetime bound is omitted entirely. If -`'_` is used as the lifetime bound then the bound follows the usual elision +elision rules defined above when the lifetime bound is omitted entirely. + +r[lifetime-elision.trait-object.explicit-placeholder] +If `'_` is used as the lifetime bound then the bound follows the usual elision rules. +r[lifetime-elision.trait-object.containing-type] If the trait object is used as a type argument of a generic type then the containing type is first used to try to infer a bound. +r[lifetime-elision.trait-object.containing-type-unique] * If there is a unique bound from the containing type then that is the default + +r[lifetime-elision.trait-object.containing-type-explicit] * If there is more than one bound from the containing type then an explicit bound must be specified +r[lifetime-elision.trait-object.trait-bounds] If neither of those rules apply, then the bounds on the trait are used: +r[lifetime-elision.trait-object.trait-unique] * If the trait is defined with a single lifetime _bound_ then that bound is used. + +r[lifetime-elision.trait-object.static-lifetime] * If `'static` is used for any lifetime bound then `'static` is used. 
+ +r[lifetime-elision.trait-object.default] * If the trait has no lifetime bounds, then the lifetime is inferred in expressions and is `'static` outside of expressions. @@ -133,6 +164,7 @@ type T7<'a, 'b> = TwoBounds<'a, 'b, dyn Foo>; // Error: the lifetime bound for this object type cannot be deduced from context ``` +r[lifetime-elision.trait-object.innermost-type] Note that the innermost object sets the bound, so `&'a Box<dyn Foo>` is still `&'a Box<dyn Foo + 'static>`. @@ -151,6 +183,9 @@ impl<'a> dyn Bar<'a> + 'a {} ## `'static` lifetime elision +r[lifetime-elision.item] + +r[lifetime-elision.item.intro] Both [constant] and [static] declarations of reference types have *implicit* `'static` lifetimes unless an explicit lifetime is specified. As such, the constant declarations involving `'static` above may be written without the @@ -172,6 +207,7 @@ const BITS_N_STRINGS: BitsNStrings<'_> = BitsNStrings { }; ``` +r[lifetime-elision.item.fn-types] Note that if the `static` or `const` items include function or closure references, which themselves include references, the compiler will first try the standard elision rules. If it is unable to resolve the lifetimes by its diff --git a/src/macros.md b/src/macros.md index 719b9afbc..552e92013 100644 --- a/src/macros.md +++ b/src/macros.md @@ -1,17 +1,26 @@ # Macros +r[macro] + +r[macro.intro] The functionality and syntax of Rust can be extended with custom definitions called macros. They are given names, and invoked through a consistent syntax: `some_extension!(...)`. There are two ways to define new macros: +r[macro.rules] * [Macros by Example] define new syntax in a higher-level, declarative way. + +r[macro.proc] * [Procedural Macros] define function-like macros, custom derives, and custom attributes using functions that operate on input tokens. 
## Macro Invocation +r[macro.invocation] + +r[macro.invocation.syntax] > **Syntax**\ > _MacroInvocation_ :\ >    [_SimplePath_] `!` _DelimTokenTree_ @@ -29,17 +38,30 @@ There are two ways to define new macros: >    | [_SimplePath_] `!` `[` _TokenTree_\* `]` `;`\ >    | [_SimplePath_] `!` `{` _TokenTree_\* `}` +r[macro.invocation.intro] A macro invocation expands a macro at compile time and replaces the invocation with the result of the macro. Macros may be invoked in the following situations: +r[macro.invocation.expr] * [Expressions] and [statements] + +r[macro.invocation.pattern] * [Patterns] + +r[macro.invocation.type] * [Types] + +r[macro.invocation.item] * [Items] including [associated items] + +r[macro.invocation.nested] * [`macro_rules`] transcribers + +r[macro.invocation.extern] * [External blocks] +r[macro.invocation.item-statement] When used as an item or a statement, the _MacroInvocationSemi_ form is used where a semicolon is required at the end when not using curly braces. [Visibility qualifiers] are never allowed before a macro invocation or diff --git a/src/memory-allocation-and-lifetime.md b/src/memory-allocation-and-lifetime.md index 7a5bfc12c..267afbf52 100644 --- a/src/memory-allocation-and-lifetime.md +++ b/src/memory-allocation-and-lifetime.md @@ -1,9 +1,13 @@ # Memory allocation and lifetime +r[alloc] + +r[alloc.static] The _items_ of a program are those functions, modules, and types that have their value calculated at compile-time and stored uniquely in the memory image of the rust process. Items are neither dynamically allocated nor freed. +r[alloc.dynamic] The _heap_ is a general term that describes boxes. The lifetime of an allocation in the heap depends on the lifetime of the box values pointing to it. 
Since box values may themselves be passed in and out of frames, or stored diff --git a/src/names.md b/src/names.md index 44a20ab54..722fa0dbf 100644 --- a/src/names.md +++ b/src/names.md @@ -1,34 +1,49 @@ # Names +r[name] + +r[name.intro] An *entity* is a language construct that can be referred to in some way within the source program, usually via a [path]. Entities include [types], [items], [generic parameters], [variable bindings], [loop labels], [lifetimes], [fields], [attributes], and [lints]. +r[name.decl] A *declaration* is a syntactical construct that can introduce a *name* to refer to an entity. Entity names are valid within a [*scope*] --- a region of source text where that name may be referenced. +r[name.explicit-decl] Some entities are [explicitly declared](#explicitly-declared-entities) in the source code, and some are [implicitly declared](#implicitly-declared-entities) as part of the language or compiler extensions. -[*Paths*] are used to refer to an entity, possibly in another module or type. Lifetimes -and loop labels use a [dedicated syntax][lifetimes-and-loop-labels] using a +r[name.path] +[*Paths*] are used to refer to an entity, possibly in another module or type. + +r[name.lifetime] +Lifetimes and loop labels use a [dedicated syntax][lifetimes-and-loop-labels] using a leading quote. +r[name.namespace] Names are segregated into different [*namespaces*], allowing entities in different namespaces to share the same name without conflict. +r[name.resolution] [*Name resolution*] is the compile-time process of tying paths, identifiers, and labels to entity declarations. +r[name.visibility] Access to certain names may be restricted based on their [*visibility*]. 
## Explicitly declared entities +r[name.explicit] + +r[name.explicit.list] Entities that explicitly introduce a name in the source code are: +r[name.explicit.item-decl] * [Items]: * [Module declarations] * [External crate declarations] @@ -43,6 +58,8 @@ Entities that explicitly introduce a name in the source code are: * [External block items] * [`macro_rules` declarations] and [matcher metavariables] * [Implementation] associated items + +r[name.explicit.expr] * [Expressions]: * [Closure] parameters * [`while let`] pattern bindings @@ -50,35 +67,67 @@ Entities that explicitly introduce a name in the source code are: * [`if let`] pattern bindings * [`match`] pattern bindings * [Loop labels] + +r[name.explicit.generics] * [Generic parameters] + +r[name.explicit.higher-ranked-bounds] * [Higher ranked trait bounds] + +r[name.explicit.binding] * [`let` statement] pattern bindings + +r[name.explicit.macro_use] * The [`macro_use` attribute] can introduce macro names from another crate + +r[name.explicit.macro_export] * The [`macro_export` attribute] can introduce an alias for the macro into the crate root +r[name.explicit.macro-invocation] Additionally, [macro invocations] and [attributes] can introduce names by expanding to one of the above items. 
## Implicitly declared entities +r[name.implicit] + +r[name.implicit.list] The following entities are implicitly defined by the language, or are introduced by compiler options and extensions: +r[name.implicit.primitive-types] * [Language prelude]: * [Boolean type] --- `bool` * [Textual types] --- `char` and `str` * [Integer types] --- `i8`, `i16`, `i32`, `i64`, `i128`, `u8`, `u16`, `u32`, `u64`, `u128` * [Machine-dependent integer types] --- `usize` and `isize` * [floating-point types] --- `f32` and `f64` + +r[name.implicit.builtin-attributes] * [Built-in attributes] + +r[name.implicit.prelude] * [Standard library prelude] items, attributes, and macros + +r[name.implicit.stdlib] * [Standard library][extern-prelude] crates in the root module + +r[name.implicit.extern-prelude] * [External crates][extern-prelude] linked by the compiler + +r[name.implicit.tool-attributes] * [Tool attributes] + +r[name.implicit.lints] * [Lints] and [tool lint attributes] + +r[name.implicit.derive-helpers] * [Derive helper attributes] are valid within an item without being explicitly imported + +r[name.implicit.lifetime-static] * The [`'static`] lifetime +r[name.implicit.root] Additionally, the crate root module does not have a name, but can be referred to with certain [path qualifiers] or aliases. diff --git a/src/paths.md b/src/paths.md index 12748101d..c57984d3c 100644 --- a/src/paths.md +++ b/src/paths.md @@ -1,5 +1,8 @@ # Paths +r[path] + +r[path.intro] A *path* is a sequence of one or more path segments separated by `::` tokens. Paths are used to refer to [items], values, [types], [macros], and [attributes]. @@ -15,6 +18,9 @@ x::y::z; ### Simple Paths +r[path.simple] + +r[path.simple.syntax] > **Syntax**\ > _SimplePath_ :\ >    `::`? 
_SimplePathSegment_ (`::` _SimplePathSegment_)\* @@ -22,6 +28,7 @@ x::y::z; > _SimplePathSegment_ :\ >    [IDENTIFIER] | `super` | `self` | `crate` | `$crate` +r[path.simple.intro] Simple paths are used in [visibility] markers, [attributes], [macros][mbe], and [`use`] items. For example: @@ -35,6 +42,9 @@ mod m { ### Paths in expressions +r[path.expr] + +r[path.expr.syntax] > **Syntax**\ > _PathInExpression_ :\ >    `::`? _PathExprSegment_ (`::` _PathExprSegment_)\* @@ -64,9 +74,11 @@ mod m { > _GenericArgsBounds_ :\ >    [IDENTIFIER] _GenericArgs_? `:` [_TypeParamBounds_] +r[path.expr.intro] Paths in expressions allow for paths with generic arguments to be specified. They are used in various places in [expressions] and [patterns]. +r[path.expr.turbofish] The `::` token is required before the opening `<` for generic arguments to avoid ambiguity with the less-than operator. This is colloquially known as "turbofish" syntax. @@ -75,17 +87,23 @@ ambiguity with the less-than operator. This is colloquially known as "turbofish" Vec::::with_capacity(1024); ``` +r[path.expr.argument-order] The order of generic arguments is restricted to lifetime arguments, then type arguments, then const arguments, then equality constraints. +r[path.expr.complex-const-params] Const arguments must be surrounded by braces unless they are a [literal] or a single segment path. +r[path.expr.impl-trait-params] The synthetic type parameters corresponding to `impl Trait` types are implicit, and these cannot be explicitly specified. ## Qualified paths +r[path.qualified] + +r[path.qualified.syntax] > **Syntax**\ > _QualifiedPathInExpression_ :\ >    _QualifiedPathType_ (`::` _PathExprSegment_)+ @@ -96,6 +114,7 @@ and these cannot be explicitly specified. > _QualifiedPathInType_ :\ >    _QualifiedPathType_ (`::` _TypePathSegment_)+ +r[path.qualified.intro] Fully qualified paths allow for disambiguating the path for [trait implementations] and for specifying [canonical paths](#canonical-paths). 
When used in a type specification, it supports using the type syntax specified below. @@ -120,6 +139,9 @@ S::f(); // Calls the inherent impl. ### Paths in types +r[path.type] + +r[path.type.syntax] > **Syntax**\ > _TypePath_ :\ >    `::`? _TypePathSegment_ (`::` _TypePathSegment_)\* @@ -133,9 +155,11 @@ S::f(); // Calls the inherent impl. > _TypePathFnInputs_ :\ > [_Type_] (`,` [_Type_])\* `,`? +r[path.type.intro] Type paths are used within type definitions, trait bounds, type parameter bounds, and qualified paths. +r[path.type.turbofish] Although the `::` token is allowed before the generics arguments, it is not required because there is no ambiguity like there is in _PathInExpression_. @@ -157,16 +181,22 @@ type G = std::boxed::Box isize>; ## Path qualifiers +r[path.qualifier] + Paths can be denoted with various leading qualifiers to change the meaning of how it is resolved. ### `::` +r[path.qualifier.global-root] + +r[path.qualifier.global-root.intro] Paths starting with `::` are considered to be *global paths* where the segments of the path start being resolved from a place which differs based on edition. Each identifier in the path must resolve to an item. -> **Edition differences**: In the 2015 Edition, identifiers resolve from the "crate root" +r[path.qualifier.global-root.edition2015] +> **Edition Differences**: In the 2015 Edition, identifiers resolve from the "crate root" > (`crate::` in the 2018 edition), which contains a variety of different items, including > external crates, default crates such as `std` or `core`, and items in the top level of > the crate (including `use` imports). @@ -199,9 +229,15 @@ mod b { ### `self` -`self` resolves the path relative to the current module. `self` can only be used as the -first segment, without a preceding `::`. +r[path.qualifier.mod-self] + +r[path.qualifier.mod-self.intro] +`self` resolves the path relative to the current module. 
+ +r[path.qualifier.mod-self.restriction] +`self` can only be used as the first segment, without a preceding `::`. +r[path.qualifier.self-pat] In a method body, a path which consists of a single `self` segment resolves to the method's self parameter. @@ -221,16 +257,26 @@ impl S { ### `Self` +r[path.qualifier.type-self] + +r[path.qualifier.type-self.intro] `Self`, with a capital "S", is used to refer to the current type being implemented or defined. It may be used in the following situations: +r[path.qualifier.type-self.trait] * In a [trait] definition, it refers to the type implementing the trait. + +r[path.qualifier.type-self.impl] * In an [implementation], it refers to the type being implemented. When implementing a tuple or unit [struct], it also refers to the constructor in the [value namespace]. + +r[path.qualifier.type-self.type] * In the definition of a [struct], [enumeration], or [union], it refers to the type being defined. The definition is not allowed to be infinitely recursive (there must be an indirection). +r[path.qualifier.type-self.scope] The scope of `Self` behaves similarly to a generic parameter; see the [`Self` scope] section for more details. +r[path.qualifier.type-self.restriction] `Self` can only be used as the first segment, without a preceding `::`. The `Self` path cannot include generic arguments (as in `Self::<i32>`). @@ -274,8 +320,13 @@ struct NonEmptyList { ### `super` -`super` in a path resolves to the parent module. It may only be used in leading -segments of the path, possibly after an initial `self` segment. +r[path.qualifier.super] + +r[path.qualifier.super.intro] +`super` in a path resolves to the parent module. + +r[path.qualifier.super.restriction] +It may only be used in leading segments of the path, possibly after an initial `self` segment. 
```rust mod a { @@ -289,6 +340,7 @@ mod b { # fn main() {} ``` +r[path.qualifier.super.repetition] `super` may be repeated several times after the first `super` or `self` to refer to ancestor modules. @@ -310,8 +362,13 @@ mod a { ### `crate` -`crate` resolves the path relative to the current crate. `crate` can only be used as the -first segment, without a preceding `::`. +r[path.qualifier.crate] + +r[path.qualifier.crate.intro] +`crate` resolves the path relative to the current crate. + +r[path.qualifier.crate.restriction] +`crate` can only be used as the first segment, without a preceding `::`. ```rust fn foo() {} @@ -325,8 +382,14 @@ mod a { ### `$crate` +r[path.qualifier.macro-crate] + +r[path.qualifier.macro-crate.restriction] `$crate` is only used within [macro transcribers], and can only be used as the first -segment, without a preceding `::`. `$crate` will expand to a path to access items from the +segment, without a preceding `::`. + +r[path.qualifier.macro-crate.hygiene] +`$crate` will expand to a path to access items from the top level of the crate where the macro is defined, regardless of which crate the macro is invoked. @@ -344,11 +407,20 @@ macro_rules! inc { ## Canonical paths +r[path.canonical] + +r[path.canonical.intro] Items defined in a module or implementation have a *canonical path* that -corresponds to where within its crate it is defined. All other paths to these -items are aliases. The canonical path is defined as a *path prefix* appended by +corresponds to where within its crate it is defined. + +r[path.canonical.alias] +All other paths to these items are aliases. + +r[path.canonical.def] +The canonical path is defined as a *path prefix* appended by the path segment the item itself defines. +r[path.canonical.non-canonical] [Implementations] and [use declarations] do not have canonical paths, although the items that implementations define do have them. Items defined in block expressions do not have canonical paths. 
Items defined in a module that @@ -357,13 +429,19 @@ defined in an implementation that refers to an item without a canonical path, e.g. as the implementing type, the trait being implemented, a type parameter or bound on a type parameter, do not have canonical paths. -The path prefix for modules is the canonical path to that module. For bare -implementations, it is the canonical path of the item being implemented -surrounded by angle (`<>`) brackets. For -[trait implementations], it is the canonical path of the item being implemented +r[path.canonical.module-prefix] +The path prefix for modules is the canonical path to that module. + +r[path.canonical.bare-impl-prefix] +For bare implementations, it is the canonical path of the item being implemented +surrounded by angle (`<>`) brackets. + +r[path.canonical.trait-impl-prefix] +For [trait implementations], it is the canonical path of the item being implemented followed by `as` followed by the canonical path to the trait all surrounded in angle (`<>`) brackets. +r[path.canonical.local-canonical-path] The canonical path is only meaningful within a given crate. There is no global namespace across crates; an item's canonical path merely identifies it within the crate. From de598eb916ed782e9c96d5991a6a5b9effde3ccb Mon Sep 17 00:00:00 2001 From: Connor Horman Date: Thu, 29 Aug 2024 10:17:22 -0400 Subject: [PATCH 2/4] Add identifier syntax to patterns, runtime, and special-types-and-traits --- src/patterns.md | 190 ++++++++++++++++++++++++++++++++ src/runtime.md | 39 +++++-- src/special-types-and-traits.md | 102 ++++++++++++++++- 3 files changed, 318 insertions(+), 13 deletions(-) diff --git a/src/patterns.md b/src/patterns.md index 305699a74..600313e52 100644 --- a/src/patterns.md +++ b/src/patterns.md @@ -1,5 +1,8 @@ # Patterns +r[pattern] + +r[pattern.syntax] > **Syntax**\ > _Pattern_ :\ >       `|`? 
_PatternNoTopAlt_ ( `|` _PatternNoTopAlt_ )\* @@ -22,6 +25,7 @@ >    | [_PathPattern_]\ >    | [_MacroInvocation_] +r[pattern.intro] Patterns are used to match values against structures and to, optionally, bind variables to values inside these structures. They are also used in variable declarations and parameters for functions and closures. @@ -60,21 +64,40 @@ if let } ``` +r[pattern.usage] Patterns are used in: +r[pattern.let] * [`let` declarations](statements.md#let-statements) + +r[pattern.param] * [Function](items/functions.md) and [closure](expressions/closure-expr.md) parameters + +r[pattern.match] * [`match` expressions](expressions/match-expr.md) + +r[pattern.if-let] * [`if let` expressions](expressions/if-expr.md) + +r[pattern.while-let] * [`while let` expressions](expressions/loop-expr.md#predicate-pattern-loops) + +r[pattern.for] * [`for` expressions](expressions/loop-expr.md#iterator-loops) ## Destructuring +r[pattern.destructure] + +r[pattern.destructure.intro] Patterns can be used to *destructure* [structs], [enums], and [tuples]. Destructuring breaks up a value into its component pieces. The syntax used is almost the same as when creating such values. + +r[pattern.destructure.placeholder] In a pattern whose [scrutinee] expression has a `struct`, `enum` or `tuple` type, a placeholder (`_`) stands in for a *single* data field, whereas a wildcard `..` stands in for *all* the remaining fields of a particular variant. + +r[pattern.destructure.named-field-shorthand] When destructuring a data structure with named (but not numbered) fields, it is allowed to write `fieldname` as a shorthand for `fieldname: fieldname`. ```rust @@ -98,6 +121,8 @@ match message { ## Refutability +r[pattern.refutable] + A pattern is said to be *refutable* when it has the possibility of not being matched by the value it is being matched against. *Irrefutable* patterns, on the other hand, always match the value they are being matched against. 
Examples: @@ -114,6 +139,9 @@ if let (a, 3) = (1, 2) { // "(a, 3)" is refutable, and will not match ## Literal patterns +r[pattern.literal] + +r[pattern.literal.syntax] > **Syntax**\ > _LiteralPattern_ :\ >       `true` | `false`\ @@ -139,12 +167,14 @@ if let (a, 3) = (1, 2) { // "(a, 3)" is refutable, and will not match [INTEGER_LITERAL]: tokens.md#integer-literals [FLOAT_LITERAL]: tokens.md#floating-point-literals +r[pattern.literal.intro] _Literal patterns_ match exactly the same value as what is created by the literal. Since negative numbers are not [literals], literal patterns also accept an optional minus sign before the literal, which acts like the negation operator. > [!WARNING] > C string and raw C string literals are accepted in literal patterns, but `&CStr` doesn't implement structural equality (`#[derive(Eq, PartialEq)]`) and therefore any such `match` on a `&CStr` will be rejected with a type error. +r[pattern.literal.refutable] Literal patterns are always refutable. Examples: @@ -162,15 +192,24 @@ for i in -2..5 { ## Identifier patterns +r[pattern.ident] + +r[pattern.ident.syntax] > **Syntax**\ > _IdentifierPattern_ :\ >       `ref`? `mut`? [IDENTIFIER] (`@` [_PatternNoTopAlt_] ) ? +r[pattern.ident.intro] Identifier patterns bind the value they match to a variable in the [value namespace]. + +r[pattern.ident.unique] The identifier must be unique within the pattern. + +r[pattern.ident.scope] The variable will shadow any variables of the same name in scope. The [scope] of the new binding depends on the context of where the pattern is used (such as a `let` binding or a `match` arm). +r[pattern.ident.bare] Patterns that consist of only an identifier, possibly with a `mut`, match any value and bind it to that identifier. This is the most commonly used pattern in variable declarations and parameters for functions and closures. 
@@ -181,6 +220,7 @@ fn sum(x: i32, y: i32) -> i32 { # } ``` +r[pattern.ident.scrutinized] To bind the matched value of a pattern to a variable, use the syntax `variable @ subpattern`. For example, the following binds the value 2 to `e` (not the entire range: the range here is a range subpattern). @@ -193,7 +233,10 @@ match x { } ``` +r[pattern.ident.move] By default, identifier patterns bind a variable to a copy of or move from the matched value depending on whether the matched value implements [`Copy`]. + +r[pattern.ident.ref] This can be changed to bind to a reference by using the `ref` keyword, or to a mutable reference using `ref mut`. For example: ```rust @@ -234,16 +277,24 @@ To make it valid, write the following: if let Person {name: ref person_name, age: 18..=150 } = value { } ``` +r[pattern.ident.ref-ignored] Thus, `ref` is not something that is being matched against. Its objective is exclusively to make the matched binding a reference, instead of potentially copying or moving what was matched. +r[pattern.ident.precedent] [Path patterns](#path-patterns) take precedence over identifier patterns. + +r[pattern.ident.constraint] It is an error if `ref` or `ref mut` is specified and the identifier shadows a constant. +r[pattern.ident.refutable] Identifier patterns are irrefutable if the `@` subpattern is irrefutable or the subpattern is not specified. ### Binding modes +r[pattern.ident.binding] + +r[pattern.ident.binding.intro] To service better ergonomics, patterns operate in different *binding modes* in order to make it easier to bind references to values. When a reference value is matched by a non-reference pattern, it will be automatically treated as a `ref` or `ref mut` binding. 
Example: @@ -255,16 +306,31 @@ if let Some(y) = x { } ``` +r[pattern.ident.binding.non-reference] *Non-reference patterns* include all patterns except bindings, [wildcard patterns](#wildcard-pattern) (`_`), [`const` patterns](#path-patterns) of reference types, and [reference patterns](#reference-patterns). +r[pattern.ident.binding.default-mode] If a binding pattern does not explicitly have `ref`, `ref mut`, or `mut`, then it uses the *default binding mode* to determine how the variable is bound. + +r[pattern.ident.binding.move] The default binding mode starts in "move" mode which uses move semantics. + +r[pattern.ident.binding.top-down] When matching a pattern, the compiler starts from the outside of the pattern and works inwards. + +r[pattern.ident.binding.auto-deref] Each time a reference is matched using a non-reference pattern, it will automatically dereference the value and update the default binding mode. + +r[pattern.ident.binding.ref] References will set the default binding mode to `ref`. + +r[pattern.ident.binding.ref-mut] Mutable references will set the mode to `ref mut` unless the mode is already `ref` in which case it remains `ref`. + +r[pattern.ident.binding.nested-references] If the automatically dereferenced value is still a reference, it is dereferenced and this process repeats. +r[pattern.ident.binding.mixed] Move bindings and reference bindings can be mixed together in the same pattern. Doing so will result in partial move of the object bound to and the object cannot be used afterwards. This applies only if the type cannot be copied. @@ -286,13 +352,21 @@ let Person { name, ref age } = person; ## Wildcard pattern +r[pattern.wildcard] + +r[pattern.wildcard] > **Syntax**\ > _WildcardPattern_ :\ >    `_` +r[pattern.wildcard.intro] The _wildcard pattern_ (an underscore symbol) matches any value. It is used to ignore values when they don't matter. 
+ +r[pattern.wildcard.struct-matcher] Inside other patterns it matches a single data field (as opposed to the `..` which matches the remaining fields). + +r[pattern.wildcard.no-binding] Unlike identifier patterns, it does not copy, move or borrow the value it matches. Examples: @@ -323,18 +397,25 @@ let RGBA{r: red, g: green, b: blue, a: _} = color; if let Some(_) = x {} ``` +r[pattern.wildcard.refutable] The wildcard pattern is always irrefutable. ## Rest patterns +r[pattern.rest] + > **Syntax**\ > _RestPattern_ :\ >    `..` +r[pattern.rest.intro] The _rest pattern_ (the `..` token) acts as a variable-length pattern which matches zero or more elements that haven't been matched already before and after. + +r[pattern.rest.constraint] It may only be used in [tuple](#tuple-patterns), [tuple struct](#tuple-struct-patterns), and [slice](#slice-patterns) patterns, and may only appear once as one of the elements in those patterns. It is also allowed in an [identifier pattern](#identifier-patterns) for [slice patterns](#slice-patterns) only. +r[pattern.rest.refutable] The rest pattern is always irrefutable. Examples: @@ -379,6 +460,9 @@ match tuple { ## Range patterns +r[pattern.range] + +r[pattern.range.syntax] > **Syntax**\ > _RangePattern_ :\ >       _RangeInclusivePattern_\ @@ -408,44 +492,67 @@ match tuple { >    | `-`? [FLOAT_LITERAL]\ >    | [_PathExpression_] +r[pattern.range.intro] *Range patterns* match scalar values within the range defined by their bounds. They comprise a *sigil* (one of `..`, `..=`, or `...`) and a bound on one or both sides. + +r[pattern.range.lower-bound] A bound on the left of the sigil is a *lower bound*. + +r[pattern.range.upper-bound] A bound on the right is an *upper bound*. +r[pattern.range.closed] A range pattern with both a lower and upper bound will match all values between and including both of its bounds. It is written as its lower bound, followed by `..` for end-exclusive or `..=` for end-inclusive, followed by its upper bound. 
+ +r[pattern.range.type] The type of the range pattern is the type unification of its upper and lower bounds. For example, a pattern `'m'..='p'` will match only the values `'m'`, `'n'`, `'o'`, and `'p'`. Similarly, `'m'..'p'` will match only `'m'`, `'n'` and `'o'`, specifically **not** including `'p'`. +r[pattern.range.constraint-less-than] The lower bound cannot be greater than the upper bound. That is, in `a..=b`, a ≤ b must be the case. For example, it is an error to have a range pattern `10..=0`. +r[pattern.range.open-below] A range pattern with only a lower bound will match any value greater than or equal to the lower bound. It is written as its lower bound followed by `..`, and has the same type as its lower bound. For example, `1..` will match 1, 9, or 9001, or 9007199254740991 (if it is of an appropriate size), but not 0, and not negative numbers for signed integers. +r[pattern.range.open-above] A range pattern with only an upper bound matches any value less than or equal to the upper bound. It is written as `..=` followed by its upper bound, and has the same type as its upper bound. For example, `..=10` will match 10, 1, 0, and for signed integer types, all negative values. +r[pattern.range.constraint-slice] Range patterns with only one bound cannot be used as the top-level pattern for subpatterns in [slice patterns](#slice-patterns). +r[pattern.range.bound] The bounds is written as one of: * A character, byte, integer, or float literal. * A `-` followed by an integer or float literal. * A [path] +r[pattern.range.constraint-bound-path] If the bounds is written as a path, after macro resolution, the path must resolve to a constant item of the type `char`, an integer type, or a float type. +r[pattern.range.value] The type and value of the bounds is dependent upon how it is written out. + +r[pattern.range.path-value] If the bounds is a [path], the pattern has the type and value of the [constant] the path resolves to. 
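As a minimal sketch of path bounds, the following uses two constants as the lower and upper bound of a range pattern. The names `LOW`, `HIGH`, and `in_band` are hypothetical and exist only for this illustration; the pattern takes its type (`u8`) and values from the constants the paths resolve to.

```rust
// Hypothetical constants used as range bounds; both are `u8`.
const LOW: u8 = 10;
const HIGH: u8 = 20;

/// Returns true when `n` falls inside the inclusive range `LOW..=HIGH`.
fn in_band(n: u8) -> bool {
    // Each bound is a path that resolves to a constant item.
    matches!(n, LOW..=HIGH)
}
```

Because the bounds are constants rather than literals, changing `LOW` or `HIGH` changes every pattern that refers to them.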
+ +r[pattern.range.float-restriction] For float range patterns, the constant may not be a `NaN`. + +r[pattern.range.literal-value] If it is a literal, it has the type and value of the corresponding [literal expression]. + +r[pattern.range.negation] If is a literal preceded by a `-`, it has the same type as the corresponding [literal expression] and the value of [negating] the value of the corresponding literal expression. Examples: @@ -523,19 +630,28 @@ println!("{}", match 0xfacade { }); ``` +r[pattern.range.refutable] Range patterns for fix-width integer and `char` types are irrefutable when they span the entire set of possible values of a type. For example, `0u8..=255u8` is irrefutable. + +r[pattern.range.refutable-integer] The range of values for an integer type is the closed range from its minimum to maximum value. + +r[pattern.range.refutable-char] The range of values for a `char` type are precisely those ranges containing all Unicode Scalar Values: `'\u{0000}'..='\u{D7FF}'` and `'\u{E000}'..='\u{10FFFF}'`. > **Edition differences**: Before the 2021 edition, range patterns with both a lower and upper bound may also be written using `...` in place of `..=`, with the same meaning. ## Reference patterns +r[pattern.ref] + +r[pattern.ref.syntax] > **Syntax**\ > _ReferencePattern_ :\ >    (`&`|`&&`) `mut`? [_PatternWithoutRange_] +r[pattern.ref.intro] Reference patterns dereference the pointers that are being matched and, thus, borrow them. For example, these two matches on `x: &i32` are equivalent: @@ -549,14 +665,20 @@ let b = match int_reference { &0 => "zero", _ => "some" }; assert_eq!(a, b); ``` +r[pattern.ref.ref-ref] The grammar production for reference patterns has to match the token `&&` to match a reference to a reference because it is a token by itself, not two `&` tokens. +r[pattern.ref.mut] Adding the `mut` keyword dereferences a mutable reference. The mutability must match the mutability of the reference. 
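The mutability rule can be sketched as follows. The function names `read` and `read_mut` are assumptions made for this example: a `&` pattern is used against a shared reference and a `&mut` pattern against a mutable reference.

```rust
/// Reads the value behind a shared reference using a `&` pattern.
fn read(r: &i32) -> i32 {
    // `&value` dereferences `r`; `value` is an `i32` copied out of it.
    let &value = r;
    value
}

/// Reads the value behind a mutable reference using a `&mut` pattern.
fn read_mut(r: &mut i32) -> i32 {
    // The pattern's mutability matches the mutability of the reference.
    let &mut value = r;
    value
}
```

Swapping the patterns (for example, `let &value = r;` with `r: &mut i32`) is expected to be rejected with a mismatched-types error.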
+r[pattern.ref.refutable] Reference patterns are always irrefutable. ## Struct patterns +r[pattern.struct] + +r[pattern.struct.syntax] > **Syntax**\ > _StructPattern_ :\ >    [_PathInExpression_] `{`\ @@ -585,9 +707,11 @@ Reference patterns are always irrefutable. [_OuterAttribute_]: attributes.md [TUPLE_INDEX]: tokens.md#tuple-index +r[pattern.struct.intro] Struct patterns match struct, enum, and union values that match all criteria defined by its subpatterns. They are also used to [destructure](#destructuring) a struct, enum, or union value. +r[pattern.struct.ignore-rest] On a struct pattern, the fields are referenced by name, index (in the case of tuple structs) or ignored by use of `..`: ```rust @@ -630,6 +754,7 @@ match m { } ``` +r[pattern.struct.constraint-struct] If `..` is not used, a struct pattern used to match a struct is required to specify all fields: ```rust @@ -649,8 +774,10 @@ match struct_value { } ``` +r[pattern.struct.constraint-union] A struct pattern used to match a union must specify exactly one field (see [Pattern matching on unions]). +r[pattern.struct.binding-shorthand] The `ref` and/or `mut` _IDENTIFIER_ syntax matches any value and binds it to a variable with the same name as the given field. ```rust @@ -664,10 +791,14 @@ The `ref` and/or `mut` _IDENTIFIER_ syntax matches any value and binds it to a v let Struct{a: x, b: y, c: z} = struct_value; // destructure all fields ``` +r[pattern.struct.refutable] A struct pattern is refutable if the _PathInExpression_ resolves to a constructor of an enum with more than one variant, or one of its subpatterns is refutable. ## Tuple struct patterns +r[pattern.tuple-struct] + +r[pattern.tuple-struct.syntax] > **Syntax**\ > _TupleStructPattern_ :\ >    [_PathInExpression_] `(` _TupleStructItems_? `)` @@ -675,13 +806,18 @@ A struct pattern is refutable if the _PathInExpression_ resolves to a constructo > _TupleStructItems_ :\ >    [_Pattern_] ( `,` [_Pattern_] )\* `,`? 
+r[pattern.tuple-struct.intro] Tuple struct patterns match tuple struct and enum values that match all criteria defined by its subpatterns. They are also used to [destructure](#destructuring) a tuple struct or enum value. +r[pattern.tuple-struct.refutable] A tuple struct pattern is refutable if the _PathInExpression_ resolves to a constructor of an enum with more than one variant, or one of its subpatterns is refutable. ## Tuple patterns +r[pattern.tuple] + +r[pattern.tuple.syntax] > **Syntax**\ > _TuplePattern_ :\ >    `(` _TuplePatternItems_? `)` @@ -691,11 +827,14 @@ A tuple struct pattern is refutable if the _PathInExpression_ resolves to a cons >    | [_RestPattern_]\ >    | [_Pattern_] (`,` [_Pattern_])+ `,`? +r[pattern.tuple.intro] Tuple patterns match tuple values that match all criteria defined by its subpatterns. They are also used to [destructure](#destructuring) a tuple. +r[pattern.tuple.rest-syntax] The form `(..)` with a single [_RestPattern_] is a special form that does not require a comma, and matches a tuple of any size. +r[pattern.tuple.refutable] The tuple pattern is refutable when one of its subpatterns is refutable. An example of using tuple patterns: @@ -710,10 +849,14 @@ assert_eq!(b, "ten"); ## Grouped patterns +r[pattern.paren] + +r[pattern.paren.syntax] > **Syntax**\ > _GroupedPattern_ :\ >    `(` [_Pattern_] `)` +r[pattern.paren.intro] Enclosing a pattern in parentheses can be used to explicitly control the precedence of compound patterns. For example, a reference pattern next to a range pattern such as `&0..=5` is ambiguous and is not allowed, but can be expressed with parentheses. @@ -727,6 +870,9 @@ match int_reference { ## Slice patterns +r[pattern.slice] + +r[pattern.slice.syntax] > **Syntax**\ > _SlicePattern_ :\ >    `[` _SlicePatternItems_? `]` @@ -734,6 +880,7 @@ match int_reference { > _SlicePatternItems_ :\ >    [_Pattern_] \(`,` [_Pattern_])\* `,`? 
+r[pattern.slice.intro] Slice patterns can match both arrays of fixed size and slices of dynamic size. ```rust @@ -754,21 +901,30 @@ match v[..] { }; ``` +r[pattern.slice.refutable-array] Slice patterns are irrefutable when matching an array as long as each element is irrefutable. + +r[pattern.slice.refutable-slice] When matching a slice, it is irrefutable only in the form with a single `..` [rest pattern](#rest-patterns) or [identifier pattern](#identifier-patterns) with the `..` rest pattern as a subpattern. +r[pattern.slice.restriction] Within a slice, a range pattern without both lower and upper bound must be enclosed in parentheses, as in `(a..)`, to clarify it is intended to match against a single slice element. A range pattern with both lower and upper bound, like `a..=b`, is not required to be enclosed in parentheses. ## Path patterns +r[pattern.path] + +r[pattern.path.syntax] > **Syntax**\ > _PathPattern_ :\ >       [_PathExpression_] +r[pattern.path.intro] _Path patterns_ are patterns that refer either to constant values or to structs or enum variants that have no fields. +r[pattern.path.unqualified] Unqualified path patterns can refer to: * enum variants @@ -776,41 +932,68 @@ Unqualified path patterns can refer to: * constants * associated constants +r[pattern.path.qualified] Qualified path patterns can only refer to associated constants. +r[pattern.path.refutable] Path patterns are irrefutable when they refer to structs or an enum variant when the enum has only one variant or a constant whose type is irrefutable. They are refutable when they refer to refutable constants or enum variants for enums with multiple variants. ### Constant patterns +r[pattern.const] + +r[pattern.const.partial-eq] When a constant `C` of type `T` is used as a pattern, we first check that `T: PartialEq`. 
+ +r[pattern.const.structural-equality] Furthermore we require that the value of `C` *has (recursive) structural equality*, which is defined recursively as follows: +r[pattern.const.primitive] - Integers as well as `str`, `bool` and `char` values always have structural equality. + +r[pattern.const.builtin-aggregate] - Tuples, arrays, and slices have structural equality if all their fields/elements have structural equality. (In particular, `()` and `[]` always have structural equality.) + +r[pattern.const.ref] - References have structural equality if the value they point to has structural equality. + +r[pattern.const.aggregate] - A value of `struct` or `enum` type has structural equality if its `PartialEq` instance is derived via `#[derive(PartialEq)]`, and all fields (for enums: of the active variant) have structural equality. + +r[pattern.const.pointer] - A raw pointer has structural equality if it was defined as a constant integer (and then cast/transmuted). + +r[pattern.const.float] - A float value has structural equality if it is not a `NaN`. + +r[pattern.const.exhaustive] - Nothing else has structural equality. +r[pattern.const.generic] In particular, the value of `C` must be known at pattern-building time (which is pre-monomorphization). This means that associated consts that involve generic parameters cannot be used as patterns. +r[pattern.const.translation] After ensuring all conditions are met, the constant value is translated into a pattern, and now behaves exactly as-if that pattern had been written directly. In particular, it fully participates in exhaustiveness checking. (For raw pointers, constants are the only way to write such patterns. Only `_` is ever considered exhaustive for these types.) ## Or-patterns +r[pattern.or] + _Or-patterns_ are patterns that match on one of two or more sub-patterns (for example `A | B | C`). They can nest arbitrarily. 
Syntactically, or-patterns are allowed in any of the places where other patterns are allowed (represented by the _Pattern_ production), with the exceptions of `let`-bindings and function and closure arguments (represented by the _PatternNoTopAlt_ production). ### Static semantics +r[pattern.constraints] + +r[pattern.constraints.pattern] 1. Given a pattern `p | q` at some depth for some arbitrary patterns `p` and `q`, the pattern is considered ill-formed if: + the type inferred for `p` does not unify with the type inferred for `q`, or @@ -819,12 +1002,14 @@ Syntactically, or-patterns are allowed in any of the places where other patterns Unification of types is in all instances aforementioned exact and implicit [type coercions] do not apply. +r[pattern.constraints.match-type-check] 2. When type checking an expression `match e_s { a_1 => e_1, ... a_n => e_n }`, for each match arm `a_i` which contains a pattern of form `p_i | q_i`, the pattern `p_i | q_i` is considered ill formed if, at the depth `d` where it exists the fragment of `e_s` at depth `d`, the type of the expression fragment does not unify with `p_i | q_i`. +r[pattern.constraints.exhaustiveness-or-pattern] 3. With respect to exhaustiveness checking, a pattern `p | q` is considered to cover `p` as well as `q`. For some constructor `c(x, ..)` the distributive law applies such that `c(p | q, ..rest)` covers the same set of value as `c(p, ..rest) | c(q, ..rest)` does. This can be applied recursively until there are no more nested patterns of form `p | q` other than those that exist at the top level. @@ -834,6 +1019,9 @@ Syntactically, or-patterns are allowed in any of the places where other patterns ### Dynamic semantics +r[pattern.behavior] + +r[pattern.behavior.nested-or-patterns] 1. 
The dynamic semantics of pattern matching a scrutinee expression `e_s` against a pattern `c(p | q, ..rest)` at depth `d` where `c` is some constructor, `p` and `q` are arbitrary patterns, and `rest` is optionally any remaining potential factors in `c`, @@ -841,6 +1029,8 @@ Syntactically, or-patterns are allowed in any of the places where other patterns ### Precedence with other undelimited patterns +r[pattern.precedence] + As shown elsewhere in this chapter, there are several types of patterns that are syntactically undelimited, including identifier patterns, reference patterns, and or-patterns. Or-patterns always have the lowest-precedence. This allows us to reserve syntactic space for a possible future type ascription feature and also to reduce ambiguity. diff --git a/src/runtime.md b/src/runtime.md index a673834f8..d5b7e5b16 100644 --- a/src/runtime.md +++ b/src/runtime.md @@ -1,13 +1,25 @@ # The Rust runtime +r[runtime] + This section documents features that define some aspects of the Rust runtime. ## The `panic_handler` attribute +r[runtime.panic_handler] + +r[runtime.panic_handler.constraint] The *`panic_handler` attribute* can only be applied to a function with signature -`fn(&PanicInfo) -> !`. The function marked with this [attribute] defines the behavior of panics. The -[`PanicInfo`] struct contains information about the location of the panic. There must be a single -`panic_handler` function in the dependency graph of a binary, dylib or cdylib crate. +`fn(&PanicInfo) -> !`. + +r[runtime.panic_handler.intro] +The function marked with this [attribute] defines the behavior of panics. + +r[runtime.panic_handler.panic-info] +The [`PanicInfo`] struct contains information about the location of the panic. + +r[runtime.panic_handler.unique] +There must be a single `panic_handler` function in the dependency graph of a binary, dylib or cdylib crate. Below is shown a `panic_handler` function that logs the panic message and then halts the thread. 
@@ -45,6 +57,8 @@ fn panic(info: &PanicInfo) -> ! { ### Standard behavior +r[runtime.panic_handler.std] + The standard library provides an implementation of `panic_handler` that defaults to unwinding the stack but that can be [changed to abort the process][abort]. The standard library's panic behavior can be modified at @@ -52,21 +66,32 @@ runtime with the [set_hook] function. ## The `global_allocator` attribute +r[runtime.global_allocator] + The *`global_allocator` attribute* is used on a [static item] implementing the [`GlobalAlloc`] trait to set the global allocator. ## The `windows_subsystem` attribute +r[runtime.windows_subsystem] + +r[runtime.windows_subsystem.intro] The *`windows_subsystem` attribute* may be applied at the crate level to set -the [subsystem] when linking on a Windows target. It uses the -[_MetaNameValueStr_] syntax to specify the subsystem with a value of either -`console` or `windows`. This attribute is ignored on non-Windows targets, and -for non-`bin` [crate types]. +the [subsystem] when linking on a Windows target. + +r[runtime.windows_subsystem.restriction] +It uses the [_MetaNameValueStr_] syntax to specify the subsystem with a value of either +`console` or `windows`. + +r[runtime.windows_subsystem.ignored] +This attribute is ignored on non-Windows targets, and for non-`bin` [crate types]. +r[runtime.windows_subsystem.console] The "console" subsystem is the default. If a console process is run from an existing console then it will be attached to that console, otherwise a new console window will be created. +r[runtime.windows_subsystem.windows] The "windows" subsystem is commonly used by GUI applications that do not want to display a console window on startup. It will run detached from any existing console. 
diff --git a/src/special-types-and-traits.md b/src/special-types-and-traits.md index f7e98323d..054802677 100644 --- a/src/special-types-and-traits.md +++ b/src/special-types-and-traits.md @@ -1,120 +1,199 @@ # Special types and traits +r[lang-types] + +r[lang-types.intro] Certain types and traits that exist in [the standard library] are known to the Rust compiler. This chapter documents the special features of these types and traits. ## `Box` +r[lang-types.box] + +r[lang-types.box.intro] [`Box`] has a few special features that Rust doesn't currently allow for user defined types. +r[lang-types.box.deref] * The [dereference operator] for `Box` produces a place which can be moved from. This means that the `*` operator and the destructor of `Box` are built-in to the language. + +r[lang-types.box.reciever] * [Methods] can take `Box` as a receiver. + +r[lang-types.box.fundamental] * A trait may be implemented for `Box` in the same crate as `T`, which the [orphan rules] prevent for other generic types. + + ## `Rc` +r[lang-types.rc] + +r[lang-types.rc.receiver] [Methods] can take [`Rc`] as a receiver. ## `Arc` +r[lang-types.arc] + +r[lang-types.arc.receiver] [Methods] can take [`Arc`] as a receiver. ## `Pin

<P>` +r[lang-types.pin] + +r[lang-types.pin.receiver] [Methods] can take [`Pin<P>

`] as a receiver. ## `UnsafeCell` +r[lang-types.unsafe-cell] + +r[lang-types.unsafe-cell.interior-mut] [`std::cell::UnsafeCell`] is used for [interior mutability]. It ensures that the compiler doesn't perform optimisations that are incorrect for such types. + +r[lang-types.unsafe-cell.read-only-alloc] It also ensures that [`static` items] which have a type with interior mutability aren't placed in memory marked as read only. ## `PhantomData` +r[lang-types.phantom-data] + [`std::marker::PhantomData`] is a zero-sized, minimum alignment, type that is considered to own a `T` for the purposes of [variance], [drop check], and [auto traits](#auto-traits). ## Operator Traits +r[lang-types.ops] + The traits in [`std::ops`] and [`std::cmp`] are used to overload [operators], [indexing expressions], and [call expressions]. ## `Deref` and `DerefMut` +r[lang-types.deref] + As well as overloading the unary `*` operator, [`Deref`] and [`DerefMut`] are also used in [method resolution] and [deref coercions]. ## `Drop` +r[lang-types.drop] + The [`Drop`] trait provides a [destructor], to be run whenever a value of this type is to be destroyed. ## `Copy` -The [`Copy`] trait changes the semantics of a type implementing it. Values -whose type implements `Copy` are copied rather than moved upon assignment. +r[lang-types.copy] +r[lang-types.copy.intro] +The [`Copy`] trait changes the semantics of a type implementing it. + +r[lang-types.copy.behavior] +Values whose type implements `Copy` are copied rather than moved upon assignment. + +r[lang-types.copy.constraint] `Copy` can only be implemented for types which do not implement `Drop`, and whose fields are all `Copy`. For enums, this means all fields of all variants have to be `Copy`. For unions, this means all variants have to be `Copy`. 
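A small sketch of the copy-on-assignment behavior, assuming a hypothetical `Point` struct whose fields are all `Copy` and which does not implement `Drop`:

```rust
// Hypothetical type for illustration: every field is `Copy`, and there is
// no `Drop` impl, so `Copy` may be derived.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Assigns `p` to a new binding and returns both; this only compiles
/// because `Point` is `Copy`, so the assignment copies instead of moving.
fn copy_then_use_both(p: Point) -> (Point, Point) {
    let q = p; // `p` is copied here and remains usable
    (p, q)
}
```

Removing `Copy` from the derive list would turn `let q = p;` into a move, and the later use of `p` would no longer compile.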
+r[lang-types.copy.builtin-types] `Copy` is implemented by the compiler for +r[lang-types.copy.tuple] * [Tuples] of `Copy` types + +r[lang-types.copy.fn-pointer] * [Function pointers] + +r[lang-types.copy.fn-item] * [Function items] + +r[lang-types.copy.closure] * [Closures] that capture no values or that only capture values of `Copy` types ## `Clone` +r[lang-types.clone] + +r[lang-types.clone.intro] The [`Clone`] trait is a supertrait of `Copy`, so it also needs compiler -generated implementations. It is implemented by the compiler for the following -types: +generated implementations. + +r[lang-types.clone.builtin-types] +It is implemented by the compiler for the following types: +r[lang-types.clone.builtin-copy] * Types with a built-in `Copy` implementation (see above) + +r[lang-types.clone.tuple] * [Tuples] of `Clone` types + +r[lang-types.clone.closure] * [Closures] that only capture values of `Clone` types or capture no values from the environment ## `Send` +r[lang-types.send] + The [`Send`] trait indicates that a value of this type is safe to send from one thread to another. ## `Sync` +r[lang-types.sync] + +r[lang-types.sync.intro] The [`Sync`] trait indicates that a value of this type is safe to share between -multiple threads. This trait must be implemented for all types used in -immutable [`static` items]. +multiple threads. + +r[lang-types.sync.static-constraint] +This trait must be implemented for all types used in immutable [`static` items]. ## `Termination` +r[lang-types.termination] + The [`Termination`] trait indicates the acceptable return types for the [main function] and [test functions]. ## Auto traits +r[lang-types.auto-traits] + The [`Send`], [`Sync`], [`Unpin`], [`UnwindSafe`], and [`RefUnwindSafe`] traits are _auto traits_. Auto traits have special properties. 
+r[lang-types.auto-traits.auto-impl] If no explicit implementation or negative implementation is written out for an auto trait for a given type, then the compiler implements it automatically according to the following rules: +r[lang-types.auto-traits.builtin-composite] * `&T`, `&mut T`, `*const T`, `*mut T`, `[T; n]`, and `[T]` implement the trait if `T` does. + +r[lang-types.auto-traits.fn-item-pointer] * Function item types and function pointers automatically implement the trait. + +r[lang-types.auto-traits.aggregate] * Structs, enums, unions, and tuples implement the trait if all of their fields do. + +r[lang-types.auto-traits.closure] * Closures implement the trait if the types of all of their captures do. A closure that captures a `T` by shared reference and a `U` by value implements any auto traits that both `&T` and `U` do. +r[lang-types.auto-traits.generic-impl] For generic types (counting the built-in types above as generic over `T`), if a generic implementation is available, then the compiler does not automatically implement it for types that could use the implementation except that they do not @@ -122,6 +201,7 @@ meet the requisite trait bounds. For instance, the standard library implements `Send` for all `&T` where `T` is `Sync`; this means that the compiler will not implement `Send` for `&T` if `T` is `Send` but not `Sync`. +r[lang-types.auto-traits.negative] Auto traits can also have negative implementations, shown as `impl !AutoTrait for T` in the standard library documentation, that override the automatic implementations. For example `*mut T` has a negative implementation of `Send`, @@ -129,15 +209,25 @@ and so `*mut T` is not `Send`, even if `T` is. There is currently no stable way to specify additional negative implementations; they exist only in the standard library. +r[lang-types.auto-traits.trait-object-marker] Auto traits may be added as an additional bound to any [trait object], even though normally only one trait is allowed. 
For instance, `Box` is a valid type. ## `Sized` +r[lang-types.sized] + +r[lang-types.sized.intro] The [`Sized`] trait indicates that the size of this type is known at compile-time; that is, it's not a [dynamically sized type]. + +r[lang-types.sized.implicit-sized] [Type parameters] (except `Self` in traits) are `Sized` by default, as are [associated types]. + +r[lang-types.sized.implicit-impl] `Sized` is always implemented automatically by the compiler, not by [implementation items]. + +r[lang-types.sized.relaxation] These implicit `Sized` bounds may be relaxed by using the special `?Sized` bound. [`Arc`]: std::sync::Arc From a5f4253df4426a7eb394f11b5ab3f334e5ebd4e4 Mon Sep 17 00:00:00 2001 From: Connor Horman Date: Thu, 12 Sep 2024 10:34:27 -0400 Subject: [PATCH 3/4] Fix duplicate `pattern.wildcard` --- src/patterns.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/patterns.md b/src/patterns.md index 600313e52..377b56d3d 100644 --- a/src/patterns.md +++ b/src/patterns.md @@ -354,7 +354,7 @@ let Person { name, ref age } = person; r[pattern.wildcard] -r[pattern.wildcard] +r[pattern.wildcard.syntax] > **Syntax**\ > _WildcardPattern_ :\ >    `_` From 572f23e8644c29f38c78b80d5179baf88e19f101 Mon Sep 17 00:00:00 2001 From: Connor Horman Date: Mon, 23 Sep 2024 13:21:01 -0400 Subject: [PATCH 4/4] Fix whitespace style issues in vmodified chapters --- src/lifetime-elision.md | 2 +- src/names.md | 1 - src/paths.md | 1 - 3 files changed, 1 insertion(+), 3 deletions(-) diff --git a/src/lifetime-elision.md b/src/lifetime-elision.md index 70916c4e4..77a01061c 100644 --- a/src/lifetime-elision.md +++ b/src/lifetime-elision.md @@ -99,7 +99,7 @@ _default object lifetime bound_. These were defined in [RFC 599] and amended in r[lifetime-elision.trait-object.explicit-bound] These default object lifetime bounds are used instead of the lifetime parameter -elision rules defined above when the lifetime bound is omitted entirely. 
+elision rules defined above when the lifetime bound is omitted entirely. r[lifetime-elision.trait-object.explicit-placeholder] If `'_` is used as the lifetime bound then the bound follows the usual elision diff --git a/src/names.md b/src/names.md index 722fa0dbf..a2de804b1 100644 --- a/src/names.md +++ b/src/names.md @@ -131,7 +131,6 @@ r[name.implicit.root] Additionally, the crate root module does not have a name, but can be referred to with certain [path qualifiers] or aliases. - [*Name resolution*]: names/name-resolution.md [*namespaces*]: names/namespaces.md [*paths*]: paths.md diff --git a/src/paths.md b/src/paths.md index c57984d3c..2fc4073c7 100644 --- a/src/paths.md +++ b/src/paths.md @@ -240,7 +240,6 @@ r[path.qualifier.mod-self.restriction] r[path.qualifier.self-pat] In a method body, a path which consists of a single `self` segment resolves to the method's self parameter. - ```rust fn foo() {} fn bar() {