Skip to content

Commit

Permalink
Auto merge of #119427 - dtolnay:maccall, r=compiler-errors
Browse files Browse the repository at this point in the history
Fix, document, and test parser and pretty-printer edge cases related to braced macro calls

_Review note: this is a deceptively small PR because it comes with 145 lines of docs and 196 lines of tests, and only 25 lines of compiler code changed. However, I recommend reviewing it 1 commit at a time because much of the effect of the code changes is non-local i.e. affecting code that is not visible in the final state of the PR. I have paid attention that reviewing the PR one commit at a time is as easy as I can make it. All of the code you need to know about is touched in those commits, even if some of those changes disappear by the end of the stack._

This is a follow-up to #119105. One case that is not relevant to `-Zunpretty=expanded`, but which came up as I'm porting #119105 and #118726 into `syn`'s printer and `prettyplease`'s printer where it **is** relevant, and is also relevant to rustc's `stringify!`, is statement boundaries in the vicinity of braced macro calls.

Rustc's AST pretty-printer produces invalid syntax for statements that begin with a braced macro call:

```rust
macro_rules! stringify_item {
    ($i:item) => {
        stringify!($i)
    };
}

macro_rules! repro {
    ($e:expr) => {
        stringify_item!(fn main() { $e + 1; })
    };
}

fn main() {
    println!("{}", repro!(m! {}));
}
```

**Before this PR:** output is not valid Rust syntax.

```console
fn main() { m! {} + 1; }
```

```console
error: leading `+` is not supported
 --> <anon>:1:19
  |
1 | fn main() { m! {} + 1; }
  |                   ^ unexpected `+`
  |
help: try removing the `+`
  |
1 - fn main() { m! {} + 1; }
1 + fn main() { m! {}  1; }
  |
```

**After this PR:** valid syntax.

```console
fn main() { (m! {}) + 1; }
```
  • Loading branch information
bors committed May 12, 2024
2 parents ee97564 + 78c8dc1 commit 8cc6f34
Show file tree
Hide file tree
Showing 12 changed files with 517 additions and 57 deletions.
98 changes: 76 additions & 22 deletions compiler/rustc_ast/src/util/classify.rs
Original file line number Diff line number Diff line change
@@ -1,34 +1,88 @@
//! Routines the parser uses to classify AST nodes
// Predicates on exprs and stmts that the pretty-printer and parser use
//! Routines the parser and pretty-printer use to classify AST nodes.
use crate::ast::ExprKind::*;
use crate::{ast, token::Delimiter};

/// Does this expression require a semicolon to be treated
/// as a statement? The negation of this: 'can this expression
/// be used as a statement without a semicolon' -- is used
/// as an early-bail-out in the parser so that, for instance,
/// if true {...} else {...}
/// |x| 5
/// isn't parsed as (if true {...} else {...} | x) | 5
pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
!matches!(
/// This classification determines whether various syntactic positions break out
/// of parsing the current expression (true) or continue parsing more of the
/// same expression (false).
///
/// For example, it's relevant in the parsing of match arms:
///
/// ```ignore (illustrative)
/// match ... {
/// // Is this calling $e as a function, or is it the start of a new arm
/// // with a tuple pattern?
/// _ => $e (
/// ^ )
///
/// // Is this an Index operation, or new arm with a slice pattern?
/// _ => $e [
/// ^ ]
///
/// // Is this a binary operator, or leading vert in a new arm? Same for
/// // other punctuation which can either be a binary operator in
/// // expression or unary operator in pattern, such as `&` and `-`.
/// _ => $e |
/// ^
/// }
/// ```
///
/// If $e is something like `{}` or `if … {}`, then terminate the current
/// arm and parse a new arm.
///
/// If $e is something like `path::to` or `(…)`, continue parsing the same
/// arm.
///
/// *Almost* the same classification is used as an early bail-out for parsing
/// statements. See `expr_requires_semi_to_be_stmt`.
pub fn expr_is_complete(e: &ast::Expr) -> bool {
matches!(
e.kind,
ast::ExprKind::If(..)
| ast::ExprKind::Match(..)
| ast::ExprKind::Block(..)
| ast::ExprKind::While(..)
| ast::ExprKind::Loop(..)
| ast::ExprKind::ForLoop { .. }
| ast::ExprKind::TryBlock(..)
| ast::ExprKind::ConstBlock(..)
If(..)
| Match(..)
| Block(..)
| While(..)
| Loop(..)
| ForLoop { .. }
| TryBlock(..)
| ConstBlock(..)
)
}

/// Does this expression require a semicolon to be treated as a statement?
///
/// The negation of this: "can this expression be used as a statement without a
/// semicolon" -- is used as an early bail-out when parsing statements so that,
/// for instance,
///
/// ```ignore (illustrative)
/// if true {...} else {...}
/// |x| 5
/// ```
///
/// isn't parsed as `(if true {...} else {...} | x) | 5`.
///
/// Surprising special case: even though braced macro calls like `m! {}`
/// normally do not introduce a boundary when found at the head of a match arm,
/// they do terminate the parsing of a statement.
///
/// ```ignore (illustrative)
/// match ... {
/// _ => m! {} (), // macro that expands to a function, which is then called
/// }
///
/// let _ = { m! {} () }; // macro call followed by unit
/// ```
pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
match &e.kind {
MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace,
_ => !expr_is_complete(e),
}
}

/// If an expression ends with `}`, returns the innermost expression ending in the `}`
pub fn expr_trailing_brace(mut expr: &ast::Expr) -> Option<&ast::Expr> {
use ast::ExprKind::*;

loop {
match &expr.kind {
AddrOf(_, _, e)
Expand Down
2 changes: 1 addition & 1 deletion compiler/rustc_ast_pretty/src/pprust/state/expr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -780,7 +780,7 @@ impl<'a> State<'a> {
}
_ => {
self.end(); // Close the ibox for the pattern.
self.print_expr(body, FixupContext::new_stmt());
self.print_expr(body, FixupContext::new_match_arm());
self.word(",");
}
}
Expand Down
57 changes: 52 additions & 5 deletions compiler/rustc_ast_pretty/src/pprust/state/fixup.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,6 +49,38 @@ pub(crate) struct FixupContext {
/// No parentheses required.
leftmost_subexpression_in_stmt: bool,

/// Print expression such that it can be parsed as a match arm.
///
/// This is almost equivalent to `stmt`, but the grammar diverges a tiny bit
/// between statements and match arms when it comes to braced macro calls.
/// Macro calls with brace delimiter terminate a statement without a
/// semicolon, but do not terminate a match-arm without comma.
///
/// ```ignore (illustrative)
/// m! {} - 1; // two statements: a macro call followed by -1 literal
///
/// match () {
/// _ => m! {} - 1, // binary subtraction operator
/// }
/// ```
match_arm: bool,

/// This is almost equivalent to `leftmost_subexpression_in_stmt`, other
/// than for braced macro calls.
///
/// If we have `m! {} - 1` as an expression, the leftmost subexpression
/// `m! {}` will need to be parenthesized in the statement case but not the
/// match-arm case.
///
/// ```ignore (illustrative)
/// (m! {}) - 1; // subexpression needs parens
///
/// match () {
/// _ => m! {} - 1, // no parens
/// }
/// ```
leftmost_subexpression_in_match_arm: bool,

/// This is the difference between:
///
/// ```ignore (illustrative)
Expand All @@ -68,6 +100,8 @@ impl Default for FixupContext {
FixupContext {
stmt: false,
leftmost_subexpression_in_stmt: false,
match_arm: false,
leftmost_subexpression_in_match_arm: false,
parenthesize_exterior_struct_lit: false,
}
}
Expand All @@ -76,13 +110,16 @@ impl Default for FixupContext {
impl FixupContext {
/// Create the initial fixup for printing an expression in statement
/// position.
///
/// This is currently also used for printing an expression as a match-arm,
/// but this is incorrect and leads to over-parenthesizing.
pub fn new_stmt() -> Self {
FixupContext { stmt: true, ..FixupContext::default() }
}

/// Create the initial fixup for printing an expression as the right-hand
/// side of a match arm.
pub fn new_match_arm() -> Self {
FixupContext { match_arm: true, ..FixupContext::default() }
}

/// Create the initial fixup for printing an expression as the "condition"
/// of an `if` or `while`. There are a few other positions which are
/// grammatically equivalent and also use this, such as the iterator
Expand All @@ -106,6 +143,9 @@ impl FixupContext {
FixupContext {
stmt: false,
leftmost_subexpression_in_stmt: self.stmt || self.leftmost_subexpression_in_stmt,
match_arm: false,
leftmost_subexpression_in_match_arm: self.match_arm
|| self.leftmost_subexpression_in_match_arm,
..self
}
}
Expand All @@ -119,7 +159,13 @@ impl FixupContext {
/// example the `$b` in `$a + $b` and `-$b`, but not the one in `[$b]` or
/// `$a.f($b)`.
pub fn subsequent_subexpression(self) -> Self {
FixupContext { stmt: false, leftmost_subexpression_in_stmt: false, ..self }
FixupContext {
stmt: false,
leftmost_subexpression_in_stmt: false,
match_arm: false,
leftmost_subexpression_in_match_arm: false,
..self
}
}

/// Determine whether parentheses are needed around the given expression to
Expand All @@ -128,7 +174,8 @@ impl FixupContext {
/// The documentation on `FixupContext::leftmost_subexpression_in_stmt` has
/// examples.
pub fn would_cause_statement_boundary(self, expr: &Expr) -> bool {
self.leftmost_subexpression_in_stmt && !classify::expr_requires_semi_to_be_stmt(expr)
(self.leftmost_subexpression_in_stmt && !classify::expr_requires_semi_to_be_stmt(expr))
|| (self.leftmost_subexpression_in_match_arm && classify::expr_is_complete(expr))
}

/// Determine whether parentheses are needed around the given `let`
Expand Down
27 changes: 27 additions & 0 deletions compiler/rustc_lint/src/unused.rs
Original file line number Diff line number Diff line change
Expand Up @@ -677,6 +677,33 @@ trait UnusedDelimLint {
}

// Check if LHS needs parens to prevent false-positives in cases like `fn x() -> u8 { ({ 0 } + 1) }`.
//
// FIXME: https://github.com/rust-lang/rust/issues/119426
// The syntax tree in this code is from after macro expansion, so the
// current implementation has both false negatives and false positives
// related to expressions containing macros.
//
// macro_rules! m1 {
// () => {
// 1
// };
// }
//
// fn f1() -> u8 {
// // Lint says parens are not needed, but they are.
// (m1! {} + 1)
// }
//
// macro_rules! m2 {
// () => {
// loop { break 1; }
// };
// }
//
// fn f2() -> u8 {
// // Lint says parens are needed, but they are not.
// (m2!() + 1)
// }
{
let mut innermost = inner;
loop {
Expand Down
34 changes: 29 additions & 5 deletions compiler/rustc_parse/src/parser/expr.rs
Original file line number Diff line number Diff line change
Expand Up @@ -497,8 +497,7 @@ impl<'a> Parser<'a> {

/// Checks if this expression is a successfully parsed statement.
fn expr_is_complete(&self, e: &Expr) -> bool {
self.restrictions.contains(Restrictions::STMT_EXPR)
&& !classify::expr_requires_semi_to_be_stmt(e)
self.restrictions.contains(Restrictions::STMT_EXPR) && classify::expr_is_complete(e)
}

/// Parses `x..y`, `x..=y`, and `x..`/`x..=`.
Expand Down Expand Up @@ -2691,8 +2690,33 @@ impl<'a> Parser<'a> {
let first_tok_span = self.token.span;
match self.parse_expr() {
Ok(cond)
// If it's not a free-standing expression, and is followed by a block,
// then it's very likely the condition to an `else if`.
// Try to guess the difference between a "condition-like" vs
// "statement-like" expression.
//
// We are seeing the following code, in which $cond is neither
// ExprKind::Block nor ExprKind::If (the 2 cases wherein this
// would be valid syntax).
//
// if ... {
// } else $cond
//
// If $cond is "condition-like" such as ExprKind::Binary, we
// want to suggest inserting `if`.
//
// if ... {
// } else if a == b {
// ^^
// }
//
// If $cond is "statement-like" such as ExprKind::While then we
// want to suggest wrapping in braces.
//
// if ... {
// } else {
// ^
// while true {}
// }
// ^
if self.check(&TokenKind::OpenDelim(Delimiter::Brace))
&& classify::expr_requires_semi_to_be_stmt(&cond) =>
{
Expand Down Expand Up @@ -3136,7 +3160,7 @@ impl<'a> Parser<'a> {
err
})?;

let require_comma = classify::expr_requires_semi_to_be_stmt(&expr)
let require_comma = !classify::expr_is_complete(&expr)
&& this.token != token::CloseDelim(Delimiter::Brace);

if !require_comma {
Expand Down
31 changes: 31 additions & 0 deletions tests/ui/lint/lint-unnecessary-parens.fixed
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,28 @@ pub fn parens_with_keyword(e: &[()]) -> i32 {
macro_rules! baz {
($($foo:expr),+) => {
($($foo),*)
};
}

macro_rules! unit {
() => {
()
};
}

struct One;

impl std::ops::Sub<One> for () {
type Output = i32;
fn sub(self, _: One) -> Self::Output {
-1
}
}

impl std::ops::Neg for One {
type Output = i32;
fn neg(self) -> Self::Output {
-1
}
}

Expand Down Expand Up @@ -94,4 +116,13 @@ fn main() {

let _a = baz!(3, 4);
let _b = baz!(3);

let _ = {
unit!() - One //~ ERROR unnecessary parentheses around block return value
} + {
unit![] - One //~ ERROR unnecessary parentheses around block return value
} + {
// FIXME: false positive. This parenthesis is required.
unit! {} - One //~ ERROR unnecessary parentheses around block return value
};
}
31 changes: 31 additions & 0 deletions tests/ui/lint/lint-unnecessary-parens.rs
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,28 @@ pub fn parens_with_keyword(e: &[()]) -> i32 {
macro_rules! baz {
($($foo:expr),+) => {
($($foo),*)
};
}

macro_rules! unit {
() => {
()
};
}

struct One;

impl std::ops::Sub<One> for () {
type Output = i32;
fn sub(self, _: One) -> Self::Output {
-1
}
}

impl std::ops::Neg for One {
type Output = i32;
fn neg(self) -> Self::Output {
-1
}
}

Expand Down Expand Up @@ -94,4 +116,13 @@ fn main() {

let _a = baz!(3, 4);
let _b = baz!(3);

let _ = {
(unit!() - One) //~ ERROR unnecessary parentheses around block return value
} + {
(unit![] - One) //~ ERROR unnecessary parentheses around block return value
} + {
// FIXME: false positive. This parenthesis is required.
(unit! {} - One) //~ ERROR unnecessary parentheses around block return value
};
}
Loading

0 comments on commit 8cc6f34

Please sign in to comment.