Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distinguish between pseudo-class selectors and interpolation #8

Conversation

kensternberg-authentik
Copy link

Related Links:

Addresses Issue 7: tree-sitter parses complex declaration as pseudo-class instead of CSS custom property.

What

This commit places a guard clause around the scanner expression for detecting a pseudo-class selector colon. The guard clause ensures that the character prior to any { symbol was not the start of interpolation.

I have also re-run the generate under 0.24.5, so the parser, headr files, and generated grammar have been updated to the latest & greatest version. A tree-sitter.json file has also been generated, and the corresponding tree-sitter clause removed from package.json (automatically, by tooling).

Why

This declaration, which has been added to the test cases.

div {
  --#{$x}--left: var(--#{$y}--right);
}

As written, when PSEUDO_CLASS_SELECTOR_COLON is a valid possible symbol, the scanner clause for pseudo-class selector detection passes if a { is detected before encountering a } or ; symbol. The expression #{$y} in the test case meets that criteria, the lexer result is marked incorrectly, and the parser fails to parse the declaration correctly.

Adding a guard that the prior character seen was a # and that the { is not the start of a nested block identified by a pseudo-class but is, instead, part of an interpolation, fixes this issue.

Test steps

  1. Run the tests under the old clause and see this failure:
correct / expected / unexpected

  1. Interpolation on both sides of a declaration:

    (stylesheet
      (rule_set
        (selectors
          (tag_name))
        (block
          (declaration
            (ERROR)
            (property_name
              (identifier)
              (interpolation
                (variable))
              (identifier))
            (call_expression
              (function_name)
              (arguments
                (identifier)
                (interpolation
                  (variable))
                (identifier)))))))

  1. Run it with the new clause and there is no more failure. :-) In fact, all of the tests now pass.

# Related Links:

Addresses [Issue 7: tree-sitter parses complex declaration as pseudo-class instead of CSS custom
property](tree-sitter-grammars#7).

# What

This commit places a guard clause around the scanner expression for detecting a pseudo-class
selector colon.  The guard clause ensures that the character *prior to* any `{` symbol was not the
start of interpolation.

I have also re-run the generate under 0.24.5, so the parser, headr files, and generated grammar have
been updated to the latest & greatest version. A `tree-sitter.json` file has also been generated,
and the corresponding tree-sitter clause removed from `package.json` (automatically, by tooling).

# Why

This declaration, which has been [added to the test cases](./test/corpus/declarations.txt).

```
div {
  --#{$x}--left: var(--#{$y}--right);
}
```

As written, when `PSEUDO_CLASS_SELECTOR_COLON` is a valid possible symbol, the scanner clause for
pseudo-class selector detection passes if a `{` is detected before encountering a `}` or `;` symbol.
The expression `#{$y}` in the test case meets that criteria, the lexer result is marked incorrectly,
and the parser fails to parse the declaration correctly.

Adding a guard that the prior character seen was a `#` and that the `{` is not the start of a nested
block identified by a pseudo-class but is, instead, part of an interpolation, fixes this issue.

# Test steps

1. Run the tests under the old clause and see this failure:

```
correct / expected / unexpected

  1. Interpolation on both sides of a declaration:

    (stylesheet
      (rule_set
        (selectors
          (tag_name))
        (block
          (declaration
            (ERROR)
            (property_name
              (identifier)
              (interpolation
                (variable))
              (identifier))
            (call_expression
              (function_name)
              (arguments
                (identifier)
                (interpolation
                  (variable))
                (identifier)))))))

```

2. Run it with the new clause and there is no more failure.  :-)  In fact, *all* of the tests now
   pass.
@@ -49,16 +49,6 @@
"parse": "tree-sitter parse",
"test": "tree-sitter test"
},
"tree-sitter": [

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change made by tree-sitter generate version 0.24.

while (lexer->lookahead != ';' && lexer->lookahead != '}' && !lexer->eof(lexer)) {
// We need a { to be a pseudo class selector, a ; indicates a
// property
while (lexer->lookahead != ';' && lexer->lookahead != '}' &&

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where the actual fix begins; here I initialize a boolean to track the prior character, record it if necessary, and ignore the problematic { if possible_interpolation is still true. I then remove the possible_interpolation if the follow-on character was not a problematic {.

Man, I hate going back and writing C again.

@@ -0,0 +1,37 @@
{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added by tree-sitter generate with version 0.24.

@@ -19,17 +19,24 @@ void tree_sitter_scss_external_scanner_destroy(void *payload) {}

void tree_sitter_scss_external_scanner_reset(void *payload) {}

unsigned tree_sitter_scss_external_scanner_serialize(void *payload, char *buffer) { return 0; }
unsigned tree_sitter_scss_external_scanner_serialize(void *payload,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this is clang_format and whatever the original code was formatted with disagreeing. The only syntactical change is marked below.

Copy link

@tanberry tanberry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving because I trust Ken with code and my life.

@kensternberg-authentik
Copy link
Author

Approving because I trust Ken with code and my life.

Hah! Thanks! And although this works, trying it out on the Patternfly source code revealed a whole bunch of edge cases, so I'm going to have to withdraw it and try again. It`s tree-sitter vs IBM. "Whose cuisine will reign supreme!?"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants