Changes to parser to pass all fixture/good tests #32

mladedav · 2022-01-30T16:57:21Z

There are a lot of changes to the parsing, and I admit some of them are not ideal, but they have precedent in the data. I am kind of stumped that Gherkin does not have some kind of official regex, and that even the specification itself does not clearly state everything that should be parsed, but I know I'm barking up the wrong tree here.

This includes changes in #31.

To pick a few of the worst things: the Gherkin documentation says about Description, "You can write anything you like, as long as no line starts with a keyword." That rule, however, breaks the current smoke2 test, which starts feature description lines with * and but. A case could be made that each description (and there can be descriptions for Feature, Example/Scenario, Background, Scenario Outline and Rule) could be made context-aware in the sense that only the subset of keywords that may follow the given described keyword would be considered as ending the description.
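A rough sketch of what such context-awareness could look like, outside of any particular parser generator. The `Described` enum, the `terminators` function and the keyword sets below are hypothetical and chosen only for illustration; they are not the grammar from this PR and not an official Gherkin rule.

```rust
/// The construct a description belongs to (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Described {
    Feature,
    Rule,
    Background,
    Scenario,
}

/// Keywords assumed to be allowed right after a description in each context.
/// These sets are an assumption for the sketch, English only.
fn terminators(ctx: Described) -> &'static [&'static str] {
    match ctx {
        Described::Feature => &[
            "Rule:",
            "Background:",
            "Scenario:",
            "Example:",
            "Scenario Outline:",
        ],
        Described::Rule => &["Background:", "Scenario:", "Example:", "Scenario Outline:"],
        Described::Background | Described::Scenario => {
            &["Given ", "When ", "Then ", "And ", "But ", "* "]
        }
    }
}

/// A line ends the description only if it starts with a keyword that may
/// follow the described construct; any other line is still description text.
fn ends_description(ctx: Described, line: &str) -> bool {
    let line = line.trim_start();
    terminators(ctx).iter().any(|kw| line.starts_with(*kw))
}
```

With these assumed sets, a feature description line such as `* some note` or `But also this` stays part of the description, while the same lines would end a scenario description.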

Another sub-par thing is that an empty file (or a file with just comments) is considered a valid feature file, for which I had to return a default feature, i.e. no rules, no steps, empty name. Returning Option<Feature> would probably be better, but it would require non-trivial changes here as well as changes to downstream crates (i.e. cucumber).
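To make the trade-off concrete, here is a minimal sketch of the two API shapes. The `Feature` struct, `parse_nonempty`, `parse_defaulting` and `parse_optional` are hypothetical stand-ins, not the crate's real types or functions.

```rust
/// Hypothetical, heavily reduced feature type.
#[derive(Debug, Default, PartialEq)]
struct Feature {
    name: String,
    scenarios: Vec<String>,
}

/// True when the input holds nothing but blank lines and `#` comments.
fn is_blank(input: &str) -> bool {
    input
        .lines()
        .map(str::trim)
        .all(|l| l.is_empty() || l.starts_with('#'))
}

/// Stand-in for the real grammar: only extracts the `Feature:` name.
fn parse_nonempty(input: &str) -> Feature {
    let name = input
        .lines()
        .find_map(|l| l.trim().strip_prefix("Feature:"))
        .unwrap_or("")
        .trim()
        .to_owned();
    Feature { name, scenarios: Vec::new() }
}

/// Shape described above: an empty file yields a default feature.
fn parse_defaulting(input: &str) -> Feature {
    if is_blank(input) {
        Feature::default()
    } else {
        parse_nonempty(input)
    }
}

/// Alternative shape: an empty file yields `None`, so callers must handle it.
fn parse_optional(input: &str) -> Option<Feature> {
    if is_blank(input) {
        None
    } else {
        Some(parse_nonempty(input))
    }
}
```

The second shape lets callers tell "no feature at all" apart from "a feature with an empty name", at the cost of changing the return type for every downstream user.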

Also steps may now start with a keyword without a space, e.g. Butterscotch would now be a valid step parsed as But: terscotch. This is because of a fixture with the emoji language. It turns out people using emojis seem to not use spaces and I didn't feel like introducing logic to require spaces only for this language.

I kind of like the experience with peg I'm getting, so I would love to help some more (maybe getting rust cucumber to be a semi-official implementation), so if there's anything I can help with, I would love to know.

ilslv · 2022-01-31T07:33:36Z

@mladedav

I am kind of stumped that Gherkin does not have some kind of official regex, and that even the specification itself does not clearly state everything that should be parsed, but I know I'm barking up the wrong tree here.

Yeah, that's really unfortunate, we suffer from this all the time 😔

as per gherkin documentation about Description: You can write anything you like, as long as no line starts with a keyword.

To be fair, I don't really trust the Gherkin reference on cucumber.io, as it's pretty outdated and skips some caveats for simplicity. That's OK for general users, but it doesn't work for implementors.

A case could be made that each description (and there can be descriptions for Feature, Example/Scenario, Background, Scenario Outline and Rule) could be made context-aware in the sense that only the subset of keywords that may follow the given described keyword would be considered as ending the description.

I like that way more. So instead of using a general-purpose description() parsing rule for all descriptions, we should make a special parsing rule for each one.

Another sub-par thing is that an empty file (or a file with just comments) is considered a valid feature file, for which I had to return a default feature, i.e. no rules, no steps, empty name. Returning Option<Feature> would probably be better, but it would require non-trivial changes here as well as changes to downstream crates (i.e. cucumber).

👍

Also steps may now start with a keyword without a space, e.g. Butterscotch would now be a valid step parsed as But: terscotch. This is because of a fixture with the emoji language. It turns out people using emojis seem to not use spaces and I didn't feel like introducing logic to require spaces only for this language.

Actually, I don't think you are right. The way this is handled is that space is a part of keyword. Emoji language definition clearly shows that * has a trailing space and 🙏 doesn't

src/parser.rs

mladedav · 2022-01-31T10:08:15Z

Actually, I don't think you are right. The way this is handled is that space is a part of keyword. Emoji language definition clearly shows that * has a trailing space and 🙏 doesn't

You're right. The reason this did not work is that this repo has a possibly outdated languages.json without the spaces. With them, this works properly.
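For reference, a small sketch of why keeping the space inside the keyword is enough. `split_step` and the abbreviated `KEYWORDS` list are hypothetical; the list only mixes the `But ` example from above with the two emoji-language entries mentioned in this thread.

```rust
/// Abbreviated keyword list for illustration; the trailing space, where
/// present, is part of the keyword itself.
const KEYWORDS: &[&str] = &["But ", "* ", "🙏"];

/// Splits a step line into (keyword, rest) by plain prefix matching.
/// No special spacing logic is needed: "But " simply is not a prefix of
/// "Butterscotch", while "🙏" matches with no space after it.
fn split_step<'k, 'a>(keywords: &[&'k str], line: &'a str) -> Option<(&'k str, &'a str)> {
    let line = line.trim_start();
    keywords
        .iter()
        .find_map(|kw| line.strip_prefix(*kw).map(|rest| (*kw, rest)))
}
```

With a keyword list that lacks the spaces (for example `&["But"]`), the same function would split `Butterscotch` into `But` and `terscotch`, which matches the behaviour described earlier.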

mladedav · 2022-01-31T14:38:25Z

I added the description excludes per keyword. Maybe some could be merged into a single function (e.g. scenario and scenario outline), but I decided to play it safe. I also added just the ones that seemed logical (e.g. a scenario has to have steps; the data seem to confirm that, but I would not be surprised if it weren't true).

I had some trouble with the borrow checker (as I do in virtually every pull request I open) and I ended up cloning the keywords iterable in the keyword rule because of lifetimes. If there is another way to do this, that would be great. The only thing that came to mind was generating arrays of keywords to be skipped through build.rs, the same way the Keyword structures are made, but that felt too cumbersome.

I also noticed that this should fix #10.

I will hopefully look into validating the parsing results against the fixtures. As it is now, the parser could theoretically just parse the top-level Feature keyword and treat everything else as description, and we would never know (except for the parts that are tested manually). Until that is done, some of these changes could be considered a bit dangerous.
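A toy illustration of that risk, with made-up names (`Summary`, `lazy_parse`, `counting_parse`, `FIXTURE`) and none of the real parser or fixture files: a check that only asks "did it parse?" passes for a degenerate parser, while a structural check does not.

```rust
/// Minimal structural summary of a parsed feature (hypothetical).
#[derive(Debug, PartialEq)]
struct Summary {
    scenarios: usize,
    steps: usize,
}

/// Degenerate parser: accepts `Feature:` and swallows the rest as description.
fn lazy_parse(input: &str) -> Option<Summary> {
    input
        .trim_start()
        .starts_with("Feature:")
        .then(|| Summary { scenarios: 0, steps: 0 })
}

/// Slightly more honest stand-in that counts scenario and step lines.
fn counting_parse(input: &str) -> Option<Summary> {
    if !input.trim_start().starts_with("Feature:") {
        return None;
    }
    let step_keywords = ["Given ", "When ", "Then ", "And ", "But ", "* "];
    let mut summary = Summary { scenarios: 0, steps: 0 };
    for line in input.lines().map(str::trim) {
        if line.starts_with("Scenario:") {
            summary.scenarios += 1;
        } else if step_keywords.iter().any(|kw| line.starts_with(*kw)) {
            summary.steps += 1;
        }
    }
    Some(summary)
}

/// Inline stand-in for a fixture file.
const FIXTURE: &str = "Feature: Minimal\n\n  Scenario: one\n    Given a step\n    Then a result\n";
```

Both parsers return `Some(..)` for `FIXTURE`, so a success-only test cannot tell them apart; comparing against the expected `Summary { scenarios: 1, steps: 2 }` can.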

tyranron · 2022-02-16T08:25:14Z

ping @mladedav

Are you going to continue working on this, or should we pick it up?

mladedav · 2022-02-21T13:12:08Z

Sorry I didn't get back to you on this sooner.

I plan to finish this; I hope to get some work done on it during the week.

mladedav · 2022-03-13T02:03:17Z

Sorry it took longer than I had hoped, but I am fairly happy with the extent to which the AST is checked against the fixtures. It uncovered a few bugs concerning which keywords can be used in a description, so I guess it was worthwhile.

The AST parsing checks just backgrounds, scenarios, and rules. There are no checks for stuff like tags.

ilslv · 2022-03-14T05:40:07Z

@mladedav so this PR is ready for review?

mladedav · 2022-03-14T12:02:13Z

Yes, it is ready.

tyranron

@mladedav thanks 🍻

tyranron · 2022-03-28T16:12:55Z

@mladedav released in 0.12.0.

tyranron · 2022-03-29T12:07:28Z

@mladedav released in cucumber 0.13.0.

ilslv reviewed Jan 31, 2022


ilslv added the enhancement Improvement of existing features or bugfix label Jan 31, 2022

ilslv added this to the 0.12.0 milestone Jan 31, 2022

ilslv assigned mladedav Jan 31, 2022

Update languages.json

c7b5c93

mladedav force-pushed the feature/fixture branch 2 times, most recently from 77cd3ce to 9601deb Compare January 31, 2022 14:29

mladedav marked this pull request as draft January 31, 2022 14:40

mladedav added 3 commits March 14, 2022 08:35

Change parser to accomodate for all fixture good data

73e1c32

Add fixture tests and ast parsing

bbc2b20

Add support for multiple features and small fixes to grammar

22814e3

mladedav force-pushed the feature/fixture branch from 47f830d to 22814e3 Compare March 14, 2022 07:35

mladedav marked this pull request as ready for review March 14, 2022 07:36

tyranron requested a review from ilslv March 14, 2022 08:06

ilslv added 2 commits March 25, 2022 14:34

Merge branch 'main' into feature/fixture

2666cef

Corrections

0402f8f

ilslv approved these changes Mar 28, 2022

ilslv requested a review from tyranron March 28, 2022 08:47

Minor corrections

df9df12

tyranron approved these changes Mar 28, 2022

tyranron mentioned this pull request Mar 28, 2022

Feature/Scenario description can only be 1 line #10

Closed

tyranron merged commit b19fa8d into cucumber-rs:main Mar 28, 2022
