Inconsistent parsing of indented bullet lists/unordered lists #107

Viir · 2021-09-22T12:43:39Z

I was surprised by how version 7.0.0 maps this markdown string:

## elm-markdown 7.0.0

   + 0
   + 1
   + 2
   + 4

   - 0
   - 1
   - 2
   - 3

Here is a screenshot of parsed and rendered, based on one of the example ellies:

https://ellie-app.com/fnbWFCZhzyTa1

There are three spaces in front of each list item. Is it desirable/intended to have this distinction between minus/plus character?

The text was updated successfully, but these errors were encountered:

Viir · 2021-09-22T12:49:19Z

Reading about the goals of this markdown package in the readme, I guess we would look at https://github.github.com/gfm/#lists for a specification.

Since that is GitHub Flavored Markown, a faster way seems to add that string literally here to see how it renders in the comment:

elm-markdown 7.0.0

0
1
2
4

0
1
2
3

The rendering in the GitHub issue above looks like there is no difference between using minus and plus character for this list.

Viir · 2021-09-22T13:14:03Z

I went searching for a change in the program code that would fix this inconsistency.
I'd start that search by adding a test for the scenario from this thread.
But running elm-test-rs on 1b0b3dc (from the master branch) produces this error, no test results at all:

Running 1 tests. To reproduce these results later,
run elm-test-rs with --seed 3729730344 and --fuzz 100


    A test group contains multiple tests named 'list parsing'. Do some renaming so that tests have unique names.


TEST RUN FAILED

Duration: 1 ms
Passed:   0
Failed:   1

Viir · 2021-09-22T13:18:51Z

I found the test modules call them 'unordered lists' (not 'bullet lists'), so maybe the issue title is not ideal yet. 🤔

Viir · 2021-09-22T13:26:57Z

If the indent before the list items is to be discarded, following might be a appropriate implementation for an automated test:

        , test "basic indented list with '+'" <|
            \() ->
                String.trimLeft """
   + Item 1
   + Item 2
   + Item 3
"""
                    |> Markdown.Parser.parse
                    |> Expect.equal
                        (Ok
                            [ UnorderedList Tight
                                [ ListItem NoTask [ Paragraph [ Text "Item 1" ] ]
                                , ListItem NoTask [ Paragraph [ Text "Item 2" ] ]
                                , ListItem NoTask [ Paragraph [ Text "Item 3" ] ]
                                ]
                            ]
                        )

This test yields following output with elm-test-rs:

Running 1 tests. To reproduce these results later,
run elm-test-rs with --seed 1296081860 and --fuzz 100

↓ list parsing
✗ basic indented list with '+'

    Ok [UnorderedList Tight [ListItem NoTask [Paragraph [Text "Item 1\n + Item 2\n + Item 3"]]]]
    ╷
    │ Expect.equal
    ╵
    Ok [UnorderedList Tight [ListItem NoTask [Paragraph [Text "Item 1"]],ListItem NoTask [Paragraph [Text "Item 2"]],ListItem NoTask [Paragraph [Text "Item 3"]]]]


TEST RUN FAILED

Duration: 8 ms
Passed:   0
Failed:   1

Viir · 2021-09-22T13:33:33Z

I don't have a lot of experience with parsing, but rummaging through the parsing code, I stumbled on this function that looks like a good candidate for experimenting with a change:

elm-markdown/src/Markdown/UnorderedList.elm

Lines 38 to 48 in 1c0ca65

    
           unorderedListMarkerParser : Parser UnorderedListMarker 
        
           unorderedListMarkerParser = 
        
               oneOf 
        
                   [ succeed Minus 
        
                       |. Extra.upTo 3 Whitespace.space 
        
                       |. Advanced.symbol (Advanced.Token "-" (Parser.ExpectingSymbol "-")) 
        
                   , succeed Plus 
        
                       |. Advanced.symbol (Advanced.Token "+" (Parser.ExpectingSymbol "+")) 
        
                   , succeed Asterisk 
        
                       |. Advanced.symbol (Advanced.Token "*" (Parser.ExpectingSymbol "*")) 
        
                   ]

I tried this variant:

unorderedListMarkerParser : Parser UnorderedListMarker
unorderedListMarkerParser =
    oneOf
        [ succeed Minus
            |. Extra.upTo 3 Whitespace.space
            |. Advanced.symbol (Advanced.Token "-" (Parser.ExpectingSymbol "-"))
        , succeed Plus
            |. Extra.upTo 3 Whitespace.space
            |. Advanced.symbol (Advanced.Token "+" (Parser.ExpectingSymbol "+"))
        , succeed Asterisk
            |. Extra.upTo 3 Whitespace.space
            |. Advanced.symbol (Advanced.Token "*" (Parser.ExpectingSymbol "*"))
        ]

Guess that would have been too easy. I see no change in the test result.

My next guess is something else consumes that part with a higher priority.

Viir · 2021-09-22T14:29:01Z

Taking a break helped. I found an error in the test code. This one looks better:

        , test "basic list with '+' indented by 3" <|
            \() ->
                """
   + Item 1
   + Item 2
   + Item 3
"""
                    |> Markdown.Parser.parse
                    |> Expect.equal
                        (Ok
                            [ UnorderedList Tight
                                [ ListItem NoTask [ Paragraph [ Text "Item 1" ] ]
                                , ListItem NoTask [ Paragraph [ Text "Item 2" ] ]
                                , ListItem NoTask [ Paragraph [ Text "Item 3" ] ]
                                ]
                            ]
                        )

Viir · 2021-09-22T14:43:48Z

With more experimenting, I learned that I need to add backtracking here. This variant lets the test pass:

unorderedListMarkerParser : Parser UnorderedListMarker
unorderedListMarkerParser =
    oneOf
        [ backtrackable
            (succeed Minus
                |. Extra.upTo 3 Whitespace.space
                |. Advanced.symbol (Advanced.Token "-" (Parser.ExpectingSymbol "-"))
            )
        , succeed Plus
            |. Extra.upTo 3 Whitespace.space
            |. Advanced.symbol (Advanced.Token "+" (Parser.ExpectingSymbol "+"))
        , succeed Asterisk
            |. Extra.upTo 3 Whitespace.space
            |. Advanced.symbol (Advanced.Token "*" (Parser.ExpectingSymbol "*"))
        ]

Fixes issue dillonkearns#107

Viir changed the title ~~Case of inconsistent list parsing?~~ Inconsistent handling of unordered indented lists Sep 22, 2021

Viir changed the title ~~Inconsistent handling of unordered indented lists~~ Inconsistent handling of indented bullet lists Sep 22, 2021

Viir changed the title ~~Inconsistent handling of indented bullet lists~~ Inconsistent handling of indented bullet/unordered lists Sep 22, 2021

Viir changed the title ~~Inconsistent handling of indented bullet/unordered lists~~ Inconsistent handling of indented bullet lists/unordered lists Sep 22, 2021

Viir changed the title ~~Inconsistent handling of indented bullet lists/unordered lists~~ Inconsistent parsing of indented bullet lists/unordered lists Sep 22, 2021

Viir added a commit to Viir/elm-markdown that referenced this issue Sep 22, 2021

Fix parsing of indented unordered lists to match GFM

1ab7a00

Fixes issue dillonkearns#107

Viir mentioned this issue Sep 22, 2021

Fix parsing of indented unordered lists to match GFM #108

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistent parsing of indented bullet lists/unordered lists #107

Inconsistent parsing of indented bullet lists/unordered lists #107

Viir commented Sep 22, 2021 •

edited

Loading

Viir commented Sep 22, 2021 •

edited

Loading

Viir commented Sep 22, 2021

Viir commented Sep 22, 2021

Viir commented Sep 22, 2021 •

edited

Loading

Viir commented Sep 22, 2021 •

edited

Loading

Viir commented Sep 22, 2021

Viir commented Sep 22, 2021

Inconsistent parsing of indented bullet lists/unordered lists #107

Inconsistent parsing of indented bullet lists/unordered lists #107

Comments

Viir commented Sep 22, 2021 • edited Loading

Viir commented Sep 22, 2021 • edited Loading

elm-markdown 7.0.0

Viir commented Sep 22, 2021

Viir commented Sep 22, 2021

Viir commented Sep 22, 2021 • edited Loading

Viir commented Sep 22, 2021 • edited Loading

Viir commented Sep 22, 2021

Viir commented Sep 22, 2021

Viir commented Sep 22, 2021 •

edited

Loading

Viir commented Sep 22, 2021 •

edited

Loading

Viir commented Sep 22, 2021 •

edited

Loading

Viir commented Sep 22, 2021 •

edited

Loading