Inline HTML Code Blocks #50

hayesgm · 2020-07-13T05:49:44Z

I'm trying to use the extensible aspect of elm-markdown to add a  tag without forking the library.

For example:

Hello <sup>world</sup>!

This is the standard GFM style,^{or so I've been told}. I added a custom HTML extension as such:

htmlRender : Markdown.Html.Renderer (List (Html msg) -> Html msg)
htmlRender =
    Markdown.Html.tag "sup"
        (\renderedChildren ->
            Html.sup [] renderedChildren
        )

However, this is being parsed and rendered as:

Paragraph [Text "Hello"],
HtmlBlock (HtmlElement "sup" [] [Paragraph [Text "world"]]),
Paragraph [Text "!"]

<p>Hello</p>
<sup><p>world</p></sup>
<p>!</p>

I am not sure this falls under standard GFM failures as it depends largely on the specifically extensibility features of this library. My expectation is that this would have been parsed and rendered as:

Paragraph
  [ Text "Hello"
  , HtmlElement "sup" [] [Text "world"]
  , Text "!"
  ]

<p>Hello <sup>world</sup>!</p>

However, it appears in the current parsing data types (Markdown.Block), this data cannot be represented directly (e.g. we cannot nest an HtmlBlock below a Paragraph). Would the right approach here be to fork and add sup directly to the parser? At the end of the day, this limitation makes adding new extensible features a little difficult since they must always be top-level blocks.

Thanks for making an awesome, pure Elm parser!

The text was updated successfully, but these errors were encountered:

dillonkearns · 2020-07-14T01:11:31Z

Hello! It's probably worth clarifying what the defaultHtmlRenderer does in this case and how it compares to standard GFM.

My initial thought is that it could take in an explicit set of html handlers. Right now, it doesn't accept any HTML tags. But that's just that particular function, the library itself is fully extensible in you can handle HTML tags. Here's an example that builds off of the defaultHtmlRenderer and adds a  tag.

https://ellie-app.com/9qnwgJFwWLha1

Does that do what you're looking for? Let me know how that goes. I think it's worth documenting which HTML tags (if any) are rendered in the default HTML renderer, and considering adding an explicit argument of HTML handlers, or perhaps accepted tags to render directly as HTML.

hayesgm · 2020-07-14T02:24:14Z

Yes, that sample is similar to what I had been building, and my only real concern there is that it renders as <ul><li>22</li></ul>, and that extra Html.p tag is impossible to remove (since you can't effectively edit Elm's HTML msg types as the constructors are not exposed to pattern match over, as far as I know). The solution may just be to make sure the css treats the paragraph as an inline element.

Additionally, I think I surfaced another issue diagnosing this one. Specifically, if I wrap this ul in a custom <section> tag and that ul list items contain a custom HTML tag (e.g. ), we end up rendering two separate uls. For example, with the same Elm code, adding a section tag like below:

https://ellie-app.com/9qpztJSgwhWa1

<section>
* 1
* 2<sup>2</sup>
* 2
</section>

This unexpectedly renders to:

<section>
  <ul>
    <li>1</li>
    <li>2</li>
  </ul>
  <sup><p>2</p></sup>
  <ul>
    <li>2</li>
  </ul>
</section>

Notice the two ul tags. The expectation is that we should have one ul with three list items. The parser, in fact, produces two UnorderedLists when it should only have produced one in this case.

By the way, thanks for your quick reply, and happy to help dive in to solve these issues!

hayesgm · 2020-07-14T08:06:50Z

Digging into this a bit, when we parse a node like section in HtmlParser, we end up breaking out the children as such:

Element "section" [] [Text "\n* 1", Text "\n\n* 2",Element "sup" [] [Text "2"],Text "\n* 3\n\n"]

Since that element breaks the unordered list when parsing blocks, whereas usually those elements would be a single unordered list and the html would be parsed as an inline.

The solution would be to follow the GFM Spec 4.6 Subsection 6 "end condition" and break the HTML parsing after a blank line (as demonstrated in Example 118 which is failing in the current specs). That said, as you prefer good error handling instead of full correctness, I'm not sure if that approach is what you want (since we would do a worse job at explaining mismatched tags, etc). Or, alternatively, we could continue parsing full HTML but recombine the nodes back into larger text nodes after the HTML parse step. Or we could punt on this issue entirely.

For what it's worth, this comes up in my project since I want to wrap large <section>s with custom HTML. The problem is that this is causing parsing to break in strange ways (like this example).

supermario · 2020-08-05T07:16:57Z

I've run into this same issue, in my case meaning I'm unable to use this package as I need lots of rich inline HTML.

Example:

Becomes

Note the way strong/** and emphasis/_ parsers works fine – I've tried to use them instead to achieve the styling I need but unfortunately there are only a few and no way to define more it seems 😅

dillonkearns · 2020-08-05T15:08:14Z

Yeah, it's been on my radar that I'll need some way to take in raw text. Or at least that's one solution.

Current Behavior

For example, the current behavior is :

Markdown.Html.tag "sup"
                (\renderedChildren -> Html.sup [] renderedChildren)
            ]

<p>2<sup><p>2</p></sup></p>

Option 1

We could add a function that gives you direct access to the raw text within the HTML, like this:

Markdown.Html.tagWithRawText "sup"
                (\rawText -> Html.sup [] [ Html.text rawText ])
            ]

<p>2<sup>2</sup></p>

Option 2

I guess the other way to approach it could be to not wrap the rendered children in a paragraph tag.

Markdown.Html.tag "sup"
                (\renderedChildren -> Html.sup [] renderedChildren)
            ]

<p>2<sup>2</sup></p>

There are a lot of implications to think through for that case. It may turn out that the semantics are more intuitive and fit into the spec better. Or it may be that Option 2 isn't viable because it conflicts with the Markdown spec.

I'd be curious to hear thoughts on that, though. Any other possibilities? Any initial impressions of those two options?

supermario · 2020-08-05T18:24:32Z

@dillonkearns not sure if I'm just not understanding your proposed options, but it seems to me that nested tags should be parsed within their context, rather than always breaking out to the top level.

I think @hayesgm articulated it nicely in his first post – it seems that any parsed HtmlBlock lifts all the way up to the top (as evidenced by the  example becoming a top level HtmlBlock).

Would it be possible to have it not do that, and nest just like * and _ usages do? Or is there some conflict with the HTML parser that wouldn't allow parsing nested HTML? I could see how perhaps there are finicky contexts with nested HTML across multiple lines, but personally I think single-line nested HTML would be a great start.

Edit: re-reading the options you've proposed, it seems you're addressing the children of . I'm not sure this is the issue – the contents seem okay, the problem I'm witnessing is the entire tag gets pulled up to the top level, i.e. the following:

# This should be <sup>one</sup> line!

Gets parsed as if you had typed

# This should be
<sup>one</sup>
line!

hayesgm · 2020-08-10T07:39:27Z

I believe there are actually two issues going on here:

Issue 1 - Paragraph Wrapping

renderedChildren are always wrapped in a , which can break certain renders.

Example: * 22
Renders to Hi there.

The paragraph tag inside of  breaks rendering without custom styling (e.g. sup p { display: inline; }).

Issue 2 - Invalid Parsing of Nested HTML Tags

When nesting HTML tags, we parse HTML too early and significantly break rendering.

Example:

<section>
* 1
* 2<sup>2</sup>
* 3
</section>

Renders to:

<section>
  <ul>
    <li>1</li>
    <li>2</li>
  </ul>
  <sup><p>2</p></sup>
  <ul>
    <li>3</li>
  </ul>
</section>

Notice the two uls that are created. This is because when we're parsing HTML, we parse the tags and then render the contents separately, which lead to this error. The hard part about fixing this issue is that Markdown solves tags by basically ignoring tag-matching. For us, getting smart tag matching mixed with Markdown is going to be tricky (but worth it, since this library, I believe, encourage you to use smart custom tags).

mdgriffith mentioned this issue Oct 21, 2020

Inline html turning into it's own block when it starts a line. #70

Open

allenap mentioned this issue Sep 4, 2021

Indent is significant within HTML blocks #105

Open

tad-lispy mentioned this issue Dec 16, 2021

Run inline parser within Inline HTML #100

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inline HTML Code Blocks #50

Inline HTML Code Blocks #50

hayesgm commented Jul 13, 2020

dillonkearns commented Jul 14, 2020

hayesgm commented Jul 14, 2020

hayesgm commented Jul 14, 2020

supermario commented Aug 5, 2020 •

edited

Loading

dillonkearns commented Aug 5, 2020

supermario commented Aug 5, 2020 •

edited

Loading

hayesgm commented Aug 10, 2020

Inline HTML Code Blocks #50

Inline HTML Code Blocks #50

Comments

hayesgm commented Jul 13, 2020

dillonkearns commented Jul 14, 2020

hayesgm commented Jul 14, 2020

hayesgm commented Jul 14, 2020

supermario commented Aug 5, 2020 • edited Loading

dillonkearns commented Aug 5, 2020

Current Behavior

Option 1

Option 2

supermario commented Aug 5, 2020 • edited Loading

hayesgm commented Aug 10, 2020

Issue 1 - Paragraph Wrapping

Issue 2 - Invalid Parsing of Nested HTML Tags

supermario commented Aug 5, 2020 •

edited

Loading

supermario commented Aug 5, 2020 •

edited

Loading