Commit
fix #33: also replace LaTeX math expressions with unique tokens for HTML output, so the actual expressions won't be seen by the renderer, therefore they will never be interpreted as Markdown

previously this was done only for LaTeX output, and we need to make LaTeX math more robust in HTML output, too
yihui committed Oct 5, 2024
1 parent c17f4ac commit c5d40d1
Showing 5 changed files with 67 additions and 26 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Package: litedown
Type: Package
Title: A Lightweight Version of R Markdown
-Version: 0.2.8
+Version: 0.2.9
Authors@R: c(
person("Yihui", "Xie", role = c("aut", "cre"), email = "[email protected]", comment = c(ORCID = "0000-0003-0645-5666")),
person()
2 changes: 2 additions & 0 deletions NEWS.md
@@ -12,6 +12,8 @@

- Added support for an array of multiple authors in the YAML metadata (thanks, @AlbertLei, #28). If the `author` field in YAML is an array of length > 1, each author will be written to a separate `<h2>` in HTML output, or concatenated by `\and` in LaTeX output. Note that you can also write multiple authors in a single string (e.g., `author: "Jane X and John Y"`) instead of using an array (`author: ["Jane X", "John Y"]`), in which case the string will be treated as a single author (they will be put inside a single `<h2>` in HTML output).

+- Fixed the bug that the leading `-`, `+`, or `*` in a LaTeX math expression was recognized as the bullet list marker, which would invalidate the math expression (thanks, @hturner, #33).
+
# CHANGES IN litedown VERSION 0.2

- A data frame (or matrix/tibble) wrapped in `I()` is fully printed to a table now by default. Without `I()`, data objects are truncated to 10 rows by default when printing to tables.
38 changes: 15 additions & 23 deletions R/mark.R
@@ -123,17 +123,17 @@ mark = function(input, output = NULL, text = NULL, options = NULL, meta = list()
   if (has_math <- test_feature('latex_math', '[$]')) {
     id = id_string(text); maths = NULL
     text = xfun::protect_math(text, id)
-    # temporarily replace math expressions with tokens and restore them later;
-    # no need to do this for html output because we need special HTML characters
-    # like &<> in math expressions to be converted to entities, but shouldn't
-    # convert them for latex output
-    if (format == 'latex') {
+    if (has_math <- any(grepl(paste0('`', id), text, fixed = TRUE))) {
+      # temporarily replace math expressions with tokens so render() won't see
+      # them (to avoid issues like #33) and restore them later
       text = one_string(text)
-      text = match_replace(text, sprintf('`%s.{3,}?%s`', id, id), function(x) {
+      text = match_replace(text, sprintf('`%s(?s).{3,}?%s`', id, id), function(x) {
+        n0 = length(maths)
         maths <<- c(maths, gsub(sprintf('`%s|%s`', id, id), '', x))
         # replace math with !id-n-id! where n is the index of the math
-        sprintf('!%s-%d-%s!', id, length(maths) + seq_along(x), id)
+        sprintf('!%s-%d-%s!', id, n0 + seq_along(x), id)
       })
+      if (format == 'html') maths = xfun::html_escape(maths)
       text = split_lines(text)
     }
   }
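
The token round trip in the hunk above can be sketched in base R. This is only an illustration under stated assumptions: `protect_tokens()` and `restore_tokens()` are hypothetical names, not litedown or xfun functions, the input is assumed to already have each math expression wrapped as `` `ID...ID` `` (roughly what the code expects after `xfun::protect_math()`), and the ID is assumed to contain no regex metacharacters. Unlike the commit, which warns on a count mismatch, the sketch simply stops.

```r
# Hypothetical helpers for illustration only; they are not litedown's internals.

# Replace each `ID...ID` span with a numbered token and collect the math.
protect_tokens = function(text, id) {
  # (?s) lets `.` match newlines, so multi-line display math is captured too
  m = gregexpr(sprintf('(?s)`%s.{3,}?%s`', id, id), text, perl = TRUE)
  found = regmatches(text, m)[[1]]
  # strip the leading `ID and trailing ID` wrappers, keeping only the math
  maths = gsub(sprintf('`%s|%s`', id, id), '', found)
  regmatches(text, m) = list(sprintf('!%s-%d-%s!', id, seq_along(found), id))
  list(text = text, maths = maths)
}

# Put the stored expressions back in their order of appearance.
restore_tokens = function(text, maths, id) {
  m = gregexpr(sprintf('!%s-\\d+-%s!', id, id), text)
  stopifnot(length(regmatches(text, m)[[1]]) == length(maths))
  regmatches(text, m) = list(maths)
  text
}

id = 'MATH123'  # assumed ID, free of regex metacharacters
src = 'A `MATH123\\(-x + y\\)MATH123` and `MATH123$$a\n+ b$$MATH123`.'
p = protect_tokens(src, id)
p$text  # the tokenized text that a Markdown renderer would see
out = restore_tokens(p$text, p$maths, id)
```

Because the renderer only ever sees tokens such as `!MATH123-1-MATH123!`, a leading `-`, `+`, or `*` inside the math cannot be taken for a list marker. Restoration here is positional (the n-th token gets the n-th expression), which matches how the commit returns `maths` as a whole.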
@@ -207,15 +207,17 @@ mark = function(input, output = NULL, text = NULL, options = NULL, meta = list()
   ret = render(text)
   ret = move_attrs(ret, format) # apply attributes of the form {attr="value"}

+  if (has_math) ret = match_replace(ret, sprintf('!%s-\\d+-%s!', id, id), function(x) {
+    if (length(maths) != length(x)) warning(
+      'LaTeX math expressions cannot be restored correctly (expected ',
+      length(maths), ' expression(s) but found ', length(x), ' in the output).'
+    )
+    maths
+  })
+
   if (format == 'html') {
     # don't disable check boxes
     ret = gsub('(<li><input type="checkbox" [^>]*?)disabled="" (/>)', '\\1\\2', ret)
-    if (has_math) {
-      ret = gsub(sprintf('<code>%s(.{5,}?)%s</code>', id, id), '\\1', ret)
-      # `\(math\)` may fail to render to <code>\(math\)</code> when backticks
-      # are inside HTML tags, e.g., commonmark::markdown_html('<p>`a`</p>')
-      ret = gsub(sprintf('`%s\\\\\\((.+?)\\\\\\)%s`', id, id), '$\\1$', ret)
-    }
     if (has_sup)
       ret = gsub(sprintf('!%s(.+?)%s!', id2, id2), '<sup>\\1</sup>', ret)
     if (has_sub)
@@ -262,16 +264,6 @@ mark = function(input, output = NULL, text = NULL, options = NULL, meta = list()
     ret = number_refs(ret, r_ref)
   } else if (format == 'latex') {
     ret = render_footnotes(ret) # render [^n] footnotes
-    if (has_math) {
-      m = gregexpr(sprintf('!%s-(\\d+)-%s!', id, id), ret)
-      regmatches(ret, m) = lapply(regmatches(ret, m), function(x) {
-        if (length(maths) != length(x)) warning(
-          'LaTeX math expressions cannot be restored correctly (expected ',
-          length(maths), ' expressions but found ', length(x), ' in the output).'
-        )
-        maths
-      })
-    }
     if (has_sup)
       ret = gsub(sprintf('!%s(.+?)%s!', id2, id2), '\\\\textsuperscript{\\1}', ret)
     if (has_sub)
46 changes: 45 additions & 1 deletion docs/02-syntax.Rmd
@@ -52,7 +52,7 @@ for HTML output (see the next section).
### LaTeX math

You can write both `$inline$` and `$$display$$` LaTeX math, e.g.,
-$\sin^{2}(\theta)+\cos^{2}(\theta) = 1$
+$\sin^{2}(\theta)+\cos^{2}(\theta) = 1$.

$$\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i$$

@@ -61,6 +61,50 @@ x &\text{if } x \geq 0 \\
-x &\text{if } x < 0
\end{cases}$$

+For expressions in pairs of single or double dollar signs to be recognized as
+LaTeX math, there must be no spaces after the opening dollar sign or before the
+closing dollar sign. There should be at least one space before the opening
+dollar sign, unless the math expression starts at the very beginning of a
+line.
+
+- For a pair of single dollar signs, both must be on the same line in the
+text, and the closing dollar sign should not be followed by a number (to
+avoid mistakenly detecting math in text like "a \$5 bill and a \$10
+bill").
+
+- `$$ $$` expressions can span multiple lines, in which case
+the closing `$$` must have at least one non-space character before it and
+no spaces after it.
+
+Valid examples:
+
+``` md
+$x + y$
+$x + y$
+text $x + y$ text
+
+$$x + y$$
+$$x + y$$
+text $$x + y$$ text
+$$x +
+y$$
+```
+
+Invalid examples:
+
+``` md
+$ x + y$ (space after the opening `$`)
+text$x + y$ (lack of space before the opening `$`)
+text $x + y$10 text (number after closing `$`)
+$x +
+y$ (`$ $` expressions cannot be written on multiple lines)
+
+$$x +
+y
+$$
+(lack of non-space character before closing `$$`)
+```
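
The single-dollar rules listed above can be approximated with one Perl-style regular expression applied line by line. This is a rough sketch rather than the pattern `xfun::protect_math()` actually uses: `inline_math` and `has_inline_math()` are hypothetical names, and the sketch ignores `$$ $$` display math and backslash-escaped dollar signs.

```r
# Rough approximation of the inline `$...$` rules; hypothetical, not xfun's code.
inline_math = paste0(
  '(?<![^\\s])',   # opening $ is at the line start or preceded by a space
  '\\$(?![\\s$])', # no space (and no second $) right after the opening $
  '[^$\\n]*?',     # the content stays on a single line
  '(?<!\\s)\\$',   # no space right before the closing $
  '(?![\\d$])'     # closing $ is not followed by a digit (or another $)
)

# Test each line (one element of x per line) for at least one inline expression.
has_inline_math = function(x) grepl(inline_math, x, perl = TRUE)

# lines taken from the valid examples
valid = has_inline_math(c('$x + y$', 'text $x + y$ text'))
# lines taken from the invalid examples
invalid = has_inline_math(c('$ x + y$', 'text$x + y$', 'text $x + y$10 text', '$x +'))
```

Under these assumptions every element of `valid` should be `TRUE` and every element of `invalid` should be `FALSE`. Checking one line at a time is what enforces the same-line requirement for single dollar signs.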

LaTeX math environments are also supported, e.g., below are an `align`
environment and an `equation` environment:

5 changes: 4 additions & 1 deletion tests/tests.Rout.save
@@ -63,7 +63,10 @@ bar</p>
<p><code>$x$</code> is inline math \(x\)!</p>
<p>Display style:</p>
<p>$$x + y$$</p>
-<p>\begin{align} a^{2}+b^{2} &amp; = c^{2}\\ \sin^{2}(x)+\cos^{2}(x) &amp; = 1 \end{align}</p>
+<p>\begin{align}
+a^{2}+b^{2} &amp; = c^{2}\\
+\sin^{2}(x)+\cos^{2}(x) &amp; = 1
+\end{align}</p>

> mark(mkd, options = "-latex_math")
<p><code>$x$</code> is inline math $x$!</p>