
Links

A link contains link text (the visible text), a link destination (the URI that is the link destination), and optionally a link title. There are two basic kinds of links in Markdown. In inline links the destination and title are given immediately after the link text. In reference links the destination and title are defined elsewhere in the document.
A link text consists of a sequence of zero or more inline elements enclosed by square brackets ([ and ]). The following rules apply:

  • Links may not contain other links, at any level of nesting. If multiple otherwise valid link definitions appear nested inside each other, the inner-most definition is used.
  • Brackets are allowed in the link text only if (a) they are backslash-escaped or (b) they appear as a matched pair of brackets, with an open bracket [, a sequence of zero or more inlines, and a close bracket ].
  • Backtick code spans, autolinks, and raw HTML tags bind more tightly than the brackets in link text. Thus, for example, [foo`]` could not be a link text, since the second ] is part of a code span.
  • The brackets in link text bind more tightly than markers for emphasis and strong emphasis. Thus, for example, *[foo*](url) is a link.
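
The bracket rule above can be sketched as a small check. This is only an illustration, not a full inline parser: the function name `brackets_balanced` is hypothetical, and the sketch ignores code spans, autolinks, and raw HTML, which bind more tightly than brackets.

```python
def brackets_balanced(text: str) -> bool:
    """Return True if every unescaped [ or ] in text belongs to a matched pair.

    Simplification (assumption): code spans, autolinks, and raw HTML are not
    recognized, so brackets inside them are counted like any others.
    """
    depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            i += 2  # skip the backslash and the character it precedes
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return False  # a closing bracket with no opener
        i += 1
    return depth == 0  # no opener may be left unclosed
```

For instance, `brackets_balanced("link [foo [bar]]")` is true, while `brackets_balanced("link [bar")` is false unless the bracket is written as `\[`.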

A link destination consists of either

  • a sequence of zero or more characters between an opening < and a closing > that contains no line breaks or unescaped < or > characters, or
  • a nonempty sequence of characters that does not start with <, does not include ASCII space or control characters, and includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses. (Implementations may impose limits on parentheses nesting to avoid performance issues, but at least three levels of nesting should be supported.)
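
As a rough sketch of the second form (the destination written without pointy brackets), the check below walks the string once and tracks parenthesis depth. The name `is_valid_bare_destination` is an assumption for illustration, and the sketch does not impose any nesting limit.

```python
import string


def is_valid_bare_destination(dest: str) -> bool:
    """Check a destination that is NOT enclosed in <...>.

    Illustrative only: validates a candidate string in isolation rather than
    finding where a destination ends inside a larger inline link.
    """
    if not dest or dest.startswith("<"):
        return False
    depth = 0
    i = 0
    while i < len(dest):
        ch = dest[i]
        # A backslash escapes the next character only if it is ASCII punctuation.
        if ch == "\\" and i + 1 < len(dest) and dest[i + 1] in string.punctuation:
            i += 2
            continue
        # ASCII space and control characters are not allowed.
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # unbalanced closing parenthesis
        i += 1
    return depth == 0
```

Under these assumptions `foo(and(bar))` is accepted, `foo(and(bar)` is rejected, and `foo\(and\(bar\)` is accepted because every parenthesis is escaped.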

A link title consists of either

  • a sequence of zero or more characters between straight double-quote characters ("), including a " character only if it is backslash-escaped, or
  • a sequence of zero or more characters between straight single-quote characters ('), including a ' character only if it is backslash-escaped, or
  • a sequence of zero or more characters between matching parentheses ((...)), including a ( or ) character only if it is backslash-escaped.

Although link titles may span multiple lines, they may not contain a blank line.
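
The three title forms and the blank-line restriction can be sketched as follows. The helper name `parse_title` is hypothetical; it returns the text between the delimiters with backslash escapes left unprocessed, or `None` when the input is not a valid title.

```python
import re


def parse_title(raw: str):
    """Return the inner text of a link title, or None if raw is not a title."""
    closers = {'"': '"', "'": "'", "(": ")"}
    if len(raw) < 2 or raw[0] not in closers:
        return None
    opener, closer = raw[0], closers[raw[0]]
    if raw[-1] != closer:
        return None
    inner = raw[1:-1]
    i = 0
    while i < len(inner):
        ch = inner[i]
        if ch == "\\":
            if i + 1 >= len(inner):
                return None  # the backslash escapes the would-be closing delimiter
            i += 2  # skip the backslash and the character it precedes
            continue
        if ch == opener or ch == closer:
            return None  # delimiter characters must be backslash-escaped
        i += 1
    # Titles may span lines but may not contain a blank line.
    if re.search(r"\n[ \t]*\n", inner):
        return None
    return inner
```

For example, `parse_title('"a "b" c"')` yields `None`, while the same text in single quotes, `parse_title("'a \"b\" c'")`, is accepted.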
An inline link consists of a link text followed immediately by a left parenthesis (, optional whitespace, an optional link destination, an optional link title separated from the link destination by whitespace, optional whitespace, and a right parenthesis ). The link’s text consists of the inlines contained in the link text (excluding the enclosing square brackets). The link’s URI consists of the link destination, excluding enclosing <...> if present, with backslash-escapes in effect as described above. The link’s title consists of the link title, excluding its enclosing delimiters, with backslash-escapes in effect as described above.
Here is a simple inline link:

The title may be omitted:

Both the title and the destination may be omitted:

The destination can only contain spaces if it is enclosed in pointy brackets:

The destination cannot contain line breaks, even if enclosed in pointy brackets:

The destination can contain ) if it is enclosed in pointy brackets:

Pointy brackets that enclose links must be unescaped:

These are not links, because the opening pointy bracket is not matched properly:

Parentheses inside the link destination may be escaped:

Any number of parentheses are allowed without escaping, as long as they are balanced:

However, if you have unbalanced parentheses, you need to escape or use the <...> form:

Parentheses and other symbols can also be escaped, as usual in Markdown:

A link can contain fragment identifiers and queries:

Note that a backslash before a non-escapable character is just a backslash:

URL-escaping should be left alone inside the destination, as all URL-escaped characters are also valid URL characters. Entity and numerical character references in the destination will be parsed into the corresponding Unicode code points, as usual. These may be optionally URL-escaped when written as HTML, but this spec does not enforce any particular policy for rendering URLs in HTML or other formats. Renderers may make different decisions about how to escape or normalize URLs in the output.

Note that, because titles can often be parsed as destinations, if you try to omit the destination and keep the title, you’ll get unexpected results:

Titles may be in single quotes, double quotes, or parentheses:

Backslash escapes and entity and numeric character references may be used in titles:

Titles must be separated from the link destination by whitespace. Other Unicode whitespace, such as a non-breaking space, does not count as a separator.

Nested balanced quotes are not allowed without escaping:

But it is easy to work around this by using a different quote type:

(Note: Markdown.pl did allow double quotes inside a double-quoted title, and its test suite included a test demonstrating this. But it is hard to see a good rationale for the extra complexity this brings, since there are already many ways to write titles containing double quotes: backslash escaping, entity and numeric character references, or using a different quote type for the enclosing title. Markdown.pl’s handling of titles has a number of other strange features. For example, it allows single-quoted titles in inline links, but not reference links. And, in reference links but not inline links, it allows a title to begin with " and end with ). Markdown.pl 1.0.1 even allows titles with no closing quotation mark, though 1.0.2b8 does not. It seems preferable to adopt a simple, rational rule that works the same way in inline links and link reference definitions.)
Whitespace is allowed around the destination and title:

But it is not allowed between the link text and the following parenthesis:

The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:

The link text may contain inline content:

However, links may not contain other links, at any level of nesting.

These cases illustrate the precedence of link text grouping over emphasis grouping:

Note that brackets that aren’t part of links do not take precedence:

These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:

There are three kinds of reference links: full, collapsed, and shortcut.
A full reference link consists of a link text immediately followed by a link label that matches a link reference definition elsewhere in the document.
A link label begins with a left bracket ([) and ends with the first right bracket (]) that is not backslash-escaped. Between these brackets there must be at least one non-whitespace character. Unescaped square bracket characters are not allowed inside the opening and closing square brackets of link labels. A link label can have at most 999 characters inside the square brackets.
One label matches another if and only if their normalized forms are equal. To normalize a label, strip off the opening and closing brackets, perform the Unicode case fold, strip leading and trailing whitespace, and collapse consecutive internal whitespace to a single space. If there are multiple matching link reference definitions, the one that comes first in the document is used. (It is desirable in such cases to emit a warning.)
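
The normalization steps can be sketched in a few lines. The function name `normalize_label` is an assumption, and Python's `str.casefold()` is used here as the Unicode case fold; the whitespace set (space, tab, newline, line tabulation, form feed, carriage return) is likewise an assumption about what counts as whitespace.

```python
import re

_WS = " \t\n\v\f\r"  # assumed whitespace set for this sketch


def normalize_label(label: str) -> str:
    """Normalize a link label such as '[Foo  Bar]' for matching."""
    inner = label[1:-1]          # strip the opening and closing brackets
    folded = inner.casefold()    # Unicode case fold
    trimmed = folded.strip(_WS)  # strip leading and trailing whitespace
    # Collapse consecutive internal whitespace to a single space.
    return re.sub("[" + re.escape(_WS) + "]+", " ", trimmed)


def labels_match(a: str, b: str) -> bool:
    """Two labels match when their normalized forms are equal."""
    return normalize_label(a) == normalize_label(b)
```

With this sketch, `labels_match("[Foo\n  bar]", "[foo bar]")` holds, and so does `labels_match("[ẞ]", "[SS]")`, because case folding maps both to `ss`.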
The contents of the first link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching link reference definition.
Here is a simple example:

The rules for the link text are the same as with inline links. Thus:
The link text may contain balanced brackets, but not unbalanced ones, unless they are escaped:

The link text may contain inline content:

However, links may not contain other links, at any level of nesting.

(In the examples above, we have two shortcut reference links instead of one full reference link.)
The following cases illustrate the precedence of link text grouping over emphasis grouping:

These cases illustrate the precedence of HTML tags, code spans, and autolinks over link grouping:

Matching is case-insensitive:

Unicode case fold is used:

Consecutive internal whitespace is treated as one space for purposes of determining matching:

No whitespace is allowed between the link text and the link label:

This is a departure from John Gruber’s original Markdown syntax description, which explicitly allows whitespace between the link text and the link label. It brings reference links in line with inline links, which (according to both original Markdown and this spec) cannot have whitespace after the link text. More importantly, it prevents inadvertent capture of consecutive shortcut reference links. If whitespace is allowed between the link text and the link label, then in the following we will have a single reference link, not two shortcut reference links, as intended:

[foo]
[bar]

[foo]: /url1
[bar]: /url2

(Note that shortcut reference links were introduced by Gruber himself in a beta version of Markdown.pl, but never included in the official syntax description. Without shortcut reference links, it is harmless to allow space between the link text and link label; but once shortcut references are introduced, it is too dangerous to allow this, as it frequently leads to unintended results.)
When there are multiple matching link reference definitions, the first is used:
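
As a rough illustration of this first-match rule, a reference map can be built so that later definitions never overwrite earlier ones. The name `build_reference_map` and the `(label, destination)` pair layout are hypothetical, and the inline normalization is a simplification that splits on any Unicode whitespace.

```python
def build_reference_map(definitions):
    """Map normalized labels to destinations, keeping the first definition.

    `definitions` is an iterable of (label, destination) pairs in document
    order, with labels given without their enclosing brackets.
    """
    refs = {}
    for label, destination in definitions:
        key = " ".join(label.casefold().split())  # simplified normalization
        refs.setdefault(key, destination)         # later duplicates are ignored
    return refs


refs = build_reference_map([("foo", "first"), ("foo", "second")])
```

Here `refs["foo"]` is `"first"`; an implementation could also emit a warning when `setdefault` finds an existing key.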

Note that matching is performed on normalized strings, not parsed inline content. So the following does not match, even though the labels define equivalent inline content:

Link labels cannot contain brackets, unless they are backslash-escaped:

Note that in this example ] is not backslash-escaped:

A link label must contain at least one non-whitespace character:

A collapsed reference link consists of a link label that matches a link reference definition elsewhere in the document, followed by the string []. The contents of the first link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching link reference definition. Thus, [foo][] is equivalent to [foo][foo].

The link labels are case-insensitive:

As with full reference links, whitespace is not allowed between the two sets of brackets:

A shortcut reference link consists of a link label that matches a link reference definition elsewhere in the document and is not followed by [] or a link label. The contents of the first link label are parsed as inlines, which are used as the link’s text. The link’s URI and title are provided by the matching link reference definition. Thus, [foo] is equivalent to [foo][].
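
The three shapes of reference link can be told apart by what follows the first bracketed part. The sketch below is a simplification under stated assumptions: the name `reference_kind` is hypothetical, it looks only at syntax (it does not check that a matching definition exists), it treats the link text of a full reference as a plain label without nested brackets, and it ignores the 999-character limit and the non-whitespace requirement.

```python
import re

# A bracketed label: no unescaped [ or ] inside, at least one character.
_LABEL = r"\[(?:[^\[\]\\]|\\.)+\]"


def reference_kind(text: str):
    """Classify a candidate string as 'full', 'collapsed', 'shortcut', or None."""
    if re.fullmatch(_LABEL + _LABEL, text):
        return "full"       # [text][label]
    if re.fullmatch(_LABEL + r"\[\]", text):
        return "collapsed"  # [label][]
    if re.fullmatch(_LABEL, text):
        return "shortcut"   # [label]
    return None
```

Because no whitespace is allowed between the two sets of brackets, `reference_kind("[foo] [bar]")` returns `None` rather than `"full"`.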

The link labels are case-insensitive:

A space after the link text should be preserved:

If you just want bracketed text, you can backslash-escape the opening bracket to avoid links:

Note that this is a link, because a link label ends with the first following closing bracket:

Full and collapsed references take precedence over shortcut references:

Inline links also take precedence:

In the following case [bar][baz] is parsed as a reference, [foo] as normal text:

Here, though, [foo][bar] is parsed as a reference, since [bar] is defined:

Here [foo] is not parsed as a shortcut reference, because it is followed by a link label (even though [bar] is not defined):