Markdown: turn plain URLs into links (#2884)
By default, plain links like https://domain.com are not rendered as links in Markdown (see the [CommonMark spec](https://spec.commonmark.org/0.30/#autolinks)) and instead have to be written in the `[text](https://domain.com)` style. This PR adds autolinking for plain links. This is done in two different ways: once for the SimpleMarkdown component, which uses react-remark, via a remark transformation plugin; and once for the MarkdownTextWrap component, by doing the translation as part of the mapping to IRNodes.

In theory it should be possible to use the remark plugin in both cases, but navigating the set of interdependent libraries in their state from two years ago is a pain, so I went for two bespoke versions. Once our setup is ESM everywhere, we can upgrade to the latest versions of unified, react-remark and related libraries and try again to use the same code in both cases.
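To make the second approach concrete, here is a minimal sketch of translating plain URLs while mapping text to nodes. It assumes a simplified, hypothetical `Segment` type in place of the real IRNode classes, which this page does not show; `splitPlainLinks` is likewise an illustrative name. Only `urlRegex` is copied from `remarkPlainLinks.ts` in this commit.

```typescript
// Copied from remarkPlainLinks.ts in this commit.
const urlRegex =
    /https?:\/\/([\w-]+\.)+[\w-]+((\/[\p{L}\p{N}_\-.\+/?:%&=~#]*[\p{L}\p{N}_\-\+/%&=~#])|\/)?/gu

// Hypothetical node shapes for illustration only; not the real IRNode classes.
type Segment =
    | { type: "text"; value: string }
    | { type: "link"; url: string; text: string }

// Split a string into alternating plain-text and link segments.
function splitPlainLinks(text: string): Segment[] {
    const segments: Segment[] = []
    let cursor = 0
    // The regex is global and therefore stateful, so reset it before each scan.
    urlRegex.lastIndex = 0
    let match: RegExpExecArray | null
    while ((match = urlRegex.exec(text)) !== null) {
        // Emit any plain text that precedes this URL.
        if (match.index > cursor) {
            segments.push({ type: "text", value: text.slice(cursor, match.index) })
        }
        // The matched URL serves as both the link target and the link text.
        segments.push({ type: "link", url: match[0], text: match[0] })
        cursor = match.index + match[0].length
    }
    // Emit whatever plain text follows the last URL.
    if (cursor < text.length) {
        segments.push({ type: "text", value: text.slice(cursor) })
    }
    return segments
}

const example = splitPlainLinks(
    "See https://ourworldindata.org/grapher/life-expectancy. Thanks"
)
console.log(example)
```

In this sketch the trailing period stays in the text segment after the link, which is the behaviour the regex is designed to give.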
Showing 6 changed files with 60 additions and 6 deletions.
1 change: 1 addition & 0 deletions
packages/@ourworldindata/components/src/markdown/mdast-util-find-and-replace.d.ts
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
declare module "mdast-util-find-and-replace" |
37 changes: 37 additions & 0 deletions
packages/@ourworldindata/components/src/markdown/remarkPlainLinks.ts
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
import findAndReplace from "mdast-util-find-and-replace" | ||
|
||
// This regex matches: | |
// "http" | |
// an optional "s" | |
// a ":" followed by two "/" characters | |
// The subdomains and hostname: one or more word characters (letters, digits, "_") or "-", followed by a period; this group repeats one or more times | |
// The TLD: one or more word characters or "-" | |
// The path, query string and fragment: a forward slash followed by any letter or digit (Unicode classes, so umlauts like ö match) | |
// or any of the characters _-.+/?:%&=~# zero or more times. Note that we exclude spaces (which sometimes appear unencoded in URLs) because they tend | |
// to make the match too greedy. | |
// We match the same subgroup [\p{L}\p{N}_\-.\+/?:%&=~#] twice, once with a * and then exactly once but without the punctuation characters .?: | |
// This is to make sure that we don't match trailing punctuation as part of the URL ("This is an http://example.com." - note that the trailing | |
// period should not be part of the URL) | |
// Finally, the very last part is a lone forward slash, which would not be matched by the previous subgroup. | |
export const urlRegex = | ||
/https?:\/\/([\w-]+\.)+[\w-]+((\/[\p{L}\p{N}_\-.\+/?:%&=~#]*[\p{L}\p{N}_\-\+/%&=~#])|\/)?/gu | ||
|
||
export function remarkPlainLinks() { | ||
const turnIntoLink = (value: any, _match: string) => { | ||
return [ | ||
{ | ||
type: "link", | ||
url: value, | ||
children: [ | ||
{ | ||
type: "text", | ||
value: value, | ||
}, | ||
], | ||
}, | ||
] | ||
} | ||
return (tree: any) => { | ||
findAndReplace(tree, [[urlRegex, turnIntoLink]]) | ||
} | ||
} |
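The behaviour described in the regex comments can be checked with a few sample strings. The following is a small sketch rather than part of the commit: `findUrls` is a hypothetical helper and the sample sentences are made up; only `urlRegex` is copied from the diff above.

```typescript
// Copied from remarkPlainLinks.ts in this commit.
const urlRegex =
    /https?:\/\/([\w-]+\.)+[\w-]+((\/[\p{L}\p{N}_\-.\+/?:%&=~#]*[\p{L}\p{N}_\-\+/%&=~#])|\/)?/gu

// Hypothetical helper: return every URL the regex finds in a string.
// String.prototype.match with a global regex returns all matches, or null if there are none.
function findUrls(text: string): string[] {
    return text.match(urlRegex) ?? []
}

// Trailing punctuation is left out of the match.
const trailingPeriod = findUrls("This is an http://example.com.")
const trailingQuestionMark = findUrls("Is it https://example.com/page?")

// A lone trailing slash is kept (second alternative of the optional group).
const loneSlash = findUrls("https://example.com/")

// Query strings and fragments are included; the comma after them is not.
const queryAndFragment = findUrls("https://example.com/a?b=1&c=2#frag, and more")

// Non-ASCII letters in the path match because of \p{L}.
const umlaut = findUrls("https://de.wikipedia.org/wiki/Österreich")

// Hosts without a period are not matched, since at least one "label." group is required.
const noPeriodInHost = findUrls("http://localhost/foo")

console.log({
    trailingPeriod,
    trailingQuestionMark,
    loneSlash,
    queryAndFragment,
    umlaut,
    noPeriodInHost,
})
```

One consequence worth noting is that a host without a period, such as `localhost`, is not autolinked by this pattern.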