Skip to content

Commit

Permalink
Improve legibility of JSON-encoded Interactivity API store data.
Browse files Browse the repository at this point in the history
The Interactivity API has been rendering client data in a SCRIPT element with the
type `application/json` so that it's not executed as a script, but is available
to one. The data runs through `wp_json_encode()` and is encoded with some flags
to ensure that potentially-dangerous characters are escaped.

However, this can lead to some challenges. Eagerly escaping when not necessary
can make the data difficult to comprehend when reading the output HTML. For example,
all non-ASCII Unicode characters are escaped with their code point equivalent.
This results in `\ud83c\udd70` instead of `🅰`.

In this patch, the flags for JSON encoding are refined to ensure what's necessary
while relaxing other rules (leaving in those Unicode characters if the blog charset
is UTF-8). This makes for Interactivity API data that's quicker as a human reader
to decipher and diagnose.

In summary:

 - This data is JSON encoded and printed in a `<script type="application/json">` tag.

 - If we ensure that `<` is never printed inside the data, it should be impossible to
   break out of the script tag and the browser treats everything as the element's `textContent`.

 - All other escaping becomes unnecessary at that point, including unicode escaping 
   if the page uses the UTF-8 charset (the same encoding as JSON).

See WordPress/wordpress-develop#6433 (review)

Developed in WordPress/wordpress-develop#6520
Discussed in https://core.trac.wordpress.org/ticket/61170

Fixes: #61170
Follow-up to: [57563].
Props: bjorsch, dmsnell, jonsurrell, sabernhardt, westonruter.

Built from https://develop.svn.wordpress.org/trunk@58159


git-svn-id: https://core.svn.wordpress.org/trunk@57622 1a063a9b-81f0-0310-95a4-ce76da25c4cd
  • Loading branch information
dmsnell committed May 15, 2024
1 parent 7278c35 commit c3df3ef
Show file tree
Hide file tree
Showing 2 changed files with 33 additions and 2 deletions.
33 changes: 32 additions & 1 deletion wp-includes/interactivity-api/class-wp-interactivity-api.php
Original file line number Diff line number Diff line change
Expand Up @@ -167,10 +167,41 @@ public function print_client_interactivity_data() {
}

if ( ! empty( $interactivity_data ) ) {
/*
* This data will be printed as JSON inside a script tag like this:
* <script type="application/json"></script>
*
* A script tag must be closed by a sequence beginning with `</`. It's impossible to
* close a script tag without using `<`. We ensure that `<` is escaped and `/` can
* remain unescaped, so `</script>` will be printed as `\u003C/script\u00E3`.
*
* - JSON_HEX_TAG: All < and > are converted to \u003C and \u003E.
* - JSON_UNESCAPED_SLASHES: Don't escape /.
*
* If the page will use UTF-8 encoding, it's safe to print unescaped unicode:
*
* - JSON_UNESCAPED_UNICODE: Encode multibyte Unicode characters literally (instead of as `\uXXXX`).
* - JSON_UNESCAPED_LINE_TERMINATORS: The line terminators are kept unescaped when
* JSON_UNESCAPED_UNICODE is supplied. It uses the same behaviour as it was
* before PHP 7.1 without this constant. Available as of PHP 7.1.0.
*
* The JSON specification requires encoding in UTF-8, so if the generated HTML page
* is not encoded in UTF-8 then it's not safe to include those literals. They must
* be escaped to avoid encoding issues.
*
* @see https://www.rfc-editor.org/rfc/rfc8259.html for details on encoding requirements.
* @see https://www.php.net/manual/en/json.constants.php for details on these constants.
* @see https://html.spec.whatwg.org/#script-data-state for details on script tag parsing.
*/
$json_encode_flags = JSON_HEX_TAG | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_LINE_TERMINATORS;
if ( ! is_utf8_charset() ) {
$json_encode_flags = JSON_HEX_TAG | JSON_UNESCAPED_SLASHES;
}

wp_print_inline_script_tag(
wp_json_encode(
$interactivity_data,
JSON_HEX_TAG | JSON_HEX_AMP
$json_encode_flags
),
array(
'type' => 'application/json',
Expand Down
2 changes: 1 addition & 1 deletion wp-includes/version.php
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@
*
* @global string $wp_version
*/
$wp_version = '6.6-alpha-58157';
$wp_version = '6.6-alpha-58159';

/**
* Holds the WordPress DB revision, increments when changes are made to the WordPress DB schema.
Expand Down

0 comments on commit c3df3ef

Please sign in to comment.