CLDR-17194 Further clarifications of dx (#3411)
* CLDR-17194 Further clarifications of dx

* CLDR-17194 Copyedit

(cherry picked from commit b3057ea)
macchiati authored and pedberg-icu committed Dec 8, 2023
1 parent 2e68119 commit 0067b0e
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions docs/ldml/tr35.md
@@ -802,9 +802,12 @@ The BCP 47 form for keys and types is the canonical form, and recommended. Other
<tr><td>"dx"</td>
<td>Dictionary break script exclusions</td>
<td><i><code><a href="#unicode_script_subtag">unicode_script_subtag</a></code> values</i></td>
-<td><p>One or more items of type SCRIPT_CODE (as usual, separated by hyphens), which are valid <code><a href="#unicode_script_subtag">unicode_script_subtag</a></code> values.</p>
-<p>The code Zyyy (Common) can be specified to exclude all scripts, in which case it should be the only SCRIPT_CODE value specified.
-If others are included mistakenly, they are ignored.</p></td></tr>
+<td><ul><li>One or more items of type SCRIPT_CODE (as usual, separated by hyphens), which are valid <code><a href="#unicode_script_subtag">unicode_script_subtag</a></code> values.</li>
+<li>Each of the values for the DX key must be a short script property value in the UCD, or one of the compound script values like jpan. The compound script values are expanded when interpreted, eg, -dx-jpan = -dx-hani-hira-kata</li>
+<li>The values may be in any order, eg, -dx-thai-hani = dx-hani-thai. However, the canonical order for the bcp47 subtag is alphabetical, eg, dx-hani-thai</li>
+<li>Dictionary-based break iterators will ignore each character whose Script_Extension value set intersects with the DX value set.</li>
+<li>The code Zyyy (Common) can be specified to exclude all scripts, if and only if it is the only SCRIPT_CODE value specified. If it is not the only script code, Zyyy has the normal meaning: excluding Script_Extension=Common.</li></ul>
+</td></tr>

<tr><td colspan="4"><b>A <a name="UnicodeEmojiPresentationStyleIdentifier" id="UnicodeEmojiPresentationStyleIdentifier" href="#UnicodeEmojiPresentationStyleIdentifier">Unicode Emoji Presentation Style Identifier</a> specifies a request for the preferred emoji presentation style. This can be used as part of the value for an HTML lang attribute, for example <code>&lt;html lang="sr-Latn-u-em-emoji"&gt;</code>. The valid values are those <i>name</i> attribute values in the <i>type</i> elements of key name="em" in bcp47/<a href="https://github.com/unicode-org/cldr/blob/main/common/bcp47/variant.xml" target="_blank">variant.xml</a></b>.</td></tr>
<tr><td rowspan="3">"em"</td>
@@ -4029,6 +4032,7 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
* [Unicode Language Identifier](#unicode-language-identifier): clarified constraint on duplicate subtags.
* [Key/Type Definitions](#key-and-type-definitions): clarified definition of `-dx`
* [EBNF](#ebnf): Clarified use of EBNF in LDML
+* (44.1)[Key/Type Definitions](#key-and-type-definitions): further clarified the definition of `-dx`

* [General](tr35-general.md#Contents)
* Added new section [Unit Identifier Uniqueness](tr35-general.md#Unit_Identifier_Uniqueness), and added a relevant constraint on base_component in the [Syntax](tr35-general.md#syntax) section.
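
As a rough illustration of the `-dx` rules added in the first hunk (compound expansion, canonical ordering, Script_Extensions intersection, and the Zyyy special case), here is a minimal Python sketch. It is not part of the commit or of any CLDR/ICU API: the function names and the `COMPOUND_SCRIPTS` table are hypothetical, only `jpan` is taken from the text above, and the caller is assumed to supply each character's Script_Extensions set, since Python's standard library does not expose that property.

```python
# Hypothetical, incomplete table: the spec text names jpan as one example
# of a compound script value; other compound values are omitted here.
COMPOUND_SCRIPTS = {"jpan": {"hani", "hira", "kata"}}


def parse_dx(value: str) -> list[str]:
    """Split '-dx-thai-hani' or 'dx-hani-thai' into lowercase script codes."""
    parts = [p.lower() for p in value.split("-") if p]
    if parts and parts[0] == "dx":
        parts = parts[1:]
    return parts


def canonical_dx(value: str) -> str:
    """Return the subtag sequence with script codes in alphabetical order."""
    return "-".join(["dx"] + sorted(parse_dx(value)))


def excluded_scripts(value: str):
    """Return None when all scripts are excluded, else the expanded set."""
    codes = parse_dx(value)
    # Zyyy excludes everything if and only if it is the only code given.
    if codes == ["zyyy"]:
        return None
    expanded = set()
    for code in codes:
        # Compound values expand; any other code (including zyyy) stands
        # for itself.
        expanded |= COMPOUND_SCRIPTS.get(code, {code})
    return expanded


def is_excluded(script_extensions: set[str], value: str) -> bool:
    """True if a character with these Script_Extensions should be ignored."""
    scripts = excluded_scripts(value)
    if scripts is None:
        return True
    return bool({s.lower() for s in script_extensions} & scripts)
```

For example, `canonical_dx("-dx-thai-hani")` yields `dx-hani-thai`, and `is_excluded({"latn"}, "dx-zyyy-thai")` is false because Zyyy is not the only code and so only excludes characters whose Script_Extensions include Common.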