From 775ffd572194e13b1bf82d476f524b181addc934 Mon Sep 17 00:00:00 2001
From: "Steven R. Loomis"
Date: Tue, 3 Oct 2023 23:19:59 -0500
Subject: [PATCH] CLDR-17145 spec: update Toc (#3311)

---
 docs/ldml/tr35-dates.md       |  3 ++-
 docs/ldml/tr35-general.md     |  1 +
 docs/ldml/tr35-info.md        |  2 ++
 docs/ldml/tr35-personNames.md |  4 ++++
 docs/ldml/tr35.md             | 30 +++++++++++++++++-------------
 5 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/docs/ldml/tr35-dates.md b/docs/ldml/tr35-dates.md
index 24f7313ebe0..99e04f38fb5 100644
--- a/docs/ldml/tr35-dates.md
+++ b/docs/ldml/tr35-dates.md
@@ -63,6 +63,7 @@ The LDML specification is divided into the following parts:
 * [Calendar Preference Data](#Calendar_Preference_Data)
 * [Week Data](#Week_Data)
 * Table: [Week Designation Types](#Week_Designation_Types)
+ * [First Day Overrides](#first-day-overrides)
 * [Time Data](#Time_Data)
 * [Day Period Rule Sets](#Day_Period_Rule_Sets)
 * [Day Period Rules](#Day_Period_Rules)
@@ -1169,7 +1170,7 @@ The calculation of the first day of the week depends on various fields in a loca
 3. Else if there is a valid `-u-ca-` calendar value, where that calendar specifies the first day, then return that first day. (Most calendars do not specify the first day.)
 4. Else if there is an explicit region subtag, then return that region's firstDay map value.
 5. Else if there is a valid `-u-sd-` subdivision value, return that region's firstDay map value.
-6. Else if the [Add Likely Subtags](tr35.html#Likely_Subtags) algorithm produces a region, return that region's firstDay map value.
+6. Else if the [Add Likely Subtags](tr35.md#Likely_Subtags) algorithm produces a region, return that region's firstDay map value.
 7. Else return the firstDay map value for 001.
 *Example:*
diff --git a/docs/ldml/tr35-general.md b/docs/ldml/tr35-general.md
index d2456ffdbb4..67a963659d4 100644
--- a/docs/ldml/tr35-general.md
+++ b/docs/ldml/tr35-general.md
@@ -69,6 +69,7 @@ The LDML specification is divided into the following parts:
 * [Unit Identifiers](#Unit_Identifiers)
 * [Nomenclature](#nomenclature)
 * [Syntax](#syntax)
+ * [Unit Identifier Uniqueness](#Unit_Identifier_Uniqueness)
 * [Example Units](#Example_Units)
 * [Compound Units](#compound-units)
 * [Precomposed Compound Units](#precomposed-compound-units)
diff --git a/docs/ldml/tr35-info.md b/docs/ldml/tr35-info.md
index e07b9eee996..0cbae158968 100644
--- a/docs/ldml/tr35-info.md
+++ b/docs/ldml/tr35-info.md
@@ -77,6 +77,8 @@ The LDML specification is divided into the following parts:
 * [Unit Parsing Data](#unit-parsing-data)
 * [Constants](#constants)
 * [Conversion Data](#conversion-data)
+ * [Derived Unit System](#derived-unit-system)
+ * [Conversion Mechanisms](#conversion-mechanisms)
 * [Exceptional Cases](#exceptional-cases)
 * [Identities](#identities)
 * [Aliases](#aliases)
diff --git a/docs/ldml/tr35-personNames.md b/docs/ldml/tr35-personNames.md
index dd9f2335655..aeaec787a4c 100644
--- a/docs/ldml/tr35-personNames.md
+++ b/docs/ldml/tr35-personNames.md
@@ -52,7 +52,9 @@ The LDML specification is divided into the following parts:
 * [personNames Element](#personnames-element)
 * [personName Element](#personname-element)
 * [nameOrderLocales Element](#nameorderlocales-element)
+ * [parameterDefault Element](#parameterdefault-element)
 * [foreignSpaceReplacement Element](#foreignspacereplacement-element)
+ * [nativeSpaceReplacement Element](#nativespacereplacement-element)
 * [initialPattern Element](#initialpattern-element)
 * [Syntax](#syntax)
 * [Person Name Object](#person-name-object)
@@ -64,6 +66,8 @@ The LDML specification is divided into the following parts:
 * [namePattern Syntax](#namepattern-syntax)
 * [Fields](#fields)
 * [Modifiers](#modifiers)
+ * [Grammatical Modifiers for Names](#grammatical-modifiers-for-names)
+ * [Future Modifiers](#future-modifiers)
 * [Formatting Process](#formatting-process)
 * [Derive the name locale](#derive-the-name-locale)
 * [Derive the formatting locale](#derive-the-formatting-locale)
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index 15efaef01a5..f74cafd7d0f 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -135,9 +135,13 @@ The LDML specification is divided into the following parts:
 * [Date and Date Ranges](#Date_Ranges)
 * [Text Directionality](#Text_Directionality)
 * [Unicode Sets](#Unicode_Sets)
+ * [UnicodeSet syntax](#unicodeset-syntax)
+ * [Syntax Special Case Examples](#syntax-special-case-examples)
 * [Lists of Code Points](#Lists_of_Code_Points)
+ * [Backslash Escapes](#Backslash_Escapes)
 * [Unicode Properties](#Unicode_Properties)
 * [Boolean Operations](#Boolean_Operations)
+ * [Variables in UnicodeSets](#Variables_in_UnicodeSets)
 * [UnicodeSet Examples](#UnicodeSet_Examples)
 * [String Range](#String_Range)
 * [Identity Elements](#Identity_Elements)
@@ -993,7 +997,7 @@ For example:
 ```

-Above `` element defines the short time zone ID "inccu" (for the use in the Unicode locale extension), corresponding _CLDR canonical "long" ID_ "Asia/Culcutta", and an alias "Asia/Kolkata". In the tz database, the preferred ID for this time zone is "Asia/Kolkata".
+Above `` element defines the short time zone ID "inccu" (for the use in the Unicode locale extension), corresponding _CLDR canonical "long" ID_ "Asia/Culcutta", and an alias "Asia/Kolkata". In the tz database, the preferred ID for this time zone is "Asia/Kolkata".

 **Links in the tz database**

@@ -2235,13 +2239,13 @@ For example, if und_AF ⇒ fa_Arab_AF, then:
 There are a few exceptions to this goal:
 * A 'denormalized' subtag changes to the normalized form, except for certain denormalized language subtags such as 'iw' (for 'he' = Hebrew) which may occur in both the 'from' and 'to' fields of the data.
 This allows for implementations that use those denormalized subtags to use the data with only minor changes to the operations.
-* A macroregion (such as West Africa = 011) _may_ change to a specific country (Nigeria = NG).
+* A macroregion (such as West Africa = 011) _may_ change to a specific country (Nigeria = NG).

 **_Remove_** _**Likely Subtags:** Given a locale, remove any fields that Add Likely Subtags would add._

 The reverse operation removes fields that could be added by the first operation.

-1. First get max = AddLikelySubtags(inputLocale).
+1. First get max = AddLikelySubtags(inputLocale).
 2. If an error is signaled in AddLikelySubtags, signal that same error and stop.
 3. Remove the variants and extensions from max.
 4. Get the components of the max (_languagemax_, _scriptmax_, _regionmax_).
@@ -2251,20 +2255,20 @@ The reverse operation removes fields that could be added by the first operation.

 Example:

-* Input is zh_Hant or zh_TW.
+* Input is zh_Hant or zh_TW.
 * Maximize to get zh_Hant_TW.
 * zh => zh_Hans_CN. No match, so continue.
 * zh_TW => zh_Hant_TW. Matches, so return **zh_TW**.

 **_Remove_** _**Likely Subtags, favoring script:** Given a locale, remove any fields that Add Likely Subtags would add, but favor script over region._

-A variant of this favors the script over the region, thus using {language, language_script, language_region} in the step #4 above.
-This variant much less commonly used, only when the script relationship is more significant to users.
+A variant of this favors the script over the region, thus using {language, language_script, language_region} in the step #4 above.
+This variant much less commonly used, only when the script relationship is more significant to users.
 Here is the difference:

 Example:

-* Input is zh_Hant or zh_TW.
+* Input is zh_Hant or zh_TW.
 * Maximize to get zh_Hant_TW.
 * zh => zh_Hans_CN. No match, so continue.
 * zh_Hant => zh_Hant_TW. Matches, so return **zh_Hant**.
@@ -2904,13 +2908,13 @@ However, that feature can be supported in clients such as ICU by implementing a

 A UnicodeSet may be cited in specifications outside of the domain of LDML. In such a case, that specification may specify a subset or superset of the syntax provided here.

-##### UnicodeSet syntax #####
+##### UnicodeSet syntax

 | Symbol | Expression | Examples |
 | -------------- | -------------------------------------------------------------- | --------------------------------------- |
 | `unicodeSet` | = prop<br/>\| '\[' '^'? s '-'? s seq\* \[\\$ \\-\]? s '\]'<br/>\| var | \\p\{x=y\},<br/>[abc],<br/>$myset |
 | `seq` | = unicodeSet \(s \[\\&\\\-\] s unicodeSet\)\* s<br/>\| range s | \[abc\]\-\[cde\], a |
-| `range` | = element \('\-' element\)? | a, a\-c, \{abc\}, a\-\{z\}<br/>_note: in ranges, elements must resolve to exactly one code point._ |
+| `range` | = element \('\-' element\)? | a, a\-c, \{abc\}, a\-\{z\}<br/>_note: in ranges, elements must resolve to exactly one code point._ |
 | `element` | = char \| string \| var | %, b, \{hello\}, \{\}, \\x\{61 62\} |
 | `prop` | = '\\' \[pP\] '\{' propName \(\[≠=\] s pValuePerl\+\)? '\}'<br/>\| '\[:' '^'? propName \(\[≠=\] s pValuePosix\+\)? ':\]' | \\p\{x=y\}, \[:x=y:\] |
 | `propName` | = s \[A\-Za\-z0\-9\] \[A\-Za\-z0\-9\_\\x20\]\* s | General\_Category,<br/>General Category |
@@ -3080,10 +3084,10 @@ Variables are equivalent normalized identifiers with Normalization Form C, imple

 Notes:

-1. The 'type' of a variable value is not specified syntactically.
+1. The 'type' of a variable value is not specified syntactically.
 Thus \[\$a\-\$b\] can resolve whether \$a and \$b are chars/strings (eg, \$a=δ, \$b=θ) or full UnicodeSets (eg, \$a=\\p\{script=greek\}, \$b=\\p\{general_category=letter\}).
 The only restriction is that the result be syntactic; thus (\$a=w, \$b=xy) would raise an error.
-2. Variable substitution is currently disallowed inside of property expressions.
+2. Variable substitution is currently disallowed inside of property expressions.
 Thus \\p{gc=\$blah} raises an error.
 3. '\$' when followed by '\]' is interpreted as \\uFFFF, and is used to match before the start of a string or after the end.
 Thus \[ab\$\] matches the string "xaby" in the locations (marked with '()'): "()xaby", "x(a)by", "xa(b)y", "xaby()".
@@ -4020,8 +4024,8 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni

 **Differences from LDML Version 43**

-* [Person Names](tr35-personNames.html#Contents)
- * Fixed a problem in [Switch the formatting locale if necessary](tr35-personNames.html#switch-the-formatting-locale-if-necessary), where the full formatting locale wasn't being set correctly when the name object has a locale whose script is incompatibility with name script.
+* [Person Names](tr35-personNames.md#Contents)
+ * Fixed a problem in [Switch the formatting locale if necessary](tr35-personNames.md#switch-the-formatting-locale-if-necessary), where the full formatting locale wasn't being set correctly when the name object has a locale whose script is incompatibility with name script.
 * [Likely Subtags](#Likely_Subtags)
 * There is a fix to how macroregions are handled by adding likely subtags, such as with und_419