From 3a01dbbb83a8e0bc1ee2fba43b48fd828fe9dcbe Mon Sep 17 00:00:00 2001
From: macchiati
Date: Tue, 2 Apr 2024 13:42:30 -0700
Subject: [PATCH 1/6] CLDR-17491 Update LDML Mods section
---
 docs/ldml/tr35.md | 70 +++++++++++++++--------------------------------
 1 file changed, 22 insertions(+), 48 deletions(-)
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index 810652c2180..38eed33fe0a 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -4077,22 +4077,35 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
 **Differences from LDML Version 44.1**
 * Part 1: [Core](tr35.md#Contents)
- * In [Parent Locales](tr35.md#Parent_Locales), add an additional attribute and value `localeRules="nonlikelyScript"` to allow for algorithmic handling of inheritance so that mixtures of scripts are avoided.
- For example, preventing `ru_Latn` from falling back to `ru` (which is Cyrillic script).
- Previously, there was a small list of such locales, but they were far from covering all cases.
- That list was retained for migration.
+ * In [Parent Locales](#Parent_Locales), added an additional attribute and value `localeRules="nonlikelyScript"`
+ to allow for algorithmic handling of inheritance so that mixtures of scripts are avoided.
+ For example, preventing `ru_Latn` from falling back to `ru` (which is Cyrillic script).
+ Previously, there was a small list of such locales, but they were far from covering all cases.
+ That list was retained for migration.
+ * In [Special Script Codes](#special-script-codes), added a description of special script codes,
+ such as Jpan and Aran.
+ * In [Lateral Inheritance](#Lateral_Inheritance), improved the formatting for clarity.
+ * In [Parent_Locales](#Parent_Locales), substantial changes to the way that parentLocales work,
+ including a new attribute that avoids needing a long (and fragile) list of language-script codes
+ to skip when falling back to root.
+ * In [Preprocessing](#preprocessing), restructure the steps for clarity, add more examples.
 * Part 3: [Numbers](tr35-numbers.md#Contents)
- * In [Supplemental Currency Data](tr35-numbers.md#Supplemental_Currency_Data), for the `currency` element, added attributes `tz` and `to-tz` to clarify the `from` and `to` dates.
+ * In [Supplemental Currency Data](tr35-numbers.md#Supplemental_Currency_Data), for the `currency` element,
+ added attributes `tz` and `to-tz` to clarify the `from` and `to` dates.
 * Part 6: [Supplemental](tr35-info.md#Contents)
  * In [Mixed Units](tr35-info.md#mixed-units), clarified many aspects of mixed units (such as foot-and-inch),
  including how to handle rounding and precision.
- * In [Testing](tr35-info.md#testing), listed the additional test files.
- * In [Unit Preferences Overrides](tr35-info.md#Unit_Preferences_Overrides), added handling of edge cases,
- such as where there is no quantity for a unit, or no preference data for a quantity.
- Also clarified how to handle invalid subtags, and the usage of each of the subtags that affect unit preferences.
+ * In [Testing](tr35-info.html#testing), listed the additional test files.
+ * In [Unit Preferences Overrides](tr35-info.html#Unit_Preferences_Overrides), substantial changes including
+ handling of edge cases, such as where there is no quantity for a unit, or no preference data for a quantity;
+ how to handle invalid subtags;
+ negative unit amounts;
+ the usage of each of the subtags that affect unit preferences, and others.
  * In [Conversion Data](tr35-info.md#conversion-data), added the `special` attribute for `convertUnit`, used for handling beaufort.
+ * In [Unit Prefixes](unit-prefixes), added the SI unit prefixes and the power of 10
+ (or 2, for binary prefixes) that they represent.
 * Part 7: [Keyboards]
  * There are substantial changes from v44 to bring the Keyboard 3.0 specification out of Tech Preview, including:
@@ -4102,45 +4115,6 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
 * Part 9: [MessageFormat](tr35-messageFormat.md#Contents)
  * Added the completely new specification for MessageFormat 2.0 (in Tech Preview)
-**Differences from LDML Version 43 to 44.1**
-
-* [Core](#Contents)
- * In [Time Zone Identifiers](#Time_Zone_Identifiers), added information on the new `iana` attribute for stability; also see information on `iana` in the section [U Extension Data Files](#Unicode_Locale_Extension_Data_Files).
- * [Likely Subtags](#Likely_Subtags): There is a fix to how macroregions are handled by adding likely subtags, such as with `und_419`
- * [Unicode Sets](#Unicode_Sets): New sections on the following, with additional clarifications:
- * [UnicodeSet syntax](#unicodeset-syntax)
- * [Backslash Escapes](#Backslash_Escapes)
- * [Variables in UnicodeSets](#Variables_in_UnicodeSets)
- * [Unicode Language Identifier](#unicode-language-identifier): clarified constraint on duplicate subtags.
- * [Key/Type Definitions](#key-and-type-definitions): clarified definition of `-dx`
- * [EBNF](#ebnf): Clarified use of EBNF in LDML
- * (44.1)[Key/Type Definitions](#key-and-type-definitions): further clarified the definition of `-dx`
-
-* [General](tr35-general.md#Contents)
- * Added new section [Unit Identifier Uniqueness](tr35-general.md#Unit_Identifier_Uniqueness), and added a relevant constraint on base_component in the [Syntax](tr35-general.md#syntax) section.
- * Several clarifications were added in [Transform Rules Syntax](tr35-general.md#Transform_Rules_Syntax), and a new section [Transform Syntax Characters](tr35-general.md#transform-syntax-characters) was added with a table of the characters.
- * (44.1) [Synthesizing Sequence Names](tr35-general.md#SynthesizingNames) Added handling of derived emoji names and keywords for emoji facing-right sequences.
-
-* [Dates](tr35-dates.md#Contents)
- * New section [First Day Overrides](tr35-dates.md#first-day-overrides): Described the various locale ID elements that affect determination of the first day of the week (for week of year calculations), and the order in which they should be considered. Also noted in [Key/Type Definitions](#Key_Type_Definitions) which keys can affect determination of first day.
-
-* [Supplemental](tr35-info.md#Contents)
- * In [Conversion Data](tr35-info.md#conversion-data), expanded the list of values for the convertUnit systems attribute.
- * Added new section [Derived Unit System](tr35-info.md#derived-unit-system)
- * Rewrote and clarified the material in [Unit Preferences Overrides](tr35-info.md#Unit_Preferences_Data)
-
-* [Keyboards](tr35-keyboards.md#Contents)
- * Complete rewrite of the specification by the Keyboard Subcommittee. Available as a technical preview in CLDR version 44. See [Part 7: Status](tr35-keyboards.md#status).
-
-* [Person Names](tr35-personNames.md#Contents)
- * Added material in [API Implementaion](tr35-personNames.md#api-implementation) on recommended implementation API options.
- * Describe new [parameterDefault Element](tr35-personNames.md#parameterdefault-element) element that specifies default formality and length.
- * Describe new [nativeSpaceReplacement Element](tr35-personNames.md#nativespacereplacement-element) that specifies how spaces should be handled when the name language is the same as the formatting language.
- * In [Modifiers](tr35-personNames.md#modifiers) added the modifiers retain, genitive and vocative.
- * Added sections on [Grammatical Modifiers for Names](tr35-personNames.md#grammatical-modifiers-for-names) and [Future Modifiers](tr35-personNames.md#future-modifiers).
- * Fixed a problem in [Switch the formatting locale if necessary](tr35-personNames.md#switch-the-formatting-locale-if-necessary), where the full formatting locale wasn't being set correctly when the name object has a locale whose script is incompatibility with name script.
- * Rewrote the section on [Setting the spaceReplacement](tr35-personNames.md#setting-the-spacereplacement).
-
 Note that small changes such as typos and link fixes are not listed above.
 Modifications in previous versions are listed in those respective versions.
 Click on **Previous Version** in the header until you get to the desired version.
From 9dc86990e1630405dc2345a1779fe46280b9fca3 Mon Sep 17 00:00:00 2001
From: macchiati
Date: Tue, 2 Apr 2024 13:46:43 -0700
Subject: [PATCH 2/6] CLDR-17491 cleanup
---
 docs/ldml/tr35.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index 38eed33fe0a..a733d3708cc 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -4088,7 +4088,7 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
  * In [Parent_Locales](#Parent_Locales), substantial changes to the way that parentLocales work,
  including a new attribute that avoids needing a long (and fragile) list of language-script codes
  to skip when falling back to root.
- * In [Preprocessing](#preprocessing), restructure the steps for clarity, add more examples.
+ * In [LocaleId Canonicalization:Preprocessing](#preprocessing), restructured the steps for clarity, added more examples.
 * Part 3: [Numbers](tr35-numbers.md#Contents)
  * In [Supplemental Currency Data](tr35-numbers.md#Supplemental_Currency_Data), for the `currency` element,
@@ -4098,7 +4098,7 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
  * In [Mixed Units](tr35-info.md#mixed-units), clarified many aspects of mixed units (such as foot-and-inch),
  including how to handle rounding and precision.
  * In [Testing](tr35-info.html#testing), listed the additional test files.
- * In [Unit Preferences Overrides](tr35-info.html#Unit_Preferences_Overrides), substantial changes including
+ * In [Unit Preferences Overrides](tr35-info.html#Unit_Preferences_Overrides), made substantial changes including
  handling of edge cases, such as where there is no quantity for a unit, or no preference data for a quantity;
  how to handle invalid subtags;
  negative unit amounts;
@@ -4108,7 +4108,7 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
  (or 2, for binary prefixes) that they represent.
 * Part 7: [Keyboards]
- * There are substantial changes from v44 to bring the Keyboard 3.0 specification out of Tech Preview, including:
+ * Added substantial changes from v44 to bring the Keyboard 3.0 specification out of Tech Preview, including:
  * New sections for Definitions, Notation, and Normalization.
  * Many clarifications and modifications in other sections.
From ca68f4c4bcb972132333d64230b66d8501638a14 Mon Sep 17 00:00:00 2001
From: macchiati
Date: Tue, 2 Apr 2024 17:35:20 -0700
Subject: [PATCH 3/6] CLDR-17491 Added notes for CLDR-17251 and CLDR-17217
---
 docs/ldml/tr35.md | 8 ++++++++
 1 file changed, 8 insertions(+)
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index a733d3708cc..7646e7fac14 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -4089,11 +4089,19 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
  including a new attribute that avoids needing a long (and fragile) list of language-script codes
  to skip when falling back to root.
  * In [LocaleId Canonicalization:Preprocessing](#preprocessing), restructured the steps for clarity, added more examples.
+ * In [Likely Subtags](#Likely_Subtags), clarified that language subtags iw, in, and yi are treated specially in the data,
+ to allow for applications that use them as canonical language subtags.
+ Also removed the substitution for macroregions,
+ and noted that some elements could be NOOPs in customized data, but could be misleading.
 * Part 3: [Numbers](tr35-numbers.md#Contents)
  * In [Supplemental Currency Data](tr35-numbers.md#Supplemental_Currency_Data), for the `currency` element,
  added attributes `tz` and `to-tz` to clarify the `from` and `to` dates.
+* Part 4: [Dates](tr35-dates.md#Contents)
+ * In [Date Format Patterns](tr35-dates.md#Date_Format_Patterns), reserved date Pattern field lengths of greater than 16
+ as private use.
+
 * Part 6: [Supplemental](tr35-info.md#Contents)
  * In [Mixed Units](tr35-info.md#mixed-units), clarified many aspects of mixed units (such as foot-and-inch),
  including how to handle rounding and precision.
From ddd595c418194c5aec08569619e27fe59c9c3caf Mon Sep 17 00:00:00 2001
From: macchiati
Date: Tue, 2 Apr 2024 20:55:37 -0700
Subject: [PATCH 4/6] CLDR-17491 Fixed the duplicate entries that Peter noticed.
---
 docs/ldml/tr35.md | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index 7646e7fac14..9108ff56c51 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -4077,17 +4077,14 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
 **Differences from LDML Version 44.1**
 * Part 1: [Core](tr35.md#Contents)
- * In [Parent Locales](#Parent_Locales), added an additional attribute and value `localeRules="nonlikelyScript"`
- to allow for algorithmic handling of inheritance so that mixtures of scripts are avoided.
- For example, preventing `ru_Latn` from falling back to `ru` (which is Cyrillic script).
- Previously, there was a small list of such locales, but they were far from covering all cases.
- That list was retained for migration.
+ * In [Parent Locales](#Parent_Locales), made substantial changes to the way that parentLocales work,
+ including a new attribute for algorithmic handling of inheritance
+ that avoids needing a long (and fragile) list of language-script codes
+ to skip when falling back to root.
+ That list was retained for migration, but will be withdrawn in the future.
  * In [Special Script Codes](#special-script-codes), added a description of special script codes,
  such as Jpan and Aran.
  * In [Lateral Inheritance](#Lateral_Inheritance), improved the formatting for clarity.
- * In [Parent_Locales](#Parent_Locales), substantial changes to the way that parentLocales work,
- including a new attribute that avoids needing a long (and fragile) list of language-script codes
- to skip when falling back to root.
  * In [LocaleId Canonicalization:Preprocessing](#preprocessing), restructured the steps for clarity, added more examples.
  * In [Likely Subtags](#Likely_Subtags), clarified that language subtags iw, in, and yi are treated specially in the data,
  to allow for applications that use them as canonical language subtags.
From 7a6c49a8925082a9da5eb4f0d5a3c1efb8d2b06e Mon Sep 17 00:00:00 2001
From: macchiati
Date: Tue, 2 Apr 2024 20:58:47 -0700
Subject: [PATCH 5/6] CLDR-17491 Missing link on Keyboards in mod section
---
 docs/ldml/tr35.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index 9108ff56c51..b7c5c047071 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -4112,7 +4112,7 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
  * In [Unit Prefixes](unit-prefixes), added the SI unit prefixes and the power of 10
  (or 2, for binary prefixes) that they represent.
-* Part 7: [Keyboards]
+* Part 7: [Keyboards](tr35-keyboards.md#Contents)
  * Added substantial changes from v44 to bring the Keyboard 3.0 specification out of Tech Preview, including:
  * New sections for Definitions, Notation, and Normalization.
  * Many clarifications and modifications in other sections.
From 9676e74a7e3f3f78c9c3cfb421b153a3df4b3d83 Mon Sep 17 00:00:00 2001
From: macchiati
Date: Wed, 3 Apr 2024 08:12:42 -0700
Subject: [PATCH 6/6] CLDR-17491 Noted changes to wellformedness and validity clauses for locale identifiers
---
 docs/ldml/tr35.md | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index b7c5c047071..1918fc4c288 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -4090,6 +4090,9 @@ Other contributors to CLDR are listed on the [CLDR Project Page](https://www.uni
  to allow for applications that use them as canonical language subtags.
  Also removed the substitution for macroregions,
  and noted that some elements could be NOOPs in customized data, but could be misleading.
+ * In [EBNF](#ebnf), added more differences from W3C EBNF,
+ and documented use of wfc: and vc: for wellformedness and validity constraints.
+ Marked clauses with that format where appropriate, and grouped constraints after the relevant EBNF.
 * Part 3: [Numbers](tr35-numbers.md#Contents)
  * In [Supplemental Currency Data](tr35-numbers.md#Supplemental_Currency_Data), for the `currency` element,