From 16d21ecb69f049d41c12eb129b7551ef3477af90 Mon Sep 17 00:00:00 2001
From: Mark Davis
Date: Tue, 3 Oct 2023 21:46:33 +0200
Subject: [PATCH] CLDR-16866 Add algorithm for first day of week. (#3303)

* CLDR-16866 Add algorithm for first day of week.

* CLDR-16866 formatting tweaks

* CLDR-16866 Add examples

* CLDR-16866 Fix iso example
---
 docs/ldml/tr35-dates.md | 24 ++++++++++++++++++++++++
 docs/ldml/tr35.md       | 12 +++++++++---
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/docs/ldml/tr35-dates.md b/docs/ldml/tr35-dates.md
index 9abca2e8b4d..24f7313ebe0 100644
--- a/docs/ldml/tr35-dates.md
+++ b/docs/ldml/tr35-dates.md
@@ -1160,6 +1160,30 @@ Each `weekOfPreference` element provides, for its specified locales, an ordered
 | weekOfDate | the week of April 11, 2016 | \\the week of {0}\<… | The date pattern that replaces {0} is determined separately and may use the first day or workday of the week, the range of the full week or work week, etc. |
 | weekOfInterval | the week of April 11–15 | \\the week of {0}\<… | (same comment as above) |
 
+#### First Day Overrides
+
+The calculation of the first day of the week depends on various fields in a locale_identifier, according to the following algorithm. The data in the `firstDay` elements is treated as a map from region to day, with any missing value using the value for 001.
+
+1. If there is a valid `-u-fw-` day value, return that day.
+2. Else if there is a valid `-u-rg-` region value, return that region's firstDay map value.
+3. Else if there is a valid `-u-ca-` calendar value, where that calendar specifies the first day, then return that first day. (Most calendars do not specify the first day.)
+4. Else if there is an explicit region subtag, then return that region's firstDay map value.
+5. Else if there is a valid `-u-sd-` subdivision value, return that region's firstDay map value.
+6. Else if the [Add Likely Subtags](tr35.html#Likely_Subtags) algorithm produces a region, return that region's firstDay map value.
+7. Else return the firstDay map value for 001.
+
+*Example:*
+
+| Locale Identifier | "Winning" subtags | Region |
+|----|----|----|
+|en-AU-u-ca-iso8601-fw-tue-rg-afzzzz-sd-cabc | -fw-tue | n/a, uses Tuesday |
+|en-AU-u-ca-iso8601-rg-afzzzz-sd-cabc | -rg-afzzzz | AF |
+|en-AU-u-ca-iso8601-sd-cabc | -ca-iso8601 | n/a, uses Monday |
+|en-AU-u-sd-cabc | -AU | AU |
+|en-u-sd-cabc | -sd-cabc | CA |
+|en | | US (from likely subtags) |
+|zxx | 001 | (fallback) |
+
 ### Time Data
 
 ```xml
diff --git a/docs/ldml/tr35.md b/docs/ldml/tr35.md
index c958d7ecae0..12559c6a1e6 100644
--- a/docs/ldml/tr35.md
+++ b/docs/ldml/tr35.md
@@ -721,6 +721,7 @@ The BCP 47 form for keys and types is the canonical form, and recommended. Other
 in bcp47/calendar.xml.
 This selects calendar-specific data within a locale used for formatting and parsing, such as date/time symbols and patterns; it also selects supplemental calendarData used for calendrical calculations.
+ The value can affect the computation of the first day of the week: see First Day Overrides.
 "ca"
 (calendar) Calendar algorithm
 
 (For information on the calendar algorithms associated with the data used with these, see [Calendars].)
@@ -799,8 +800,10 @@ The BCP 47 form for keys and types is the canonical form, and recommended. Other
 A Unicode First Day Identifier defines the preferred first day of the week for calendar display. Specifying "fw" in a locale identifier overrides the default value specified by supplemental
- week data for the region (see Part 4 Dates, Week Data). The valid values are those name attribute values in the type elements
- of key name="fw" in bcp47/calendar.xml.
+ week data for the region (see Part 4 Dates, Week Data).
+ The valid values are those name attribute values in the type elements
+ of key name="fw" in bcp47/calendar.xml.
+ The value can affect the computation of the first day of the week: see First Day Overrides.
 "fw" First day of week
@@ -914,6 +917,7 @@ The BCP 47 form for keys and types is the canonical form, and recommended. Other
 "rg" Region Override"uszzzz"

 The value is a unicode_subdivision_id of type “unknown” or “regular”; this consists of a unicode_region_subtag for a regular region (not a macroregion), suffixed either by “zzzz” (case is not significant) to designate the region as a whole, or by a unicode_subdivision_suffix to provide more specificity. For example, “en-GB-u-rg-uszzzz” represents a locale for British English but with region-specific defaults set to US for items such as default currency, default calendar and week data, default time cycle, and default measurement system and unit preferences. The determination of preferred units depends on the locale identifer: the keys ms, mu, rg, the base locale (language, script, region) and the user preferences.
+ The value can affect the computation of the first day of the week: see First Day Overrides.
 For information about preferred units and unit conversion, see Unit Conversion and Unit Preferences.
 …
@@ -922,7 +926,9 @@ The BCP 47 form for keys and types is the canonical form, and recommended. Other
 "sd" Regional Subdivision "gbsct"
- A unicode_subdivision_id, which is a unicode_region_subtag concatenated with a unicode_subdivision_suffix. For example, gbsct is “gb”+“sct” (where sct represents the subdivision code for Scotland). Thus “en-GB-u-sd-gbsct” represents the language variant “English as used in Scotland”. And both “en-u-sd-usca” and “en-US-u-sd-usca” represent “English as used in California”. See 3.6.5 Subdivision Codes.
+ A unicode_subdivision_id, which is a unicode_region_subtag concatenated with a unicode_subdivision_suffix. For example, gbsct is “gb”+“sct” (where sct represents the subdivision code for Scotland). Thus “en-GB-u-sd-gbsct” represents the language variant “English as used in Scotland”. And both “en-u-sd-usca” and “en-US-u-sd-usca” represent “English as used in California”. See 3.6.5 Subdivision Codes.
+ The value can affect the computation of the first day of the week: see First Day Overrides.
+
 …
 A Unicode Sentence Break Suppressions Identifier defines a set of data to be used for suppressing certain sentence breaks that would otherwise be found by UAX #14 rules. The valid values are those name attribute values in the type elements of key name="ss" in bcp47/segmentation.xml.
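
Review note: the seven-step First Day Overrides algorithm added to tr35-dates.md can be sketched in Python as a quick check against the example table. This is only a minimal sketch, not CLDR or ICU code. The tables `FIRST_DAY`, `CALENDAR_FIRST_DAY`, and `LIKELY_REGION` are hypothetical stand-ins holding illustrative sample values (real values come from the `firstDay` elements, the calendar data, and the Add Likely Subtags algorithm), and the parser is deliberately simplified: it handles only the `-u-` extension and checks validity by shape alone.

```python
from typing import Dict, Optional, Tuple

# Illustrative sample values only, NOT authoritative CLDR data.
FIRST_DAY: Dict[str, str] = {"001": "mon", "US": "sun", "CA": "sun", "AF": "sat", "AU": "mon"}
# Hypothetical table of calendars that fix the first day (the patch's table has iso8601 use Monday).
CALENDAR_FIRST_DAY: Dict[str, str] = {"iso8601": "mon"}
# Hypothetical stand-in for the Add Likely Subtags algorithm.
LIKELY_REGION: Dict[str, str] = {"en": "US"}
VALID_DAYS = {"sun", "mon", "tue", "wed", "thu", "fri", "sat"}


def parse(locale: str) -> Tuple[str, Optional[str], Dict[str, str]]:
    """Split a locale into (language, explicit region or None, -u- keywords)."""
    base, _, ext = locale.lower().partition("-u-")
    subtags = base.split("-")
    region = None
    for s in subtags[1:]:
        # A region subtag is two letters or three digits.
        if (len(s) == 2 and s.isalpha()) or (len(s) == 3 and s.isdigit()):
            region = s.upper()
            break
    keywords: Dict[str, list] = {}
    key = None
    for tok in ext.split("-") if ext else []:
        if len(tok) == 2:          # two-character subtags are keys
            key = tok
            keywords[key] = []
        elif key is not None:      # longer subtags are values of the current key
            keywords[key].append(tok)
    return subtags[0], region, {k: "-".join(v) for k, v in keywords.items()}


def region_of(subdivision_id: str) -> str:
    """Region prefix of a unicode_subdivision_id such as 'afzzzz' or 'cabc'."""
    return subdivision_id[:3] if subdivision_id[:3].isdigit() else subdivision_id[:2].upper()


def lookup(region: str) -> str:
    """firstDay map lookup; regions missing from the map use the value for 001."""
    return FIRST_DAY.get(region, FIRST_DAY["001"])


def first_day(locale: str) -> str:
    lang, region, kw = parse(locale)
    if kw.get("fw") in VALID_DAYS:            # step 1: -u-fw-
        return kw["fw"]
    if kw.get("rg"):                          # step 2: -u-rg-
        return lookup(region_of(kw["rg"]))
    if kw.get("ca") in CALENDAR_FIRST_DAY:    # step 3: -u-ca- (only if the calendar fixes a day)
        return CALENDAR_FIRST_DAY[kw["ca"]]
    if region:                                # step 4: explicit region subtag
        return lookup(region)
    if kw.get("sd"):                          # step 5: -u-sd-
        return lookup(region_of(kw["sd"]))
    if lang in LIKELY_REGION:                 # step 6: likely subtags
        return lookup(LIKELY_REGION[lang])
    return FIRST_DAY["001"]                   # step 7: fallback


if __name__ == "__main__":
    for loc in ("en-AU-u-ca-iso8601-fw-tue-rg-afzzzz-sd-cabc", "en-u-sd-cabc", "zxx"):
        print(loc, "->", first_day(loc))
```

With these sample tables, each row of the example table resolves through the step named in its "Winning" subtags column: the first row stops at `-fw-tue`, `en-u-sd-cabc` reaches step 5 and looks up CA, and `zxx` falls through to the value for 001.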