CLDR-17566 text diffs and minor changes
chpy04 committed Jul 4, 2024
1 parent c55d0f9 commit 4344afc
Showing 6 changed files with 62 additions and 50 deletions.
@@ -1,4 +1,9 @@
Chinese (and other) calendar support, intercalary months, year cycles
Author Peter Edberg, with info and ideas from many others
Date 2011-11-20 through 2011-11-30, more 2012-01-10
Status Proposal
Feedback to pedberg (at) apple (dot) com
Bugs See list of tickets at the end of this document
Currently the ICU Calendar object has basic support for the Chinese calendar (can determine era, year number, month, etc.). However, real date formatting using this calendar is blocked until CLDR adds necessary support for formatting Chinese calendar dates. In doing this, we need to take into account other calendars that may have similar issues, which we should support in a unified way. The intent here is to provide the minimum change necessary to support the Chinese calendar (and other luni-solar calendars) at the same level as other calendars are currently supported; support for additional special calendar features requiring significant enhancements to the ICU Calendar object (see below) is for future enhancements.
A. Relevant calendar features
Salient features of the Chinese calendar, and related features of other calendars:
@@ -12,7 +17,7 @@ Earthly branches: 子 zǐ, 丑 chǒu, 寅 yín, 卯 mǎo, …
Zodiac animals: 鼠 Rat, 牛 Ox, 虎 Tiger, 兔 Rabbit, …
First years of 60-year cycle: 甲子 jiǎ-zǐ, 乙丑 yǐ-chǒu, 丙寅 bǐng-yín, 丁卯 dīng-mǎo, …
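The way the 60 stem-branch names arise from the two shorter cycles can be sketched as follows. This is only an illustration: the function name `stem_branch_name` is hypothetical, and the full stem and branch lists are filled in from general knowledge, since this excerpt shows only the first few entries.

```python
# Assumption: full lists supplied from general knowledge, not from this excerpt.
STEMS = "甲乙丙丁戊己庚辛壬癸"          # 10 heavenly stems
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"    # 12 earthly branches


def stem_branch_name(year_in_cycle):
    """Return the stem-branch name for a 1-based position in the 60-year cycle."""
    if not 1 <= year_in_cycle <= 60:
        raise ValueError("year_in_cycle must be in 1..60")
    index = year_in_cycle - 1
    # Both cycles advance together; they realign after lcm(10, 12) = 60 steps.
    return STEMS[index % 10] + BRANCHES[index % 12]


if __name__ == "__main__":
    print([stem_branch_name(n) for n in (1, 2, 3, 4)])
```

Because the stem and branch advance in lockstep, only 60 of the 120 possible stem-branch pairings ever occur.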
In principle each cycle can be treated as a separate era. However, such eras are not normally used in formatted dates, leading to potential ambiguity about which date is being represented. Traditionally this ambiguity could be resolved by also displaying a regnal period or regnal year along with the Chinese calendar date. In modern times this ambiguity is normally resolved by always displaying a Chinese calendar date in conjunction with a date (or at least a year) in at least one other calendar. In Taiwan this other calendar is typically the Minguo/ROC calendar; in Japan it is typically the Japanese calendar; in mainland China and elsewhere it is typically the Gregorian calendar (for a format like “y年U年MMMd日” where y is the Gregorian year and U is the stem-branch name). Note that the year transitions of the associated calendar do not occur at the same time as the year transitions of the Chinese calendar.
There are at least two standard conventions for the epoch of the Chinese calendar — i.e. when was year 1 of era 1. Both are associated with the legendary emperor Huangdi 黃帝, hence the "Huangdi era" 黃帝紀元. The most common convention is to use the beginning of Huangdi's reign, commonly specified as 2697 BCE; a somewhat less common convention (and the one used by ICU) is to use the year when he supposedly invented the Chinese calendar, 2637 BCE. Since the latter is 60 years later, the stem-branch names associated with years do not change, but the cycle number is different. For some usages among calendar specialists Chinese calendar years may be numbered continuously from the beginning of the epoch, in which case Gregorian 2012 Jan. 23 is the beginning of Chinese calendar year 4650 or 4710 depending on which convention is used. However this kind of year numbering is not widely known.
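The claim that a 60-year shift of the epoch changes the cycle number but not the stem-branch name can be checked with a small sketch. It assumes that continuous years are 1-based and that year 1 falls in cycle 1; the helper name `split_continuous_year` is hypothetical and is not an ICU API.

```python
CYCLE_LENGTH = 60
EPOCH_SHIFT = 2697 - 2637  # the two epochs are exactly one cycle apart


def split_continuous_year(year):
    """Split a 1-based continuous year into (cycle, year_in_cycle), both 1-based."""
    if year < 1:
        raise ValueError("year must be >= 1")
    cycle, offset = divmod(year - 1, CYCLE_LENGTH)
    return cycle + 1, offset + 1


if __name__ == "__main__":
    later_epoch_year = 100                           # arbitrary illustrative value
    earlier_epoch_year = later_epoch_year + EPOCH_SHIFT
    print(split_continuous_year(later_epoch_year))   # same year_in_cycle ...
    print(split_continuous_year(earlier_epoch_year)) # ... but cycle is one higher
```

Since the position within the cycle determines the stem-branch name, both conventions name every year identically.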
In Chinese the days of the month have special numbering. Days 1-10 use 初一, 初二, … 初十. For days 21-29 the number is formed using 廿 instead of 二十 to indicate 20. The first month is designated 正月 instead of 一月.
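A minimal sketch of this day numbering follows. The text above only specifies days 1-10 and 21-29; the forms used here for days 11-20 and 30 (十一 … 十九, 二十, 三十) are an assumption based on ordinary Chinese numerals, and `chinese_day_name` is a hypothetical helper.

```python
DIGITS = "一二三四五六七八九十"  # numerals for 1..10


def chinese_day_name(day):
    """Return the traditional name for a day of a Chinese calendar month (1..30)."""
    if not 1 <= day <= 30:
        raise ValueError("day must be in 1..30")
    if day <= 10:
        return "初" + DIGITS[day - 1]       # 初一 .. 初十 (from the text above)
    if day < 20:
        return "十" + DIGITS[day - 11]      # assumption: plain numerals 十一 .. 十九
    if day == 20:
        return "二十"                        # assumption
    if day < 30:
        return "廿" + DIGITS[day - 21]      # 廿一 .. 廿九 (from the text above)
    return "三十"                            # assumption


if __name__ == "__main__":
    print(chinese_day_name(1), chinese_day_name(10), chinese_day_name(21))
```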
2. Other calendars related to the Chinese calendar (Japanese, Korean, Vietnamese)
Similar luni-solar calendars are used in Japan, Korea, and Vietnam, with the computations based respectively on meridians near Tokyo, Seoul, and Hanoi. For the Japanese version, the date typically used for disambiguation would be a Japanese calendar date, not a Gregorian date. The Vietnamese calendar uses a different set of animals for the branch names in years, and the marker for intercalary month is inserted *after* the month name, not before.
@@ -54,7 +59,7 @@ E. Current CLDR support
CLDR currently provides the following:
1. yeartype attribute
The yeartype attribute for month name elements allows an alternate month name to be selected for leap years (current legal values are just “standard”—the default—and “leap”). It is only used for the Hebrew calendar, as follows:
<month type="5">Shevat</month>
<month type="6">Adar I</month>
<month type="7">Adar</month>
<month type="7" yeartype="leap">Adar II</month>
This works with the normal MMM+/LLL+ pattern characters for months; the choice of which name to use is managed by ICU date formatting code.
Note that this yeartype month is currently mapped into ICU month name data as the 14th element in the array of Hebrew month names, which seems a bit hacky.
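The selection that the formatting code has to make for yeartype can be sketched as a lookup with a fallback to the default name. This is an illustration only, not ICU's actual implementation; the table layout and the `month_name` helper are hypothetical, and only the four month entries shown above are included.

```python
# Keys are (month type, yeartype); "standard" is the default yeartype.
HEBREW_MONTH_NAMES = {
    (5, "standard"): "Shevat",
    (6, "standard"): "Adar I",
    (7, "standard"): "Adar",
    (7, "leap"): "Adar II",
}


def month_name(month_type, is_leap_year):
    """Pick the leap-year variant of a month name when one exists."""
    if is_leap_year:
        leap_name = HEBREW_MONTH_NAMES.get((month_type, "leap"))
        if leap_name is not None:
            return leap_name
    return HEBREW_MONTH_NAMES[(month_type, "standard")]


if __name__ == "__main__":
    print(month_name(7, False), "/", month_name(7, True))
```

Keying on the pair avoids reserving an extra array slot for the leap variant, which is the part of the current mapping described above as hacky.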
2. special pattern character ‘l’
@@ -68,7 +73,7 @@ F. Proposal
Items 1-2 and 5-8 below are probably do-able for CLDR 21 and ICU 49. The others may come later.
1. ICU behavior for months
The Hebrew model of explicitly numbering all month names and skipping leap months in non-leap years does not work well for calendars like Chinese and Hindu that may insert leap months anywhere (and may combine months, etc.). The use of the UCAL_IS_LEAP_MONTH field is better suited to this.
For choosing the correct month name variant, I had proposed the idea of enhancing the UCAL_IS_LEAP_MONTH field to have 4 values, and adding an enum for these values:
normal month, this is currently value 0 for UCAL_IS_LEAP_MONTH
leap month (for Chinese, this has the same month number as the month before; for Hindu & Tibetan, it has the same number as the month after), this is currently value 1 for UCAL_IS_LEAP_MONTH
normal month after leap month (needed for Hindu & Tibetan); this could be value -1 for UCAL_IS_LEAP_MONTH (it is not a leap month, but does need a special name)
@@ -85,16 +90,15 @@ Current idea
Alongside the <months> element, permit an optional parallel element <monthPatterns> (only present for calendars that need it). The structure under this is similar to that for <months>, except that:
The <monthPatternContext> element's type attribute takes one of three values: "format", "stand-alone", or the added "numeric" (pattern to use with numeric months).
The <monthPatternWidth> element's type attribute can take an additional value "all" for use with the "numeric" context (since there is no width distinction for numeric months).
The <monthPattern> elements can have type "leap", "standardAfterLeap", or "combined"; the value is the pattern used for modifying the month name(s) to indicate that month type.
A Chinese calendar example (marker before the month name) in root:
<monthPatterns>
  <monthPatternContext type="format">
    <monthPatternWidth type="abbreviated"> (default alias to format/wide) </monthPatternWidth>
    <monthPatternWidth type="narrow"> (default alias to stand-alone/narrow) </monthPatternWidth>
    <monthPatternWidth type="wide">
      <monthPattern type="leap">{0}bis</monthPattern>
    </monthPatternWidth>
  </monthPatternContext>
  <monthPatternContext type="stand-alone">
    <monthPatternWidth type="abbreviated"> (default alias to format/abbreviated) </monthPatternWidth>
    <monthPatternWidth type="narrow">
      <monthPattern type="leap">{0}bis</monthPattern>
    </monthPatternWidth>
    <monthPatternWidth type="wide"> (default alias to format/wide) </monthPatternWidth>
  </monthPatternContext>
  <monthPatternContext type="numeric">
    <monthPatternWidth type="all">
      <monthPattern type="leap">{0}bis</monthPattern>
    </monthPatternWidth>
  </monthPatternContext>
</monthPatterns>
And in the Chinese locale:
<monthPatterns>
  <monthPatternContext type="format">
    <monthPatternWidth type="wide">
      <monthPattern type="leap">闰{0}</monthPattern>
    </monthPatternWidth>
  </monthPatternContext>
  <monthPatternContext type="stand-alone">
    <monthPatternWidth type="narrow">
      <monthPattern type="leap">闰{0}</monthPattern>
    </monthPatternWidth>
  </monthPatternContext>
  <monthPatternContext type="numeric">
    <monthPatternWidth type="all">
      <monthPattern type="leap">闰{0}</monthPattern>
    </monthPatternWidth>
  </monthPatternContext>
</monthPatterns>
For other calendars, the <monthPattern> elements above could be replaced by others such as the following:
For the Hebrew calendar, in the Hebrew locale, one could have (for Adar I and II):
<monthPattern type="leap">{0} א׳</monthPattern>
<monthPattern type="standardAfterLeap">{0} ב׳</monthPattern>
For the Hindu calendar, in root (for a combined month, the name will be an affix plus a combination of two month names):
<monthPattern type="leap">adhik {0}</monthPattern>
<monthPattern type="standardAfterLeap">nija {0}</monthPattern>
<monthPattern type="combined">kshay {0}-{1}</monthPattern>
For the time being, at least, I don't think that we need to present this in the Survey Tool, and that may prove too complex and confusing anyway.
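How a formatter might apply these patterns can be sketched as below. This is a rough illustration under stated assumptions: the helper names, the "standard" key for an unmodified month, and the placeholder month names "M1" and "M2" are all hypothetical; only the pattern strings are taken from the examples above.

```python
HINDU_ROOT_PATTERNS = {
    "leap": "adhik {0}",
    "standardAfterLeap": "nija {0}",
    "combined": "kshay {0}-{1}",
}
CHINESE_ZH_PATTERNS = {"leap": "闰{0}"}


def apply_month_pattern(pattern, *month_names):
    """Substitute {0}, {1}, ... in a monthPattern value with month names."""
    result = pattern
    for index, name in enumerate(month_names):
        result = result.replace("{%d}" % index, name)
    return result


def format_month(month_type, patterns, *month_names):
    """Return the plain name for a standard month, else the patterned name."""
    if month_type == "standard":  # hypothetical key for an unmodified month
        return month_names[0]
    return apply_month_pattern(patterns[month_type], *month_names)


if __name__ == "__main__":
    print(format_month("leap", CHINESE_ZH_PATTERNS, "四月"))
    print(format_month("combined", HINDU_ROOT_PATTERNS, "M1", "M2"))
```

Keeping the marker in a pattern rather than in the month names lets a locale place it before or after the name without duplicating every month name.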
3. Month name styles
(mostly about data, some ideas for future structure requirements):
@@ -116,14 +120,14 @@ If it occurs in a pattern it should be ignored.
Option 1, <years> element
(The following was originally agreed in CLDR 2011-11-16; however, it has been superseded by option 2, which was approved on 2011-11-30).
Add a <years> element and sub-elements parallel to the current structure for <months>, <days>, and <quarters>, as follows (with similar structure in ICU):
<years>
  <yearContext type="format">
    <yearWidth type="abbreviated">
      <year type="1">Jia-Zi</year>
      <year type="2">Yi-Chou</year>
      <year type="60">Gui-Hai</year>
    </yearWidth>
    <yearWidth type="narrow"> (defaults to abbreviated) </yearWidth>
    <yearWidth type="wide"> (defaults to abbreviated) </yearWidth>
  </yearContext>
</years>
Only the “format” context would be supported initially; other contexts could be added if needed.
Option 2, <cyclicNames> element
(approved in CLDR meeting 2011-11-30)
As noted above, the cycle of 60 stem-branch names is used for months and days as well as years. Years are also known according to the cycle of 12 zodiac animals associated with the branch portion of the stem-branch name. A cycle of 12 branch names is also used for subdivisions of a day. Thus, it would be beneficial to have a more general representation of such name cycles, even though cyclic names for months, days, and day subdivisions are not part of the current proposal.
In one of his comments on #1507, Philippe Verdy mentions that the cycle of 60 names is also used for some non-calendrical enumerations in Chinese such as measurement of angles, and suggests that data for this should be independent of the calendar structure. These notions are specific to the Chinese locale, and are not notions that CLDR would support across multiple locales (unlike the Chinese calendar, which is supported across multiple locales), so it probably does not make sense to add CLDR structure for them.
The following proposes a way to support cyclic names for years, zodiac mappings, months, days, and dayParts (not really the same as dayPeriods), with the currently-known cycles of length 60 or 12 (for the Chinese, Hindu, and related calendars); this structure would be just below the <calendar> element:
<cyclicNameSets>
  <cyclicNameSet type="years">
    <cyclicNameContext type="format">
      <cyclicNameWidth type="abbreviated">
        <cyclicName type="1">jia-zi</cyclicName>
        <cyclicName type="2">yi-chou</cyclicName>
        …
        <cyclicName type="60">gui-hai</cyclicName>
      </cyclicNameWidth>
      <cyclicNameWidth type="narrow"> (defaults to abbreviated) </cyclicNameWidth>
      <cyclicNameWidth type="wide"> (defaults to abbreviated) </cyclicNameWidth>
    </cyclicNameContext>
  </cyclicNameSet>
  <cyclicNameSet type="months"> (root aliases to years) </cyclicNameSet>
  <cyclicNameSet type="days"> (root aliases to dayParts) </cyclicNameSet>
  <cyclicNameSet type="dayParts"> …data for branch names... </cyclicNameSet>
  <cyclicNameSet type="zodiacs"> (root aliases to dayParts, some locales will supply separate data) </cyclicNameSet>
</cyclicNameSets>
As with the leap month data, this may not be appropriate for the Survey Tool.
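As a rough sketch of how a consumer might read the proposed structure, the following parses a well-formed fragment with Python's standard `xml.etree.ElementTree`. The sample contains only the three names shown in the proposal, the `load_cyclic_names` helper is hypothetical, and alias resolution and width fallback are deliberately not handled.

```python
import xml.etree.ElementTree as ET

# Assumption: a trimmed, well-formed fragment modelled on the proposal above.
SAMPLE = """
<cyclicNameSets>
  <cyclicNameSet type="years">
    <cyclicNameContext type="format">
      <cyclicNameWidth type="abbreviated">
        <cyclicName type="1">jia-zi</cyclicName>
        <cyclicName type="2">yi-chou</cyclicName>
        <cyclicName type="60">gui-hai</cyclicName>
      </cyclicNameWidth>
    </cyclicNameContext>
  </cyclicNameSet>
</cyclicNameSets>
"""


def load_cyclic_names(xml_text, set_type, context="format", width="abbreviated"):
    """Return {position in cycle: name} for one set, context, and width."""
    root = ET.fromstring(xml_text)
    path = (
        "./cyclicNameSet[@type='%s']"
        "/cyclicNameContext[@type='%s']"
        "/cyclicNameWidth[@type='%s']/cyclicName" % (set_type, context, width)
    )
    return {int(elem.get("type")): elem.text for elem in root.findall(path)}


if __name__ == "__main__":
    year_names = load_cyclic_names(SAMPLE, "years")
    print(year_names[1], year_names[60])
```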
7. New pattern character(s)
We would need to add a pattern character to indicate year name. A natural choice is ‘U’ since it is currently unused and ‘u’ is already used for a different year type.
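To illustrate what a pattern using ‘U’ would produce, here is a deliberately simplified formatter. It is not how ICU's SimpleDateFormat works: it ignores field width and quoting, the `format_pattern` helper is hypothetical, and the field values are illustrative inputs rather than computed calendar results.

```python
import re

# Matches a run of one repeated ASCII letter, e.g. "y", "U", or "MMM".
FIELD_RUN = re.compile(r"([A-Za-z])\1*")


def format_pattern(pattern, fields):
    """Replace each run of a pattern letter with the value supplied for it."""
    return FIELD_RUN.sub(lambda match: fields[match.group(1)], pattern)


if __name__ == "__main__":
    # Illustrative values: Gregorian year, stem-branch year name, month, day.
    fields = {"y": "2012", "U": "壬辰", "M": "正月", "d": "1"}
    print(format_pattern("y年U年MMMd日", fields))
```

A real implementation would also need to honor the pattern letter count and quoted literal text.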
@@ -133,7 +137,7 @@ Parsing (month names, year names)... (to be supplied)
ICU4J ChineseDateFormat class, move relevant behaviors into SimpleDateFormat, leaving this as mostly a shell. Remove ChineseDateFormatSymbols use of "isLeapMonth" resource; instead derive the necessary data (needed only for backwards compatibility) from the monthPatterns data.
9. ICU API enhancements
Add a calendar field IS_EXTRA_HOUR or IS_REPEATED_HOUR to disambiguate the hour added/repeated during DST transitions that set the clock back.
Work out how and whether to map the modified month names (for leap month types) onto APIs that get date format symbols — use additional options to specify month symbol types? What about symbols for year names?
Add Calendar API to answer the following questions for a given year and era:
Is it a leap year? And if so…
Of what type - does it adjust days or months?