CLDR-17566 Converting Translation Guide and Downloads (unicode-org#4013)

haytenf · Sep 17, 2024 · a2a0182
1 parent dbd837a
commit a2a0182
Showing 7 changed files with 566 additions and 0 deletions.
diff --git a/docs/site/images/index/APIIntegration.png b/docs/site/images/index/APIIntegration.png
diff --git a/docs/site/images/index/cldrGrowthChart.png b/docs/site/images/index/cldrGrowthChart.png
diff --git a/docs/site/images/index/growth44.png b/docs/site/images/index/growth44.png
diff --git a/docs/site/index/downloads/cldr-43.md b/docs/site/index/downloads/cldr-43.md
diff --git a/docs/site/index/downloads/cldr-44.md b/docs/site/index/downloads/cldr-44.md
diff --git a/docs/site/translation/translation-guide-general/capitalization.md b/docs/site/translation/translation-guide-general/capitalization.md
@@ -0,0 +1,32 @@
+---
+title: Capitalization
+---
+
+# Capitalization
+
+Beginning with CLDR 22, the guidance is that names of items such as languages, regions, calendar and collation types, as well as names of months and weekdays in calendar data and the names of calendar fields, should be capitalized as appropriate for the middle of body text (except possibly for narrow forms; see the note below).
+
+Regarding the capitalization of months and weekdays, please apply middle\-of\-sentence capitalization rules even to stand\-alone items.
+
+**In your language, if month and day names are generally lower case in the middle of a sentence, then please apply this same rule (lower case) to both format and stand\-alone values.**
+
+In your language, if month and day names are generally upper case in the middle of a sentence, then please apply the same rule (upper case) to the stand\-alone values.
+
+The primary reason for having both format and stand\-alone forms is to handle any necessary grammatical distinctions (rather than capitalization distinctions).
+
+- Stand\-alone month names are intended to be used without a day\-of\-month number.
+- Format month names are intended to be used with a day\-of\-month number.
+
+In many languages, that means that the stand\-alone month names should be in nominative form, while the format month names should be in genitive or a related form.
+
+In such languages, date formats reflect this distinction: the format form MMMM is used in a pattern such as “d MMMM y”, and the stand\-alone form LLLL in a pattern such as “LLLL y”.
+
+**Note:** Narrow forms for items such as month and day names are typically too short to reflect differences between grammatical forms. For capitalization purposes, format narrow names should be capitalized according to the normal conventions for their use in running text, and stand\-alone narrow names should be capitalized according to conventions for stand\-alone use.
+
+The new \<contextTransforms\> element indicates how to change the capitalization for use in a menu, or for stand\-alone use such as in the title of a calendar page. The \<contextTransforms\> data cannot currently be edited in the Survey Tool; please file a bug for any necessary changes.
+
+However, it is also important to ensure that there is consistent casing for all of the items in a section. Before making any changes, be sure to get agreement among all the translators for your language; otherwise the capitalization of items in a section may appear random.
+
+To provide warnings when the capitalization of an item differs from what is intended for items in a given category, the Survey Tool now checks capitalization of items against the \<casingData\> within the \<metadata\> element; data for this comes from XML files in the CLDR common/casing/ directory. This data cannot be changed using the Survey Tool; if it is incorrect, please file a bug. The initial data was created based on the predominant capitalization of items in each category within a locale, and may be wrong.
+
+![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
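
The format versus stand-alone distinction described in capitalization.md above can be illustrated with a minimal sketch. It is not CLDR or ICU code: `MONTHS` and `format_date` are hypothetical names, the formatter handles only the space-separated fields `d`, `MMMM`, `LLLL`, and `y`, and the Russian month names are hand-written illustration data rather than values read from CLDR.

```python
# Hypothetical, hand-written month data for one locale (Russian, January only).
# The names are lower case because that is their middle-of-sentence form.
MONTHS = {
    "format": {1: "января"},       # genitive: used with a day-of-month number
    "stand-alone": {1: "январь"},  # nominative: used without a day number
}


def format_date(pattern: str, day: int, month: int, year: int) -> str:
    """Expand a tiny, space-separated subset of date pattern fields."""
    fields = {
        "MMMM": MONTHS["format"][month],       # format month name
        "LLLL": MONTHS["stand-alone"][month],  # stand-alone month name
        "d": str(day),
        "y": str(year),
    }
    # Unknown tokens are passed through unchanged.
    return " ".join(fields.get(token, token) for token in pattern.split(" "))


print(format_date("d MMMM y", 5, 1, 2024))  # 5 января 2024
print(format_date("LLLL y", 5, 1, 2024))    # январь 2024
```

The two patterns select different grammatical forms of the same month, while both keep the same capitalization. That is the point of the guidance: the format and stand-alone forms exist for grammar, not for casing.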
diff --git a/docs/site/translation/translation-guide-general/default-content.md b/docs/site/translation/translation-guide-general/default-content.md
@@ -0,0 +1,38 @@
+---
+title: Default Content
+---
+
+# Default Content
+
+Locales are primarily identified by their ***base*** language, for example English \[en], Arabic \[ar], or German \[de].
+
+Where a language is typically written in more than one script, such as Cyrillic or Latin, we also label the script explicitly. For example, Serbian (Cyrillic) \[sr\_Cyrl] and Serbian (Latin) \[sr\_Latn].
+
+Each language \+ script combination is treated as a unit. (That is, people do not mix different scripts in the same data set.)
+
+If a language is ***not*** typically written in multiple scripts, then the script sub\-tag is omitted. For example, en\_US or ko\_KR.
+
+Locales may also have regional variants. For example, English (US) \[en\_US] vs English (UK) \[en\_GB], or Serbian (Cyrillic, Montenegro) \[sr\_Cyrl\_ME] vs Serbian (Cyrillic, Serbia) \[sr\_Cyrl\_RS]. Regions may be countries such as China \[CN], parts of countries such as Hong Kong \[HK] or multi\-country regions such as Latin America \[419]. Also see [Regional Variants](http://cldr.unicode.org/translation/getting-started/guide#TOC-Regional-Variants-also-known-as-Sub-locales-).
+
+The contents for the base language should be as widely usable (neutral) as possible, but **must be** usable without modification for its *default content locale*: the locale for the language’s *default region*, which is typically the region with the most speakers of the language. A default content locale has no data other than identity information; it inherits all data from its parent.
+
+For example:
+
+- American English \[en\_US] is the default content locale for English \[en].
+- German (Germany) \[de\_DE] is the default content locale for German \[de].
+- Portuguese (Brazil) \[pt\_BR] is the default content locale for Portuguese \[pt].
+- Serbian (Cyrillic) \[sr\_Cyrl] is the default content locale for Serbian \[sr], and Serbian (Cyrillic, Serbia) \[sr\_Cyrl\_RS] is in turn the default content locale for Serbian (Cyrillic) \[sr\_Cyrl].
+- Arabic (World) \[ar\_001] is the default content locale for Arabic \[ar], which is for Modern Standard Arabic.
+
+**Tips for linguists:**
+
+1. Make sure the base language content is correct: it should be as widely usable (neutral) as possible, but must be usable **without** modification in the default content locale.
+2. For example:
+	- English \[en] locale content must be usable for English (US) \[en\_US].
+	- Arabic \[ar] content must be usable for Arabic (World) \[ar\_001].
+3. Make sure that where a sub\-region differs, the differences are represented in the regional\-variant locale.
+4. For example:
+	- Content in Spanish (Mexico) \[es\_MX] that differs from Spanish (Latin America) \[es\_419].
+	- Content in Arabic (Egypt) \[ar\_EG] that differs from Arabic (World) \[ar\_001].
+
+![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
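
The inheritance behavior described in default-content.md above can be sketched roughly as follows. This is a simplified illustration, not the actual CLDR lookup algorithm: `DATA`, `parent`, `lookup`, and `is_default_content` are hypothetical names, the sample strings are invented, and the parent is assumed to be found by dropping the last subtag, whereas real CLDR inheritance has additional rules. The `DEFAULT_CONTENT` pairs are taken from the examples in the document.

```python
from typing import Optional

# Default content pairs from the examples in default-content.md.
DEFAULT_CONTENT = {
    "en": "en_US",
    "de": "de_DE",
    "pt": "pt_BR",
    "sr": "sr_Cyrl",
    "ar": "ar_001",
}

# Hypothetical locale data; the keys and values are invented for illustration.
DATA = {
    "en": {"color": "color", "hello": "Hello"},
    "en_US": {},                  # default content locale: no data of its own
    "en_GB": {"color": "colour"}, # regional variant: stores only differences
}


def parent(locale: str) -> Optional[str]:
    """Assumed simplification: the parent is the locale minus its last subtag."""
    head, sep, _ = locale.rpartition("_")
    return head if sep else None


def lookup(locale: str, key: str) -> str:
    """Walk up the parent chain until some locale supplies the key."""
    current: Optional[str] = locale
    while current is not None:
        value = DATA.get(current, {}).get(key)
        if value is not None:
            return value
        current = parent(current)
    raise KeyError(key)


def is_default_content(locale: str) -> bool:
    """True if the locale is the listed default content locale of its parent."""
    base = parent(locale)
    return base is not None and DEFAULT_CONTENT.get(base) == locale


print(lookup("en_US", "color"))  # color
print(lookup("en_GB", "color"))  # colour
print(lookup("en_GB", "hello"))  # Hello
```

Because `en_US` stores nothing itself, everything it returns comes from `en`, which is why the base language content must be usable without modification for the default content locale. The regional variant `en_GB` stores only the items that differ.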