diff --git a/docs/site/covered-by-other-projects.md b/docs/site/covered-by-other-projects.md index c535baa0e69..01fae6cb74f 100644 --- a/docs/site/covered-by-other-projects.md +++ b/docs/site/covered-by-other-projects.md @@ -10,4 +10,3 @@ CLDR covers many different types of data, but not everything. Here are some info |---|---|---|---| | libphonenumber | https://opensource.google.com/projects/libphonenumber | https://unicode-org.atlassian.net/browse/CLDR-188 | Phone Number database | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/ddl.md b/docs/site/ddl.md index 3d94769ce84..04052e88a5e 100644 --- a/docs/site/ddl.md +++ b/docs/site/ddl.md @@ -14,4 +14,3 @@ Contributors for Digitally Disadvantaged Languages (DDL) face unique challenges. The DDL Subcommittee has started to meet every other week as of June, 2023. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development.md b/docs/site/development.md index 6b3107bbb66..3433314d3c4 100644 --- a/docs/site/development.md +++ b/docs/site/development.md @@ -4,4 +4,4 @@ title: Internal Development # Internal Development -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/development/adding-locales.md b/docs/site/development/adding-locales.md index 176974acbed..464a8a5089b 100644 --- a/docs/site/development/adding-locales.md +++ b/docs/site/development/adding-locales.md @@ -40,4 +40,3 @@ Here is an example: https://github.com/unicode-org/cldr/pull/59/files - Commit your work to a branch and create a Pull Request. - The new locale will be included in Smoketest when the PR is merged, and will be in production once a push to production occurs. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/cldr-big-red-switch.md b/docs/site/development/cldr-big-red-switch.md index f25bf979637..68d35af128b 100644 --- a/docs/site/development/cldr-big-red-switch.md +++ b/docs/site/development/cldr-big-red-switch.md @@ -28,4 +28,4 @@ However, names are not automatically entered there, since some people may not wi 2. e\-mail that list **on BCC:** the above message with a subject line of "\[CLDR X.Y Contributor Message]", and a request to please keep the subject line intact. 3. Then, the subject line can be used to filter/locate the contributor requests. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/development/cldr-big-red-switch/generating-charts.md b/docs/site/development/cldr-big-red-switch/generating-charts.md index cc2136b6d94..4de770558bb 100644 --- a/docs/site/development/cldr-big-red-switch/generating-charts.md +++ b/docs/site/development/cldr-big-red-switch/generating-charts.md @@ -64,4 +64,3 @@ The messages that they use are in a file util/data/chart\_messages.html. The rig The key will be zone\_tzid, in this case. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/cldr-development-site.md b/docs/site/development/cldr-development-site.md index 6a99cd9d1cb..0bb27878765 100644 --- a/docs/site/development/cldr-development-site.md +++ b/docs/site/development/cldr-development-site.md @@ -8,14 +8,14 @@ Some of the key pages for developers are: 1. [New CLDR Developers](https://cldr.unicode.org/development/new-cldr-developers) 1. [Maven Setup](https://cldr.unicode.org/development/maven) (for command line & Eclipse) - 1. Obsolete (but may still contain useful nuggets): [Eclipse Setup](https://cldr.unicode.org/development/eclipse-setup) - 2. 
[Eclipse](https://cldr.unicode.org/development/running-survey-tool/building-and-running-the-survey-tool-on-eclipse) (survey tool) + 1. Obsolete (but may still contain useful nuggets): [Eclipse Setup](https://cldr.unicode.org/development/eclipse-setup) + 2. [Eclipse](https://cldr.unicode.org/development/running-survey-tool/building-and-running-the-survey-tool-on-eclipse) (survey tool) 2. [Handling Tickets (bugs/enhancements)](https://cldr.unicode.org/development/development-process) 3. [Updating DTDs](https://cldr.unicode.org/development/updating-dtds) 4. [Editing CLDR Spec](https://cldr.unicode.org/development/editing-cldr-spec) 1. [CLDR: Big Red Switch](https://cldr.unicode.org/development/cldr-big-red-switch) (checklist for release) 5. [Adding a new locale to CLDR](https://cldr.unicode.org/development/adding-locales) - + The subpages listed give more information on internal CLDR development. See also: [Sitemap](https://sites.google.com/site/cldr/system/app/pages/sitemap/hierarchy). @@ -27,4 +27,3 @@ style="\[^"\]\*" Also see the [Google Docs to Markdown extension, by edbacher](https://workspace.google.com/marketplace/app/docs_to_markdown/700168918607) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/cldr-development-site/running-cldr-tools.md b/docs/site/development/cldr-development-site/running-cldr-tools.md index 044d4317b96..ec0e46529b8 100644 --- a/docs/site/development/cldr-development-site/running-cldr-tools.md +++ b/docs/site/development/cldr-development-site/running-cldr-tools.md @@ -20,7 +20,7 @@ You will need to include some options to run various programs. Here are some sam \-Dregistry\=language\-subtag\-registry -\-DSHOW\_FILES +\-DSHOW\_FILES The xmx is to increase memory so that you don't blow up. If you only do a few dozen locales, you don't need to set it that high. @@ -31,4 +31,3 @@ The xmx is to increase memory so that you don't blow up. 
If you only do a few do \-DSHOW\_FILES // shows files being opened and created -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/cldr-development-site/updating-englishroot.md b/docs/site/development/cldr-development-site/updating-englishroot.md index 74e184e3da4..21fcfdeeb0b 100644 --- a/docs/site/development/cldr-development-site/updating-englishroot.md +++ b/docs/site/development/cldr-development-site/updating-englishroot.md @@ -22,11 +22,11 @@ The tool is in tools/java/org/unicode/cldr/tool/GenerateBirth.java. It requires 2. The archive directory should have the latest version of every major and minor version (where versions before 21\.0 have the major version split across the top two fields). 3. You will probably need to modify both CldrVersion.java and ToolConstants.java to bring them up to date. -**log (set with \-l \, default\=CldrUtility.UTIL\_DATA\_DIR, set with CLDR\_DIR** +**log (set with \-l \, default\=CldrUtility.UTIL\_DATA\_DIR, set with CLDR\_DIR** Pass an argument for \-t to specify the output directory. Takes a few minutes to run (and make sure you have set Java with enough memory)! -The tool generates (among other things) the following two binary files (among others) in the output directory specified with \-t: +The tool generates (among other things) the following two binary files (among others) in the output directory specified with \-t: - **outdated.data** - **outdatedEnglish.data** @@ -60,7 +60,7 @@ Make sure TestOutdatedPaths.java passes. It may take some modifications, since i Run TestCheckCLDR and TestBasic with the option **\-prop:logKnownIssue\=false** (that option is important!). This checks that the Limited Submission is set up properly and that SubmissionLocales are correct. - + If you run into any problems, look below at debugging. @@ -72,7 +72,7 @@ Eg https://github.com/unicode-org/cldr/pull/243 It also generates readable log files for double checking. 
These will be in {workspace}/cldr\-aux/births/\/, that is: CLDRPaths.AUX\_DIRECTORY \+ "births/" \+ trunkVersion. Examples: https://unicode.org/repos/cldr-aux/births/35.0/en.txt, https://unicode.org/repos/cldr-aux/births/35.0/fr.txt. -Their format is the following (TSV \= tab\-delimited\-values) — to view, it is probably easier to copy the files into a spreadsheet. +Their format is the following (TSV \= tab\-delimited\-values) — to view, it is probably easier to copy the files into a spreadsheet. - English doesn't have the E... values, but is a complete record. - Other languages only have lines where the English value is more recently changed (younger) than the native’s. @@ -94,4 +94,3 @@ Their format is the following (TSV \= tab\-delimited\-values) — to view, it is A value of � indicates that there is no value for that version. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/coding-cldr-tools/documenting-cldr-tools.md b/docs/site/development/coding-cldr-tools/documenting-cldr-tools.md index d206f6cbc8f..d725333ebae 100644 --- a/docs/site/development/coding-cldr-tools/documenting-cldr-tools.md +++ b/docs/site/development/coding-cldr-tools/documenting-cldr-tools.md @@ -43,4 +43,3 @@ Additional parameters: Assuming your tools’s alias is *myalias,* create a new subpage with the URL http://cldr.unicode.org/tools/myalias (a subpage of [CLDR Tools](https://cldr.unicode.org/development/cldr-tools)). Fill this page out with information about how to use your tool. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/creating-the-archive.md b/docs/site/development/creating-the-archive.md index 8909967950c..89726127743 100644 --- a/docs/site/development/creating-the-archive.md +++ b/docs/site/development/creating-the-archive.md @@ -43,10 +43,10 @@ A number of the tools in CLDR depend on access to older versions. 
These tools in 4. Now, run the tool **org.unicode.cldr.tool.CheckoutArchive** - Or from the command line:
**mvn \-DCLDR\_DIR\=** *path\_to/cldr* **\-\-file\=tools/pom.xml \-pl cldr\-code compile \-DskipTests\=true exec:java \-Dexec.mainClass\=org.unicode.cldr.tool.CheckoutArchive  \-Dexec.args\=""** - - Note other options for this tool: -   *\-\-help* will give help -   *\-\-prune* will run a 'git workspace prune' before proceeding -   *\-\-echo* will just show the commands that would be run, without running anything + - Note other options for this tool: +   *\-\-help* will give help +   *\-\-prune* will run a 'git workspace prune' before proceeding +   *\-\-echo* will just show the commands that would be run, without running anything (For example,  **\-Dexec.args\="\-\-prune"** in the above command line) The end result (where you need all of the releases) looks something like the following: @@ -58,4 +58,3 @@ The end result (where you need all of the releases) looks something like the fol - You can set the property  **\-DCLDR\_ARCHIVE** to point to a different parent directory for the archive - You can set **\-DCLDR\_HAS\_ARCHIVE\=false** to tell unit tests and tools not to look for the archive -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process.md b/docs/site/development/development-process.md index d48ae78ae53..fb91d86899f 100644 --- a/docs/site/development/development-process.md +++ b/docs/site/development/development-process.md @@ -166,4 +166,4 @@ If there is a test failure that is due to a bug that cannot be fixed right now ( 1. The future folder tickets are moved to the discuss folder 2. Unscheduled tickets (with no release number) are re\-evaluated. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/development/development-process/design-proposals.md b/docs/site/development/development-process/design-proposals.md index 43874f0ebb3..b065233d9d5 100644 --- a/docs/site/development/development-process/design-proposals.md +++ b/docs/site/development/development-process/design-proposals.md @@ -114,4 +114,3 @@ In each proposal, please add a header and a TOC if it is longer than a page. You [XMB](https://cldr.unicode.org/development/development-process/design-proposals/xmb) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/alternate-time-formats.md b/docs/site/development/development-process/design-proposals/alternate-time-formats.md index cfc288219d1..75fbdb9222a 100644 --- a/docs/site/development/development-process/design-proposals/alternate-time-formats.md +++ b/docs/site/development/development-process/design-proposals/alternate-time-formats.md @@ -9,7 +9,7 @@ This design proposal is intended to solve the problem that sometimes the desired \ The numbers attribute is used to indicate that numeric quantities in the pattern are to be rendered using a numbering system other than then default numbering system defined for the given locale. The attribute can be in one of two forms. If the alternate numbering system is intended to apply to ALL numeric quantities in the pattern, then simply use the numbering system ID as found in Section C.13 [Numbering Systems](http://www.unicode.org/reports/tr35/#Numbering_Systems). To apply the alternate numbering system only to a single field, the syntax "\=\" can be used one or more times, separated by semicolons. 
- + Examples: \dd/mm/yyyy\ @@ -28,4 +28,3 @@ Examples: In addition to the syntax, allow symbol or string replacements of the form "\=\=\" -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/bcp-47-changes-draft.md b/docs/site/development/development-process/design-proposals/bcp-47-changes-draft.md index fa7afbd59f7..c26c3eeaf03 100644 --- a/docs/site/development/development-process/design-proposals/bcp-47-changes-draft.md +++ b/docs/site/development/development-process/design-proposals/bcp-47-changes-draft.md @@ -72,7 +72,7 @@ Macrolanguage Table | Kurdish 'ku' | Northern Kurdish 'kmr'? | We probably want to change the default content locale to ku-Latn | | Akan ' ak ' | Twi ' tw ' and Fanti ' fat' | This appears to be a mistake in ISO 639. See: ISO 636 Deprecation Requests . | | Persian fas (fa) | Western Farsi pes and prs Dari | This appears to be a mistake in ISO 639. See: ISO 636 Deprecation Requests . | - + These would also go into the \ element of the supplemental metadata. We may add more such aliases over time, as we find new predominant forms. Note that we still need to offer both aliases for translation in many cases. For example, we want to show both 'no' and 'nb'. ## Lenient Parsing @@ -90,7 +90,7 @@ There are many circumstances where we get less than perfect language identifiers 2. Remove the base language 3. This avoids having to store which languages are also extlangs, and what their prefixes are. -People have to do #1. We should recommend #2, and make it easy to support #3. +People have to do #1. We should recommend #2, and make it easy to support #3. 
See demo at [http://unicode.org/cldr/utility/languageid.jsp](http://unicode.org/cldr/utility/languageid.jsp) @@ -211,4 +211,3 @@ The languages are listed in the format Abkhazian [ab]-OR, where [xx] is the code - Yao [yao]-U, Yiddish [yi]-U, Yupik Language [ypk]-U - Zande [znd]-U, Zapotec [zap]-U, Zaza [zza]-U, Zenaga [zen]-U, Zuni [zun]-U -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/bcp47-syntax-mapping.md b/docs/site/development/development-process/design-proposals/bcp47-syntax-mapping.md index 12c0646299b..de224750715 100644 --- a/docs/site/development/development-process/design-proposals/bcp47-syntax-mapping.md +++ b/docs/site/development/development-process/design-proposals/bcp47-syntax-mapping.md @@ -215,4 +215,3 @@ The current CVS snapshot implementation uses CSS3 names. This proposal changes a CLDR uses Olson tzids. These IDs are usually made from \+"/"+\ and relatively long. To satisfy the syntax requirement discussed in this document, we need to map these IDs to relatively short IDs uniquely. The UN LOCODE is designed to assign unique location code and it satisfies most of the requirement. A LOCODE consists from 2 letter ISO country code and 3 letter location code. This proproposal suggest that a 5 letter LOCODE is used as a short time zone ID if examplar city has a exact match in LOCODE repertoire. Some Olson tzids do not have direct mapping in LOCODE. In this case, we assign our own codes to them, but using 3-4/6-8 letter code to distinguish them from LOCODE. For Olson tzid Etc/GMT\*, this proposal suggest "UTC" + ["E" | "W"] + nn (hour offset), for example, UTCE01 means 1 hour east from UTC (Etc/GMT-1). The proposed short ID list is attached in this [document](https://drive.google.com/file/d/1O9B_hO6uD4m7dtb-hU9euBkgP8nQxJ9X/view?usp=sharing). 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/bcp47-validation-and-canonicalization.md b/docs/site/development/development-process/design-proposals/bcp47-validation-and-canonicalization.md index 62239abcbb1..4ecf26c8733 100644 --- a/docs/site/development/development-process/design-proposals/bcp47-validation-and-canonicalization.md +++ b/docs/site/development/development-process/design-proposals/bcp47-validation-and-canonicalization.md @@ -31,7 +31,7 @@ We also provide data for validation and canonicalization. The basic canonicaliza 1. We canonicalize the case, with variants getting uppercase, so en\_foobar => en\_FOOBAR 2. We alphabetize the variants so that irrelevant differences in order don't cause problems, so en-FOOBAR-ABCDE => en\_ABCDE\_FOOBAR - - Note: the uppercasing of variants is for compatibility, since the basis for the CLDR work predated BCP47. + - Note: the uppercasing of variants is for compatibility, since the basis for the CLDR work predated BCP47. Data for doing the preferred value mapping is in the supplemental data, extracted from the IANA registry. 
@@ -145,4 +145,3 @@ Here is the data that they replace: \ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/bidi-handling-of-structured-text.md b/docs/site/development/development-process/design-proposals/bidi-handling-of-structured-text.md index e6100a0ebbf..8a6bc445e10 100644 --- a/docs/site/development/development-process/design-proposals/bidi-handling-of-structured-text.md +++ b/docs/site/development/development-process/design-proposals/bidi-handling-of-structured-text.md @@ -357,4 +357,3 @@ The following rules in ar.xml ( Arabic ): \ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/change-to-sites.md b/docs/site/development/development-process/design-proposals/change-to-sites.md index 364b5fb31ee..c9ee9f94423 100644 --- a/docs/site/development/development-process/design-proposals/change-to-sites.md +++ b/docs/site/development/development-process/design-proposals/change-to-sites.md @@ -79,4 +79,3 @@ Possible Bug - needs investigation. ![image](../../../images/design-proposals/site_bug.png) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/chinese-and-other-calendar-support-intercalary-months-year-cycles.md b/docs/site/development/development-process/design-proposals/chinese-and-other-calendar-support-intercalary-months-year-cycles.md index f99b75f25a8..5de16d1889b 100644 --- a/docs/site/development/development-process/design-proposals/chinese-and-other-calendar-support-intercalary-months-year-cycles.md +++ b/docs/site/development/development-process/design-proposals/chinese-and-other-calendar-support-intercalary-months-year-cycles.md @@ -74,7 +74,7 @@ Months are numbered 0-11 (the zero-based value of UCAL\_MONTH). 
When an intercal For purposes of add and set operations, month is treated as a tuple represented by UCAL\_MONTH and UCAL\_IS\_LEAP\_MONTH. If UCAL\_IS\_LEAP\_MONTH is 0 for a month that has a leap month following, then adding 1 month, or setting UCAL\_IS\_LEAP\_MONTH to 1, sets the calendar to the leap month (which has the same value for UCAL\_MONTH). If a month does not have a leap month following, then a set of UCAL\_IS\_LEAP\_MONTH to 1 is ignored. -Years are numbered 1-60 (the value of UCAL\_YEAR) for each 60-year cycle. The era is incremented for each 60-year cycle, so we are currently in era 78. +Years are numbered 1-60 (the value of UCAL\_YEAR) for each 60-year cycle. The era is incremented for each 60-year cycle, so we are currently in era 78. Current ICU4C formatting for the Chinese calendar is completely broken. For example, the short date format in root and zh is currently “y'x'G-Ml-d”; the result this produces for Chinese era 78, year 29, month 4 (non-leap or leap), day 2 is “29x-4-”: There is no era value or leap month indicator, and non-literal fields after the ‘l’ pattern character are skipped. @@ -86,7 +86,7 @@ In a non-leap year, months run 0-4 (for months Tishri-Shevat), skip 5 (“Adar I ### 3. Coptic and Ethiopic calendars -Months are numbered 0-12. +Months are numbered 0-12. ### 4. 
Other calendars listed above @@ -303,4 +303,3 @@ New tickets related to this, which supersede the above, are: - ICU [#9044](http://bugs.icu-project.org/trac/ticket/9044): Chinese cal dates can't always be parsed - document & fix tests - ICU [#9055](http://bugs.icu-project.org/trac/ticket/9055): Integrate Chinese cal pattern updates (cldrbug 4237), update tests -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/consistent-casing.md b/docs/site/development/development-process/design-proposals/consistent-casing.md index f75f3c7e097..38ab0d1da76 100644 --- a/docs/site/development/development-process/design-proposals/consistent-casing.md +++ b/docs/site/development/development-process/design-proposals/consistent-casing.md @@ -154,4 +154,3 @@ I have attached "CasingContextsV2.pdf" which fixes the calendar menu example (th Updated to "[CasingContextsV3.pdf](https://drive.google.com/file/d/1mvXlCSPhU87nl9owW_ZeCYHy_pJ-RqHL/view?usp=sharing)" which adds an overall explanation of the purpose of this document as well as instructions for localizers to provide feedback. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/coverage-revision.md b/docs/site/development/development-process/design-proposals/coverage-revision.md index 8a3a4233ad1..8937f0c2547 100644 --- a/docs/site/development/development-process/design-proposals/coverage-revision.md +++ b/docs/site/development/development-process/design-proposals/coverage-revision.md @@ -62,4 +62,3 @@ So with this in mind, I would like to propose the following structure to be adde Finding the appropriate coverage level value would then be a matter of searching the coverageLevel entries in numeric order by value looking for a match of the path vs. "//ldml/" + "regular expression". 
In other words, we would not specifically include "//ldml" in the expressions, since they would all start with that. Once a given xpath's coverage level value was determined, it shouldn't be too hard for us to simply filter out fields whose coverage level was higher then the requested. I suppose that we will need some wildcards similar to what Mark has started working on in his path filtering proposal. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/currency-code-fallback.md b/docs/site/development/development-process/design-proposals/currency-code-fallback.md index b240c511125..18527db90d7 100644 --- a/docs/site/development/development-process/design-proposals/currency-code-fallback.md +++ b/docs/site/development/development-process/design-proposals/currency-code-fallback.md @@ -31,4 +31,3 @@ I'm leaning towards #1, just for simplicity. However, see also: http://www.unicode.org/cldr/bugs/locale-bugs?findid=2244 -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/day-period-design.md b/docs/site/development/development-process/design-proposals/day-period-design.md index 62e402655a2..e8696821505 100644 --- a/docs/site/development/development-process/design-proposals/day-period-design.md +++ b/docs/site/development/development-process/design-proposals/day-period-design.md @@ -238,4 +238,3 @@ Code changes - We need to have a chart for the dayPeriodRules, like we do for plurals. - Add invariant testing in CheckAttributes. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/delimiter-quotation-mark-proposal.md b/docs/site/development/development-process/design-proposals/delimiter-quotation-mark-proposal.md index b48468b593e..f4a374605fd 100644 --- a/docs/site/development/development-process/design-proposals/delimiter-quotation-mark-proposal.md +++ b/docs/site/development/development-process/design-proposals/delimiter-quotation-mark-proposal.md @@ -10,7 +10,7 @@ The following is a proposal for how to handle the delimiter issues in CLDR, rais There are qute a number of problems in our delimiter data. Most of these are in the ‘lesser vetted’ languages, but some are in major languages. Problems include use of ASCII quotes, and obvious inconsistencies in data. -The goal for the v49 release is to at least have consistent data, then have the translators look at the changes in the data submission phase for v50. +The goal for the v49 release is to at least have consistent data, then have the translators look at the changes in the data submission phase for v50. **Data Cleanup.** I went through the data, and cleaned up obvious problems (such as a generic ASCII quote on one side, and a curly quote on the other). I then compared it to the Wikipedia data where available; if there was a difference between CLDR and Wikipedia, I looked for original sources on the web. It is often a bit tricky, since there are be variant practices in many languages. The goal is to have it to be the most customary usage, but often it may not be clear which variant predominates. @@ -21,7 +21,7 @@ Note that for a great many locales (especially African ones), there isn’t much 1. Use the ASCII ugly quotes 2. Reverse the two, which would be correct with all BIDI or with correct markup. -Personally, I lean towards #2. +Personally, I lean towards #2. 
The first sheet has recommendations for changing; the others are just scratch sheets. @@ -32,4 +32,3 @@ The first sheet has recommendations for changing; the others are just scratch sh [**Quotation (Delimiter) Proposal**](https://docs.google.com/spreadsheets/d/1_7vjBSmjlmevIQfpM4xX1yMGYQdibd6h8PTrQEPzzkQ/edit?gid=2#gid=2) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/english-inheritance.md b/docs/site/development/development-process/design-proposals/english-inheritance.md index b0a08285515..b01de274d0f 100644 --- a/docs/site/development/development-process/design-proposals/english-inheritance.md +++ b/docs/site/development/development-process/design-proposals/english-inheritance.md @@ -36,7 +36,7 @@ A few releases back, we created the en\_001 locale in CLDR, that was intended to - Use en\_001 as the basis for translation in the CLDR Survey Tool instead of en. - Make en\_CA (English - Canada) inherit from en\_001 instead of en. The en\_CA locale should be reviewed to make sure that items that previously were correctly inherited from "en" (such as well understood time zone abbreviations ) are copied into the en\_CA locale. - Make sure that proper time formats are in place for en\_XX locales, where XX is any country where 12 hour clock is customary according to CLDR's supplemental data. -- Review the inheritance table (below), making any necessary adjustments. It has been suggested that en\_ZA and en\_ZW should inherit from en\_GB instead of en\_001. +- Review the inheritance table (below), making any necessary adjustments. It has been suggested that en\_ZA and en\_ZW should inherit from en\_GB instead of en\_001. 
### Reference: English locales and inheritance: @@ -134,4 +134,3 @@ A few releases back, we created the en\_001 locale in CLDR, that was intended to | en_ZM | Zambia | en_001 | | en_ZW | Zimbabwe | en_001 | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md b/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md index 9b6131228fa..11ed1935ae5 100644 --- a/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md +++ b/docs/site/development/development-process/design-proposals/european-ordering-rules-issues.md @@ -8,7 +8,7 @@ The European ordering rules feature is a new collation feature in CLDR which att A copy of a near-final draft (FprEN 13710:2010) is available to Unicode members in the [UTC document register](http://www.unicode.org/L2/L-curdoc.htm) (L2/14-143). -This document describes current issues in an attempt for us to have a clear picture of what level of EOR support will be contained within CLDR 21. +This document describes current issues in an attempt for us to have a clear picture of what level of EOR support will be contained within CLDR 21. Current Status @@ -35,7 +35,7 @@ Questions 1. Which EOR base to use? If EN13710 needs revisions, how do we make that happen? 2. Should we use Kent's modified rules as attached to http://unicode.org/cldr/trac/ticket/763 ? 3. What locales should provide EOR based tailorings? -4. Need to add EOR to BCP47. +4. Need to add EOR to BCP47. Choices @@ -43,4 +43,3 @@ Choices 2. Put in a fixed version ourselves 3. Put in the "stock" version, knowing about the problems. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/extended-windows-olson-zid-mapping.md b/docs/site/development/development-process/design-proposals/extended-windows-olson-zid-mapping.md index b8ee6a21531..a6a9bcb2b3f 100644 --- a/docs/site/development/development-process/design-proposals/extended-windows-olson-zid-mapping.md +++ b/docs/site/development/development-process/design-proposals/extended-windows-olson-zid-mapping.md @@ -4,7 +4,7 @@ title: Extended Windows-Olson zid mapping # Extended Windows-Olson zid mapping -**This proposal was approved by the CLDR TC on 2012-01-11 with some minor updates. See update comments.** +**This proposal was approved by the CLDR TC on 2012-01-11 with some minor updates. See update comments.** ## Background @@ -70,7 +70,7 @@ metaZones.xml already contains multiple mapping per single meta zone by region l   \ -So we could use the same scheme for representing Windows-Olson mapping. For example, mapping data for Windows "Central America Standard Time" could be represented as below - +So we could use the same scheme for representing Windows-Olson mapping. For example, mapping data for Windows "Central America Standard Time" could be represented as below -   \ @@ -100,7 +100,7 @@ Design Option 1 - New attribute to indicate global/regional default Adding a new attribute "defaultfor" to \ element. The value of "defaultfor" attribute is either "all" or "region" -For example, the mapping data for Windows time zone (UTC-05:00) Esstern Time (US & Canada) (ID: Eastern Standard Time) look like below - +For example, the mapping data for Windows time zone (UTC-05:00) Esstern Time (US & Canada) (ID: Eastern Standard Time) look like below -   \ @@ -305,4 +305,3 @@ ICU Time Zone Data We currently generate an ICU resource file from supplemental/windowsZones.xml. 
The mapping data in the current form (1-to-1 map) is used by ICU4C to detect the default system time zone on Windows platform. This implementation has been there for several releases. When a new Olson time zone data version is published, ICU team ships updated data to ICU users, including the mapping data generated from windowsZones.xml. We want to use the same resource for past ICU releases, we cannot change the current ICU resource format. Therefore, LDML2ICUConverter must filter non-default mappings once windowsZones.xml is updated. For future ICU use, LDML2ICUConverter may generate two tables, one in the current format, another for additional mappings. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/fractional-plurals.md b/docs/site/development/development-process/design-proposals/fractional-plurals.md index 14d948f359f..56cd37b4804 100644 --- a/docs/site/development/development-process/design-proposals/fractional-plurals.md +++ b/docs/site/development/development-process/design-proposals/fractional-plurals.md @@ -8,4 +8,3 @@ title: Fractional Plurals [Fractional Plurals Design Doc](https://docs.google.com/document/d/155ZJOHtOgnm8P80TDL8QGfNZ-wNoqsRNRGHRfJB4JGs/edit?usp=sharing) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/generic-calendar-data.md b/docs/site/development/development-process/design-proposals/generic-calendar-data.md index c18ab8ec946..ea1a1bf4f46 100644 --- a/docs/site/development/development-process/design-proposals/generic-calendar-data.md +++ b/docs/site/development/development-process/design-proposals/generic-calendar-data.md @@ -80,7 +80,7 @@ Based on the discussion above, the following changes/suggestions are proposed fo 1. 
"gregorian" formats should change from "EGyMd Hms" order to "GyMdE Hms" order (and same for formats that use subsets of this). Note that this would be more consistent with the "d E" order already used for the Ed skeleton. 2. "generic" formats should use "EdMyG Hms" order and subsets thereof ("chinese" formats should use "EdMU Hms" etc.). 3. "generic" should provide non-numeric wide/abbreviated weekday names, probably "sunday"/"sun".."saturday"/"sat". -4. If "generic" provides numeric strings for e.g. narrow weekday names, it should probably use "1" for Monday to be consistent with ISO 8601. +4. If "generic" provides numeric strings for e.g. narrow weekday names, it should probably use "1" for Monday to be consistent with ISO 8601. 5. "gregorian" should provide non-numeric wide/abbreviated month names (it inherits the weekday names from "generic"). These could be e.g. "month1".."month12"/"mo1".."mo12" or "january".."december"/"jan".."dec". This will vastly improve legibility of some formats. 6. "generic" should have generic but non-numeric wide/abbreviated month and era names, e.g. "month1"/"mo1".."month12"/"mo12", "era0".."era1". 7. "chinese" should also provide generic but non-numeric month names. 
@@ -100,4 +100,3 @@ Based on the TC discussion: - [#5421](http://unicode.org/cldr/trac/ticket/5421), Fix era positions - [#5490](http://unicode.org/cldr/trac/ticket/5490), Clean up stock date/time formats -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/grammar-capitalization-forms-for-datetime-elements-and-others.md b/docs/site/development/development-process/design-proposals/grammar-capitalization-forms-for-datetime-elements-and-others.md index b4c564d2284..206766d29a4 100644 --- a/docs/site/development/development-process/design-proposals/grammar-capitalization-forms-for-datetime-elements-and-others.md +++ b/docs/site/development/development-process/design-proposals/grammar-capitalization-forms-for-datetime-elements-and-others.md @@ -112,7 +112,7 @@ If no context-based name transforms are needed, the \ element My initial thought was to include these elements (as many as necessary) inside each relevant name element: \, \, \, etc. As an example for Czech: - \ \ … \ \titlecase-firstword\ \ \ \ + \ \ … \ \titlecase-firstword\ \ \ \ This would involve additions to the DTD everywhere we wanted to add these, which is a bit cumbersome. 
An initial list of where these should be added: @@ -185,11 +185,10 @@ Table body cells, key for desired casing (upper section of cell for language): L Table body cells, color code: - + Here is the table: - - -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file + + diff --git a/docs/site/development/development-process/design-proposals/grapheme-usage.md b/docs/site/development/development-process/design-proposals/grapheme-usage.md index 4a7a0118cfa..03cc60bff5e 100644 --- a/docs/site/development/development-process/design-proposals/grapheme-usage.md +++ b/docs/site/development/development-process/design-proposals/grapheme-usage.md @@ -66,4 +66,3 @@ And to the new 'function-based' breaks: - #[2406](http://unicode.org/cldr/trac/ticket/2406), Add locale keywords to specify the type (or variant) of word & grapheme break (pedberg, 2.0) - There is also the suggestion to add another type which is beyond the scope of CLDR - a cluster type that treats ligatures as single clusters. This depends on font behavior. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/hebrew-months.md b/docs/site/development/development-process/design-proposals/hebrew-months.md index 394a9832305..51a15f3a455 100644 --- a/docs/site/development/development-process/design-proposals/hebrew-months.md +++ b/docs/site/development/development-process/design-proposals/hebrew-months.md @@ -12,7 +12,7 @@ Shevat = month 5, Adar = month 6, Nisan = Month 7 while in a leap year: -Shevat = month 5, Adar I = month 6, Adar II = month 7, and Nisan = month 8. +Shevat = month 5, Adar I = month 6, Adar II = month 7, and Nisan = month 8. According to Wikipedia, "Adar II" in a leap year is the "real" Adar, and "Adar I" is considered to be the "extra" month. @@ -90,8 +90,7 @@ a). It is only a one line change from the existing data, which means minimal dis b). 
It is technically more accurate according to the Wikipedia, since "Adar II" in a leap year is considered the equivalent month as "Adar" in a non-leap year. That is to say, "Adar II" is the "real" Adar, not "Adar I". -c). Calendaring applications have a relatively easy way to go through the data in numeric order. In a non-leap year, just use 1-5 and 7-12. In a leap year, use 1-6, + 7 alt + 8-12. +c). Calendaring applications have a relatively easy way to go through the data in numeric order. In a non-leap year, just use 1-5 and 7-12. In a leap year, use 1-6, + 7 alt + 8-12. The new attribute "yeartype" was chosed as opposed to using "alt", since ICU's build process excludes all "@alt" data by default. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/index-characters.md b/docs/site/development/development-process/design-proposals/index-characters.md index 50acb476d32..1c92c0abdc6 100644 --- a/docs/site/development/development-process/design-proposals/index-characters.md +++ b/docs/site/development/development-process/design-proposals/index-characters.md @@ -89,7 +89,7 @@ The indexLabel is used to display characters (if it is available). That is, when Note that the indexLabels can be used both with contiguous ranges and non-contiguous ranges. 
For German we might have [A-S Sch Sci St Su T-Z] as the index characters, and the following labels: - + \S\ @@ -134,4 +134,3 @@ Where multiple character sequences sort the same at a primary level, the automat *WARNING: the automatic generation would only be a draft, for translators to tune, so any shortcomings could be fixed.* -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/islamic-calendar-types.md b/docs/site/development/development-process/design-proposals/islamic-calendar-types.md index 3cf9dfb383e..582008185d8 100644 --- a/docs/site/development/development-process/design-proposals/islamic-calendar-types.md +++ b/docs/site/development/development-process/design-proposals/islamic-calendar-types.md @@ -158,4 +158,3 @@ Note that we may get requests for some other calendar types/variations such as: - Turkish variant of Islamic calendar - Other regional variants of Islamic calendar -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/iso-636-deprecation-requests-draft.md b/docs/site/development/development-process/design-proposals/iso-636-deprecation-requests-draft.md index 8a4a1dfd952..ba16add846d 100644 --- a/docs/site/development/development-process/design-proposals/iso-636-deprecation-requests-draft.md +++ b/docs/site/development/development-process/design-proposals/iso-636-deprecation-requests-draft.md @@ -16,4 +16,3 @@ The current cases in question are listed below. 
However we need to collate and o | [hbs](http://www.sil.org/iso639-3/documentation.asp?id=hbs) (sh) Serbo-Croatian | [bos](http://www.sil.org/iso639-3/documentation.asp?id=bos) (bs) Bosnian; [hrv](http://www.sil.org/iso639-3/documentation.asp?id=hrv) (hr) Croatian; [srp](http://www.sil.org/iso639-3/documentation.asp?id=srp) (sr) Serbian | These are all mutually comprehensible according to many native speakers. | Ideally, we would deprecate bos, hrv, srp; add the names to 'hbs'; however, there is probably too much installed base to do this. | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/json-packaging-approved-by-the-cldr-tc-on-2015-03-25.md b/docs/site/development/development-process/design-proposals/json-packaging-approved-by-the-cldr-tc-on-2015-03-25.md index 249716d3c46..0085a33c549 100644 --- a/docs/site/development/development-process/design-proposals/json-packaging-approved-by-the-cldr-tc-on-2015-03-25.md +++ b/docs/site/development/development-process/design-proposals/json-packaging-approved-by-the-cldr-tc-on-2015-03-25.md @@ -64,4 +64,3 @@ Locales by Tier as of CLDR 26 (for reference purposes only) | full | All other locales. 
| -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/language-data-consistency.md b/docs/site/development/development-process/design-proposals/language-data-consistency.md index 3f064e70548..97cd230e8c7 100644 --- a/docs/site/development/development-process/design-proposals/language-data-consistency.md +++ b/docs/site/development/development-process/design-proposals/language-data-consistency.md @@ -28,4 +28,3 @@ We have a set of tests for consistency in the data for language, script, and cou Likely Subtags are built from the language-country population data, plus the script metadata, plus an exception list. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/language-distance-data.md b/docs/site/development/development-process/design-proposals/language-distance-data.md index 9db853fce55..979b8b1502a 100644 --- a/docs/site/development/development-process/design-proposals/language-distance-data.md +++ b/docs/site/development/development-process/design-proposals/language-distance-data.md @@ -117,7 +117,7 @@ Note that this doesn't have to be an N x M algorithm. Because there is a minimum The data is designed to be relatively simple to understand. It would typically be processed into an internal format for fast processing. The data does not need to be exact; only the relative computed values are important. However, for keep the types of fields apart, they are given very different values. TODO: add values for [ISO 636 Deprecation Requests - DRAFT](https://cldr.unicode.org/development/development-process/design-proposals/iso-636-deprecation-requests-draft) -\ +\ \ @@ -159,7 +159,7 @@ The data is designed to be relatively simple to understand. 
It would typically b \ -\8\ \ +\8\ \ \64\ \ @@ -181,7 +181,7 @@ The data is designed to be relatively simple to understand. It would typically b \16\ \ -\ +\ ## Interpreting the Format @@ -196,4 +196,3 @@ Issues - Should we have the values be symbolic rather than literal numbers? eg: L, S, R, ... instead of 1024, 256, 64,... - The "\*" is a bit of a hack. Other thoughts for syntax? -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/list-formatting.md b/docs/site/development/development-process/design-proposals/list-formatting.md index 18ff38972e6..b9e04169d78 100644 --- a/docs/site/development/development-process/design-proposals/list-formatting.md +++ b/docs/site/development/development-process/design-proposals/list-formatting.md @@ -63,4 +63,3 @@ Note that a higher level needs to handle the cases of zero and one element. Typi To account for the issue Philip raises, we might want to have alt values for a semi-colon (like) variant. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/locale-format.md b/docs/site/development/development-process/design-proposals/locale-format.md index 6bcb30c0ad5..d259e9a8d72 100644 --- a/docs/site/development/development-process/design-proposals/locale-format.md +++ b/docs/site/development/development-process/design-proposals/locale-format.md @@ -79,9 +79,8 @@ If there is no placeholder in the pattern, it works the old way. 4. "short standalone": US English 3. 
We would also add context="short" on Regions, to get "US", and use it if there wasn't a short form of en\_US context="short" or "short standalone" -Fallbacks: +Fallbacks: - short standalone => standalone => "" - short => "" -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/localized-gmt-format.md b/docs/site/development/development-process/design-proposals/localized-gmt-format.md index a21d71261af..e875e059879 100644 --- a/docs/site/development/development-process/design-proposals/localized-gmt-format.md +++ b/docs/site/development/development-process/design-proposals/localized-gmt-format.md @@ -21,7 +21,7 @@ In CLDR 22, elements used for localized GMT format are below: - \ Format patterns used for representing UTC offset. This item is a single string containing two patterns, one for positive offset and another for negative offset, separated by semicolon (;). For example, "+HH:mm;-HH:mm". Each pattern must contain "H" (0-based 24 hours field) and "m" (minutes field). - \ Message format pattern such as "GMT{0}" used for localized GMT format. The variable part is replaced with UTC offset representation created by \ above. -- \ The string used for UTC (GMT) itself, such as "GMT". The string is used only when UTC offset is 0. +- \ The string used for UTC (GMT) itself, such as "GMT". The string is used only when UTC offset is 0. ### Proposed Changes @@ -97,4 +97,3 @@ There are some locales using relatively long patterns. If long/short distinction Because of another level of abstraction (separator, actual pattern width by context), this proposal may need a little bit more work on CLDR ST. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/math-formula-preferences.md b/docs/site/development/development-process/design-proposals/math-formula-preferences.md index 967e69b6fca..c724a99955a 100644 --- a/docs/site/development/development-process/design-proposals/math-formula-preferences.md +++ b/docs/site/development/development-process/design-proposals/math-formula-preferences.md @@ -46,4 +46,3 @@ mathFormulaDirection = "left-to-right", while ar.xml would have "right-to-left", Similarly, the vast majority of Arabic speaking locales would simply inherit their "math" numbering system from the default numbering system for the locale, and we would only need to explicitly specify a "math" numbering system where it differs from the default, for example, Yemen, Oman, Iraq. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/new-bcp47-extension-t-fields.md b/docs/site/development/development-process/design-proposals/new-bcp47-extension-t-fields.md index 8d873185a30..9cda195c88b 100644 --- a/docs/site/development/development-process/design-proposals/new-bcp47-extension-t-fields.md +++ b/docs/site/development/development-process/design-proposals/new-bcp47-extension-t-fields.md @@ -6,7 +6,7 @@ title: New BCP47 Extension T Fields ## Proposed Additions -BCP47 language tags can use Extension T for identifying transformed content, or indicating requests for transformed content, as described in [*rfc6497*](http://tools.ietf.org/html/rfc6497). If you have any comments on proposals, please circulate them on the cldr-users mailing list. Instructions for joining are at [cldr list](http://www.unicode.org/consortium/distlist.html#cldr_list). 
+BCP47 language tags can use Extension T for identifying transformed content, or indicating requests for transformed content, as described in [*rfc6497*](http://tools.ietf.org/html/rfc6497). If you have any comments on proposals, please circulate them on the cldr-users mailing list. Instructions for joining are at [cldr list](http://www.unicode.org/consortium/distlist.html#cldr_list). *There are no proposed additions at this time.* @@ -180,4 +180,3 @@ The following proposal was distributed for public review on March 26, 2012. The **Note:** RFC6497 interprets transforms that result in content broadly, including speech recognition and other instances where the source is not simply text. For the case of keyboards, the source content can be viewed as keystrokes, but may also be text—for the case of virtual web-based keyboards. For example, such a keyboard may translate the text in the following way. Suppose the user types a key that produces a "W" on a qwerty keyboard. A web-based tool using an azerty virtual keyboard can map that text ("W") to the text that would have resulted from typing a key on an azerty keyboard, by transforming "W" to "Z". Such transforms are in fact performed in existing web applications. The standardized extension can be used to communicate, internally or externally, a request for a particular keyboard mapping that is to be used to transform either text or keystrokes, and then use that data to perform the requested actions. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/new-time-zone-patterns.md b/docs/site/development/development-process/design-proposals/new-time-zone-patterns.md index 860040ab36f..e91f10553d2 100644 --- a/docs/site/development/development-process/design-proposals/new-time-zone-patterns.md +++ b/docs/site/development/development-process/design-proposals/new-time-zone-patterns.md @@ -17,7 +17,7 @@ This design proposal includes following new pattern letters in the LDML date for 1. **X** and **x** for ISO 8601 style non localizable UTC offset format 1. 'X' uses UTC designator "Z" when UTC offset is 0 - 2. 'x' uses difference between local time and UTC always - i.e. format like "+0000" is used when UTC offset is 0. + 2. 'x' uses difference between local time and UTC always - i.e. format like "+0000" is used when UTC offset is 0. 2. **O** for localized GMT format variations 3. **V** for time zone ID (V - short / VV - IANA) and exemplar city (VVV). @@ -52,7 +52,7 @@ In LDML, we could define "ZZZZZZ" (6 'Z's), "ZZZZZZZ" (7 'Z's)... to support the The JSR-310 definition (compatible with JDK 7 SimpleDateFormat, with some enhancements for seconds field) might be used also for LDML, but I think there are several issues. - Single 'X' is used for limiting offset to be hour field only. Such usage is practically questionable. There are some active time zones using offsets with non-zero minutes field. So such format is highly discouraged when a zone has non-zero minutes field. ISO8601 specification also says "The minutes time element of the difference may only be omitted if the difference between the time scales is exactly an integral number of hours.". -- When non-zero minutes (or seconds) field is truncated and hour field is 0, the output becomes +00/-00/+0000/-0000/+00:00/-00:00. 
Use of negative sign for offset equivalent to UTC (-00/-0000/-00:00) is illegal in ISO8601. +- When non-zero minutes (or seconds) field is truncated and hour field is 0, the output becomes +00/-00/+0000/-0000/+00:00/-00:00. Use of negative sign for offset equivalent to UTC (-00/-0000/-00:00) is illegal in ISO8601. When to use "Z" or "+00"/"+0000"/"+00:00" is also a design question. JSR-310 seems to extend pattern letter Z to support format without ISO8601 UTC indicator "Z". @@ -136,5 +136,4 @@ In CLDR, we're afraid of burning one letter just for this purpose. In the CLDR T | **V** | 1 | uslax<br>utc | Short time zone identifier (BCP 47 unicode locale extension, time zone value)<br><br>fallback: If there is no mapping to BCP 47 time zone value, format for pattern "xxxx" is used as a fallback, such as "-0500" | | | 2 | America/Los_Angeles<br>Etc/GMT | Time zone identifier (IANA Time Zone Database, or user defined ID) | | | 3 | Los Angeles<br>東京 | Localized exemplar location (city) name for time zone<br><br>If a time zone is not associated with any specific locations (e.g. Etc/GMT+1), localized exemplar city name for time zone "Etc/Unknown" is used. | - -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file + diff --git a/docs/site/development/development-process/design-proposals/path-filtering.md b/docs/site/development/development-process/design-proposals/path-filtering.md index 4f1492b086a..4561991fc37 100644 --- a/docs/site/development/development-process/design-proposals/path-filtering.md +++ b/docs/site/development/development-process/design-proposals/path-filtering.md @@ -50,9 +50,8 @@ There are a couple of extra features of the regex. For the coverage level (and p | $localeCurrencies | modern currencies for the $localeRegions | | $modernMetazones | metazones ... | -*Issue:* +*Issue:* - *I'm thinking that we may want to append the value to the path (eg .../\_VALUE="...") to allow for matching on that.* - *Use XML instead of ; format?* -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/pattern-character-for-related-year.md b/docs/site/development/development-process/design-proposals/pattern-character-for-related-year.md index acfd5db088d..57c259bf07a 100644 --- a/docs/site/development/development-process/design-proposals/pattern-character-for-related-year.md +++ b/docs/site/development/development-process/design-proposals/pattern-character-for-related-year.md @@ -29,4 +29,3 @@ The data and format requirements for section 2.2 above are more complex and not See also the earlier discussion of these issues in section F.11 of the proposal “[Chinese (and other) calendar support, intercalary months, year cycles](https://cldr.unicode.org/development/development-process/design-proposals/chinese-and-other-calendar-support-intercalary-months-year-cycles).” -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \
No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/pinyin-fixes.md b/docs/site/development/development-process/design-proposals/pinyin-fixes.md index c65f37bd5f6..83bde457e0e 100644 --- a/docs/site/development/development-process/design-proposals/pinyin-fixes.md +++ b/docs/site/development/development-process/design-proposals/pinyin-fixes.md @@ -32,4 +32,3 @@ Where there are multiple readings in Unihan, they are given in the format with - [pinyinSortComparison.txt](https://drive.google.com/file/d/1XFMmbjipcf6pTH2VOJ_KOnjSdpkvyLcq/view?usp=sharing) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/post-mortem.md b/docs/site/development/development-process/design-proposals/post-mortem.md index e5f2dfbfa28..b3838c0b869 100644 --- a/docs/site/development/development-process/design-proposals/post-mortem.md +++ b/docs/site/development/development-process/design-proposals/post-mortem.md @@ -56,4 +56,3 @@ Drivers marked in [...]. Drivers are to file bugs, put together plan for how to 7. Leverage QuickSteps in rest of survey tool; allow Example column in QS, etc. **[Steven, John]** 8. Vetters used to better tools, able to group. 
(sort/group/filter) **[Steven, John]** -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/proposed-collation-additions.md b/docs/site/development/development-process/design-proposals/proposed-collation-additions.md index d381857f53b..2e08ae61061 100644 --- a/docs/site/development/development-process/design-proposals/proposed-collation-additions.md +++ b/docs/site/development/development-process/design-proposals/proposed-collation-additions.md @@ -85,4 +85,3 @@ This attribute indicates to clients that the collation is intended only for \l in locales tha (We could use other names for the alt form such as "secular" or "neutral" but "variant" is more general and already widely supported.) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/specifying-text-break-variants-in-locale-ids.md b/docs/site/development/development-process/design-proposals/specifying-text-break-variants-in-locale-ids.md index 193e3414cd5..2b2afe95fcf 100644 --- a/docs/site/development/development-process/design-proposals/specifying-text-break-variants-in-locale-ids.md +++ b/docs/site/development/development-process/design-proposals/specifying-text-break-variants-in-locale-ids.md @@ -51,7 +51,7 @@ We need a locale keyword to control use of ULI suppressions data (i.e. to determ Currently ICU uses dictionary-based break for text in SE Asian scripts only. The two most important needs for line break control are: - For Japanese text, control whether line breaks are allowed before small kana and before the prolonged sound mark 30FC; this corresponds to (most of) the distinction between CSS level 3 strict and normal line break (see below), and is implemented by treating LineBreak property value CJ as either NS (strict) or ID (normal). 
-- For Korean text, control whether the line break style is E. Asian style (breaks can occur in the middle of words) or “Western” style (breaks are space based), as described in UAX 14. +- For Korean text, control whether the line break style is E. Asian style (breaks can occur in the middle of words) or “Western” style (breaks are space based), as described in UAX 14. Other desirable capabilities include: @@ -504,7 +504,7 @@ boundaries{ } ``` -BreakIterator::buildInstance is called by BreakIterator::makeInstance, which provides the type keys "grapheme", "line", etc. It could use the locale to construct the resource keys with extensions. +BreakIterator::buildInstance is called by BreakIterator::makeInstance, which provides the type keys "grapheme", "line", etc. It could use the locale to construct the resource keys with extensions. ### D. Current dictionary break implementation @@ -533,4 +533,3 @@ It would be nice for a given locale to be able to specify, for each break type, Thanks to Koji Ishii and the CLDR team for feedback on this document. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/suggested-exemplar-revisions.md b/docs/site/development/development-process/design-proposals/suggested-exemplar-revisions.md index 78e06c08844..47137e3d89d 100644 --- a/docs/site/development/development-process/design-proposals/suggested-exemplar-revisions.md +++ b/docs/site/development/development-process/design-proposals/suggested-exemplar-revisions.md @@ -66,34 +66,34 @@ Here are my suggestions. Please send feedback to [mark@macchiato.com](mailto:mar From Bug 1947, for reference. -The exemplar character set for ja appears to be too small. +The exemplar character set for ja appears to be too small. -1. It contains about 2,000 characters (Kanji, Hiragana and Katakana). +1. It contains about 2,000 characters (Kanji, Hiragana and Katakana). 2. 
If Exemplar Character set is limited to the most widely used one (Level 1 Kanji? in JIS X 208), I expected Auxiliary Exemplar Character set to contain the -rest of +rest of -JIS X 0208 (plus JIS X 212 / 213). However, it contains only 5 characters. +JIS X 0208 (plus JIS X 212 / 213). However, it contains only 5 characters. 3. It does not contain \ ('composed Katakana letters'), U+30FB and U+30FC (conjunction and length marks). -For instance, characters like U+4EDD,U+66D9, U+7DBE are not included although they're used in Japanese IDN names (which is an indicator that they're pretty widely used. See ) +For instance, characters like U+4EDD,U+66D9, U+7DBE are not included although they're used in Japanese IDN names (which is an indicator that they're pretty widely used. See ) While I was at it, I also looked at zh\* and ko. All of them have about 2000 characters (in case of 2350 which is the number of Hangul syllables in KS X 1001). The auxiliary sets for zh\* have only tens of characters (26 for zh\_Hans -and 33 for zh\_Hant). +and 33 for zh\_Hant). -It's rather inconvenient to type hundreds (if not thousands) of characters in the CLDR survey tool. Perhaps, we have to fill in those values ('candidate sets' for vetting) using cvs before the next round of CLDR survey. +It's rather inconvenient to type hundreds (if not thousands) of characters in the CLDR survey tool. Perhaps, we have to fill in those values ('candidate sets' for vetting) using cvs before the next round of CLDR survey. ... Jungshik and I discussed this, and there are three possible sources (for each of Chinese (S+T), Japanese, and Korean) that we could tie the exemplars to: -1. charsets (in the case of Japanese, this would be probably: JIS 208 + 212 + 213. (This would be a large set, and +1. charsets (in the case of Japanese, this would be probably: JIS 208 + 212 + 213. (This would be a large set, and contain many rarely-used characters).

1a. Only use JIS 208. (The current approach appears to be JIS 208, but only level 1.) -2. Use the educational standards in each country/territory for primary+secondary requirements. We'd have to +2. Use the educational standards in each country/territory for primary+secondary requirements. We'd have to look up sources for these. 3. Use the NIC restrictions for each country. @@ -104,4 +104,3 @@ There is a fourth possibility: Use the characters that are supported by the comm platforms for these languages (e.g. the characters that are in the cmaps for [TrueType?](#BAD_URL) fonts). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/supported-numberingsystems.md b/docs/site/development/development-process/design-proposals/supported-numberingsystems.md index 6dbd6cc9d06..38399fa37db 100644 --- a/docs/site/development/development-process/design-proposals/supported-numberingsystems.md +++ b/docs/site/development/development-process/design-proposals/supported-numberingsystems.md @@ -18,7 +18,7 @@ This proposal replaces the current "defaultNumberingSystem" field with a series \ - Numbering system using native digits. The "native" numbering system can only be a numeric numbering system, containing the native digits used in the locale. -\ - The traditional or historic numbering system. Algorithmic systems are allowed in the "traditional" system. +\ - The traditional or historic numbering system. Algorithmic systems are allowed in the "traditional" system. - May be the same as "native" for some locales, but it may be different for others, such as Tamil or Chinese. - If "traditional" is not explicitly specified, fall back to "native". @@ -757,4 +757,3 @@ Proposed seed data for numbering systems The plan is that these fields would NOT be exposed to survey tool, and would only be changeable via ticket submissions in trac. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/thoughts-on-survey-tool-backend.md b/docs/site/development/development-process/design-proposals/thoughts-on-survey-tool-backend.md index 1cf8b5972d0..4d628adf1da 100644 --- a/docs/site/development/development-process/design-proposals/thoughts-on-survey-tool-backend.md +++ b/docs/site/development/development-process/design-proposals/thoughts-on-survey-tool-backend.md @@ -38,10 +38,10 @@ pathId → valueInfo+ *// ordered by voteCount then UCA (so first is winning, second is ‘next best’)*  valueInfo = value, isInherited, coverageLevel, voteCount, voter*, errorStatus*, example? - +  *// that is, a value like “Sontag”, whether the value is inherited, what the coverage level is (computed algorithmically), what the voteCount is (computed from the voters: computed and cached), the errorStatus (computed and cached), and the example text (computed and cached). Maybe add dependentPaths* (see below).* - errorStatus = error/warningID, message + errorStatus = error/warningID, message value → pathId* @@ -80,4 +80,3 @@ Issue: the Voter map changes occasionally. For new users, we don’t have to do Issue: With multiple machines (or app engine) we could shard the processing; divide up the locales by base language, and divy them out to different machines. (Clumps would have to be slightly larger where we have sibling aliases.) 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/development/development-process/design-proposals/time-zone-data-reorganization.md b/docs/site/development/development-process/design-proposals/time-zone-data-reorganization.md
index 2a232de9ebe..c049acf4c3f 100644
--- a/docs/site/development/development-process/design-proposals/time-zone-data-reorganization.md
+++ b/docs/site/development/development-process/design-proposals/time-zone-data-reorganization.md
@@ -154,4 +154,3 @@ Add b) into this file
 Store only g).
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/development/development-process/design-proposals/transform-fallback.md b/docs/site/development/development-process/design-proposals/transform-fallback.md
index 4d3e1d06f26..b63da4bcf99 100644
--- a/docs/site/development/development-process/design-proposals/transform-fallback.md
+++ b/docs/site/development/development-process/design-proposals/transform-fallback.md
@@ -77,6 +77,5 @@ We have the implicit requirement that no variant is populated unless there is a
 Case 1. only fa-Latn/BGN. Add an alias from fa-Latn to fa-Latn/BGN
-Case 2. only foo-Latn. Rename to foo-Latn/SOMETHING, and then do Case 1.
+Case 2. only foo-Latn. Rename to foo-Latn/SOMETHING, and then do Case 1.
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/transform-keywords.md b/docs/site/development/development-process/design-proposals/transform-keywords.md index 98683591134..4d9b370fccc 100644 --- a/docs/site/development/development-process/design-proposals/transform-keywords.md +++ b/docs/site/development/development-process/design-proposals/transform-keywords.md @@ -30,4 +30,3 @@ The Unicode extension key **tm** is a keyword specifying the mechanism for the t Any final subtype of 4, 6, or 8 digits represents a date in the format yyyy(MM(dd)?)?, such as 2010, or 201009, or 20100930. So, for example, und-Latn-t-und-hebr-tm-ungegn-2007 represents the transliteration as described in http://www.eki.ee/wgrs/rom1\_he.htm. The date should only be used where necessary, and if present only be as specific as necessary. So if the only dated variants for the given mechanism, source, and result are 1977 and 2007, the month and day in 2007 should not be present. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/unihan-data.md b/docs/site/development/development-process/design-proposals/unihan-data.md index 2bf39a7ed12..ea728525e91 100644 --- a/docs/site/development/development-process/design-proposals/unihan-data.md +++ b/docs/site/development/development-process/design-proposals/unihan-data.md @@ -59,7 +59,7 @@ As a proxy for the best pinyin, we use an algorithm to pick from the many pinyin Take the first pinyin from the following. Where there are multiple choices in a field, use the first 1. patchFile -2. kMandarin // moved up in CLDR 30. +2. kMandarin // moved up in CLDR 30. 3. kHanyuPinlu 4. kXHC1983 5. kHanyuPinyin @@ -104,4 +104,3 @@ Then, if it is still missing, try to map to a character that does have a pinyin. 3. 
~~Using the name reading rather than the general reading for standard pinyin collation might produce unexpected results.~~ 4. ~~Why not just specify the name reading when that is desired? No need to make it the default if it is the less common reading.~~ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/development-process/design-proposals/units-pixels-ems-display-resolution.md b/docs/site/development/development-process/design-proposals/units-pixels-ems-display-resolution.md index 7cc748ced63..4322b7da508 100644 --- a/docs/site/development/development-process/design-proposals/units-pixels-ems-display-resolution.md +++ b/docs/site/development/development-process/design-proposals/units-pixels-ems-display-resolution.md @@ -37,4 +37,4 @@ Some reference material: - https://en.wikipedia.org/wiki/Pixel_density - https://en.wikipedia.org/wiki/Em_(typography) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/development/development-process/design-proposals/uts-35-splitting.md b/docs/site/development/development-process/design-proposals/uts-35-splitting.md index 795ed14e63b..ffcfa354d64 100644 --- a/docs/site/development/development-process/design-proposals/uts-35-splitting.md +++ b/docs/site/development/development-process/design-proposals/uts-35-splitting.md @@ -47,4 +47,3 @@ Important features 1. Could only approximate the TR format. 2. CSS doesn't yet work. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/development/development-process/design-proposals/voting.md b/docs/site/development/development-process/design-proposals/voting.md
index 59e95ffa3ea..a23fdd5309c 100644
--- a/docs/site/development/development-process/design-proposals/voting.md
+++ b/docs/site/development/development-process/design-proposals/voting.md
@@ -47,6 +47,5 @@ The key points are:
 - If the draft status of the previously released value is better than the new draft status, then no change is made. Otherwise, the optimal value and its draft status are made part of the new release.
-In our previous version, *approved* required O ≥ 8.
+In our previous version, *approved* required O ≥ 8.
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/development/development-process/design-proposals/xmb.md b/docs/site/development/development-process/design-proposals/xmb.md
index 886b8d9455a..6d411e4c1c7 100644
--- a/docs/site/development/development-process/design-proposals/xmb.md
+++ b/docs/site/development/development-process/design-proposals/xmb.md
@@ -171,4 +171,4 @@ other {# weeks}}}}
 - Figure out how to do the differences between HH and hh, etc.
 - Current thoughts: don't let the translator choose, but make it part of the xtb-cldr processing.
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/development/guidance-on-direct-modifications-to-cldr-data.md b/docs/site/development/guidance-on-direct-modifications-to-cldr-data.md index 7c2fa8670fc..ad8176e9828 100644 --- a/docs/site/development/guidance-on-direct-modifications-to-cldr-data.md +++ b/docs/site/development/guidance-on-direct-modifications-to-cldr-data.md @@ -50,4 +50,3 @@ So the following is ok, but would be better if one of the attribute values were \ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/maven.md b/docs/site/development/maven.md index 3cc5d8a135d..8b1ee268d59 100644 --- a/docs/site/development/maven.md +++ b/docs/site/development/maven.md @@ -88,7 +88,7 @@ This will run all tests and create the all\-in\-one **tools/cldr\-code/target/cl Example to run only one test from the main unit tests and one test in the web tests: (one long command line, two separate parameters) ``` -mvn test '-Dorg.unicode.cldr.unittest.testArgs=-f:TestUntimedCounter -n -q' +mvn test '-Dorg.unicode.cldr.unittest.testArgs=-f:TestUntimedCounter -n -q' '-Dorg.unicode.cldr.unittest.web.testArgs=-f:TestMisc -n -q' ``` @@ -125,4 +125,3 @@ mvn -DCLDR_DIR=$HOME/src/cldr exec:java -pl cldr-code -Dexec.mainClass=org.unico 1. To start up the Survey Tool, right\-click on 'cldr\-apps' and choose 'Run As… Run On Server'. Create a Tomcat 9 server. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/new-cldr-developers.md b/docs/site/development/new-cldr-developers.md index b60e6eb3d22..17054442652 100644 --- a/docs/site/development/new-cldr-developers.md +++ b/docs/site/development/new-cldr-developers.md @@ -22,7 +22,7 @@ Next, get your Eclipse environment set up properly. 1. http://cldr.unicode.org/development/eclipse-setup 2. 
http://cldr.unicode.org/development/running-survey-tool/eclipse - + **Run the CLDR tests to be sure they pass before beginning work**: @@ -38,7 +38,7 @@ Command line: 8. The -v tells test script to show stack trace at the test failure for debugging. 9. To get all parameters that could be passed at runcheck.arg, run 10. **ant -Druncheck.arg="-?" check** - + Via eclipse: @@ -60,4 +60,3 @@ Other useful pages are under [CLDR Development Site](https://cldr.unicode.org/de [UTS #35: Unicode Locale Data Markup Language (LDML)](https://www.unicode.org/reports/tr35/) is the specification of the XML format used for CLDR data, including the interpretation of the CLDR data. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/running-tests.md b/docs/site/development/running-tests.md index 42bd0b31e9b..ee445cd631c 100644 --- a/docs/site/development/running-tests.md +++ b/docs/site/development/running-tests.md @@ -8,7 +8,7 @@ You will always need to run tests when you do a check\-in. 1. Preconditions - If you change the DTD, be sure to read and follow [Updating DTDs](https://cldr.unicode.org/development/updating-dtds) first. - - If you added a new feature or fixed a significant bug, add a unit test for it. + - If you added a new feature or fixed a significant bug, add a unit test for it. - See unittest/NumberingSystemsTest as an example. - Remember to add to unittest/TestAll 2. Run **TestAll \-e** @@ -33,7 +33,7 @@ $ cd $CLDR_DIR/tools/java && ant all $ cd $CLDR_DIR/tools/cldr-unittest && ant unittestExhaustive datacheck ``` -\[TODO: add more commands here; can't we automate all this into a single build rule for ant?] TODO: [TODOL ticket:8864](http://unicode.org/cldr/trac/ticket/8864) +\[TODO: add more commands here; can't we automate all this into a single build rule for ant?] TODO: [TODOL ticket:8864](http://unicode.org/cldr/trac/ticket/8864) ## Debugging @@ -46,4 +46,3 @@ We use a lot of regexes! 1. 
There is org.unicode.cldr.util.RegexUtilities.showMismatch (and related methods) that are really useful in debugging cases where regexes fail. You hand it a pattern or matcher and a string, and it shows how far the regex got before it failed. 2. To debug RegexLookup, there is a special call you can make where you pass in a set. On return, that set is filled with a set of strings showing how far each of the regex patterns progressed. You can thus see why a string didn't match as expected. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/running-tools.md b/docs/site/development/running-tools.md index eb41911370a..f8e3c85abda 100644 --- a/docs/site/development/running-tools.md +++ b/docs/site/development/running-tools.md @@ -26,4 +26,3 @@ For the purposes of this document, / and \\ are equivalent. Note: Directories mu | Generating Statistics | CountItems | Generate something like:
 Total Items 66,319
 Total Resolved Items 1,025,077
 Unique Paths 4,717
 Unique Values 45,226
 Unique Full Paths 9,301 | -Dmethod=countItems
-DSOURCE={cldrdata}\cldr_1_4\main

-Dmethod=countItems -DSOURCE={cldrdata}\incoming\vetted\main | | Build most charts | ShowLanguages | | TBD | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes.md b/docs/site/development/updating-codes.md index 7e35fca196c..980f54f0b01 100644 --- a/docs/site/development/updating-codes.md +++ b/docs/site/development/updating-codes.md @@ -10,25 +10,24 @@ title: Updating Codes First read [Running Tools](https://cldr.unicode.org/development/running-tools) -1. Update [Script Metadata](https://cldr.unicode.org/development/updating-codes/updating-script-metadata) -2. [Update Population/GDP/Literacy](https://cldr.unicode.org/development/updating-codes/updating-population-gdp-literacy) -3. [Update Language/Script/Region Subtags](https://cldr.unicode.org/development/updating-codes/update-languagescriptregion-subtags) -4. [Update Subdivision Codes](https://cldr.unicode.org/development/updating-codes/updating-subdivision-codes) -5. [Update Subdivision translations](https://cldr.unicode.org/development/updating-codes/updating-subdivision-translations) (new) -6. [Update Currency Codes](https://cldr.unicode.org/development/updating-codes/update-currency-codes) -7. [Update Time Zone Data for ZoneParser](https://cldr.unicode.org/development/updating-codes/update-time-zone-data-for-zoneparser) -8. [Update Validity XML](https://cldr.unicode.org/development/updating-codes/update-validity-xml) - 1. [Update Language/Script/Country Information](https://cldr.unicode.org/development/updating-codes/update-language-script-info) - 2. [LikelySubtags and Default Content](https://cldr.unicode.org/development/updating-codes/likelysubtags-and-default-content) - 3. Update IANA/FIPS Mappings - 1. TBD - Describe what to do. The URLs are - 2. http://www.iana.org/domain-names.htm - 3. http://www.iana.org/root-whois/index.html - 4. http://data.iana.org/TLD/tlds-alpha-by-domain.txt -9. 
Reformat plurals/ordinals.xml with GeneratedPluralRules.java. Review carefully before checking in. +1. Update [Script Metadata](https://cldr.unicode.org/development/updating-codes/updating-script-metadata) +2. [Update Population/GDP/Literacy](https://cldr.unicode.org/development/updating-codes/updating-population-gdp-literacy) +3. [Update Language/Script/Region Subtags](https://cldr.unicode.org/development/updating-codes/update-languagescriptregion-subtags) +4. [Update Subdivision Codes](https://cldr.unicode.org/development/updating-codes/updating-subdivision-codes) +5. [Update Subdivision translations](https://cldr.unicode.org/development/updating-codes/updating-subdivision-translations) (new) +6. [Update Currency Codes](https://cldr.unicode.org/development/updating-codes/update-currency-codes) +7. [Update Time Zone Data for ZoneParser](https://cldr.unicode.org/development/updating-codes/update-time-zone-data-for-zoneparser) +8. [Update Validity XML](https://cldr.unicode.org/development/updating-codes/update-validity-xml) + 1. [Update Language/Script/Country Information](https://cldr.unicode.org/development/updating-codes/update-language-script-info) + 2. [LikelySubtags and Default Content](https://cldr.unicode.org/development/updating-codes/likelysubtags-and-default-content) + 3. Update IANA/FIPS Mappings + 1. TBD - Describe what to do. The URLs are + 2. http://www.iana.org/domain-names.htm + 3. http://www.iana.org/root-whois/index.html + 4. http://data.iana.org/TLD/tlds-alpha-by-domain.txt +9. Reformat plurals/ordinals.xml with GeneratedPluralRules.java. Review carefully before checking in. 1. 
Regenerate Supplemental Charts: [Generating Charts](https://cldr.unicode.org/development/cldr-big-red-switch/generating-charts) - + For information about **Version Info** and external metadata, see [Updating External Metadata](https://cldr.unicode.org/development/updating-codes/external-version-metadata) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/external-version-metadata.md b/docs/site/development/updating-codes/external-version-metadata.md index b0ab87557d9..43c6ac97145 100644 --- a/docs/site/development/updating-codes/external-version-metadata.md +++ b/docs/site/development/updating-codes/external-version-metadata.md @@ -6,11 +6,11 @@ title: Updating External Version Metadata ## Updating Metadata -[CLDR\-15005](https://unicode-org.atlassian.net/browse/CLDR-15005) is for updating the process for external metadata versions. The following table is out of date with [common/properties/external\_data\_versions.tsv](https://github.com/unicode-org/cldr/blob/main/common/properties/external_data_versions.tsv) +[CLDR\-15005](https://unicode-org.atlassian.net/browse/CLDR-15005) is for updating the process for external metadata versions. 
The following table is out of date with [common/properties/external\_data\_versions.tsv](https://github.com/unicode-org/cldr/blob/main/common/properties/external_data_versions.tsv) ### TODO: Need to add instructions for updating external metadata -~~The following tells how to get the version info for imported data used in a CLDR release.~~ +~~The following tells how to get the version info for imported data used in a CLDR release.~~ | Data | File | Version Info | Date | |---|---|---|---| @@ -25,5 +25,4 @@ title: Updating External Version Metadata | Top level domains | [tlds-alpha-by-domain.txt](https://github.com/unicode-org/cldr/blob/master/tools/java/org/unicode/cldr/util/data/tlds-alpha-by-domain.txt) | Date at top | 2021-02-17 | | Language Groups | TBD | Record when downloaded in TBD | | | UN / EU Codes | TBD | Record when downloaded in TBD | | - -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file + diff --git a/docs/site/development/updating-codes/likelysubtags-and-default-content.md b/docs/site/development/updating-codes/likelysubtags-and-default-content.md index 16da0776523..93179fdc23d 100644 --- a/docs/site/development/updating-codes/likelysubtags-and-default-content.md +++ b/docs/site/development/updating-codes/likelysubtags-and-default-content.md @@ -21,4 +21,3 @@ title: LikelySubtags and Default Content 4. Run tests, fix input data, and iterate as necessary. 1. Copy into the svn workspace and commit. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/update-currency-codes.md b/docs/site/development/updating-codes/update-currency-codes.md index 00113ec744e..880a433e895 100644 --- a/docs/site/development/updating-codes/update-currency-codes.md +++ b/docs/site/development/updating-codes/update-currency-codes.md @@ -59,4 +59,3 @@ title: Update Currency Codes - common/supplemental/supplementalData.xml - ***Note: We no longer maintain the list of currency in supplementalMetadata.xml (***[***\#4298***](http://unicode.org/cldr/trac/ticket/4298)***). The list is currently maintained by bcp47/currency.xml. We need to move the code used for checking list of ISO currency (and its numeric code mapping) currently in ICU tools repository (http://source.icu-project.org/repos/icu/tools/trunk/currency/).*** -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/update-language-script-info.md b/docs/site/development/updating-codes/update-language-script-info.md index fd663ee9a73..0035354305f 100644 --- a/docs/site/development/updating-codes/update-language-script-info.md +++ b/docs/site/development/updating-codes/update-language-script-info.md @@ -38,4 +38,3 @@ title: Update Language Script Info 4. For the EZ, do the same with , into util/data/external/ez\_member\_states\_raw.txt  **BROKEN LINK** 1. 
If there are changes, update \ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/update-language-script-info/language-script-description.md b/docs/site/development/updating-codes/update-language-script-info/language-script-description.md index 777ff0abc60..4662c1a22a4 100644 --- a/docs/site/development/updating-codes/update-language-script-info/language-script-description.md +++ b/docs/site/development/updating-codes/update-language-script-info/language-script-description.md @@ -20,4 +20,3 @@ Files in https://github.com/unicode-org/cldr/tree/main/tools/cldr-code/src/main/ 1. country\_language\_population.tsv 2. language\_script.tsv -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/update-languagescriptregion-subtags.md b/docs/site/development/updating-codes/update-languagescriptregion-subtags.md index 54dfd70181d..bbb662f8440 100644 --- a/docs/site/development/updating-codes/update-languagescriptregion-subtags.md +++ b/docs/site/development/updating-codes/update-languagescriptregion-subtags.md @@ -79,4 +79,4 @@ title: Update Language/Script/Region Subtags - You may also have to fix the coverageLevels.txt file for an error like: - Error: (TestCoverageLevel.java:604\) Comprehensive \& no exception for path \=\> //ldml/localeDisplayNames/territories/territory\[@type\="202"] -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/development/updating-codes/update-time-zone-data-for-zoneparser.md b/docs/site/development/updating-codes/update-time-zone-data-for-zoneparser.md index f1e7c6381c2..eeec6c293c6 100644 --- a/docs/site/development/updating-codes/update-time-zone-data-for-zoneparser.md +++ b/docs/site/development/updating-codes/update-time-zone-data-for-zoneparser.md @@ -32,4 +32,3 @@ Note: This is usually done as a part of full time zone data 
update process. - This file contains just one line text specifying a version of Time Zone Database, e.g. 2021a. 5. **Record the version: See** [**Updating External Metadata**](https://cldr.unicode.org/development/updating-codes/external-version-metadata) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/update-validity-xml.md b/docs/site/development/updating-codes/update-validity-xml.md index 39eecb1e2ff..7adb92e968c 100644 --- a/docs/site/development/updating-codes/update-validity-xml.md +++ b/docs/site/development/updating-codes/update-validity-xml.md @@ -19,5 +19,4 @@ title: Update Validity XML 5. Run the following (you must have all the archived versions loaded, back to cldr\-28\.0!) 1. TestValidity \-e9 6. If they are ok, replace and checkin - -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file + diff --git a/docs/site/development/updating-codes/updating-population-gdp-literacy.md b/docs/site/development/updating-codes/updating-population-gdp-literacy.md index fe5c0d674f3..b3f231f0ab0 100644 --- a/docs/site/development/updating-codes/updating-population-gdp-literacy.md +++ b/docs/site/development/updating-codes/updating-population-gdp-literacy.md @@ -34,7 +34,7 @@ Once you are there, generate a file by using the following steps. There are 3 co - Select "CSV" - Instruct your browser to the save the file. - You will receive a ZIP file named "**Data\_Extract\_From\_World\_Development\_Indicators.zip**". - - Unpack this zip file. It will contain two files. + - Unpack this zip file. It will contain two files. - (From a unix command line, you can unpack it with - "unzip \-j \-a \-a **Data\_Extract\_From\_World\_Development\_Indicators.zip"** - to junk subdirectories and force the file to LF line endings.) @@ -78,7 +78,7 @@ Once you are there, generate a file by using the following steps. There are 3 co 1. 
All files are saved in **cldr/tools/java/org/unicode/cldr/util/data/external/** 2. Goto: https://www.cia.gov/library/publications/the-world-factbook/index.html 3. Goto the "References" tab, and click on "Guide to Country Comparisons" -4. Expand "People and Society" and click on "Population" \- +4. Expand "People and Society" and click on "Population" \- 1. There's a "download" icon in the right side of the header. Right click it, Save Link As... call it 2. **factbook\_population.txt** 3. **You may need to delete header lines. The first line should begin with "1 China … " or similar.** @@ -105,4 +105,3 @@ Once you are there, generate a file by using the following steps. There are 3 co 5. Once everything looks ok, check everything in to git. 6. Once done, then run the ConvertLanguageData tool as on [Update Language Script Info](https://cldr.unicode.org/development/updating-codes/update-language-script-info) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/updating-script-metadata.md b/docs/site/development/updating-codes/updating-script-metadata.md index 7ab8a0342e0..7ba0803e477 100644 --- a/docs/site/development/updating-codes/updating-script-metadata.md +++ b/docs/site/development/updating-codes/updating-script-metadata.md @@ -82,4 +82,3 @@ For example, Problems are typically because a non\-standard name is used for a territory name. That can be fixed and the process rerun. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/updating-subdivision-codes.md b/docs/site/development/updating-codes/updating-subdivision-codes.md index 0b8c1455939..efb5439366d 100644 --- a/docs/site/development/updating-codes/updating-subdivision-codes.md +++ b/docs/site/development/updating-codes/updating-subdivision-codes.md @@ -126,4 +126,3 @@ Rearrange the leftovers to see if there is any OLD \=\> NEW1\+NEW2\... cases or | \ | FR | "P" | | "14 50 61" | \ | \ | | \ | FR | | "NOR" | "14 27 50 61 76" | | | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/updating-subdivision-translations.md b/docs/site/development/updating-codes/updating-subdivision-translations.md index 96cc8fdf813..8b85acfd07d 100644 --- a/docs/site/development/updating-codes/updating-subdivision-translations.md +++ b/docs/site/development/updating-codes/updating-subdivision-translations.md @@ -23,4 +23,3 @@ title: Updating Subdivision Translations 2. Check in 1. Make sure you also check in **{workspace}/cldr/tools/cldr\-rdf/external/\*.tsv** ( intermediate tables, for tracking) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-codes/updating-un-codes.md b/docs/site/development/updating-codes/updating-un-codes.md index a2aec71a012..9e790afc139 100644 --- a/docs/site/development/updating-codes/updating-un-codes.md +++ b/docs/site/development/updating-codes/updating-un-codes.md @@ -27,4 +27,3 @@ title: Updating UN Codes ### Run TestUnContainment 1. 
```mvn -Dorg.unicode.cldr.unittest.testArgs='-n -q -filter:TestUnContainment' --file=tools/pom.xml -pl cldr-code test -Dtest=TestShim``` -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/development/updating-dtds.md b/docs/site/development/updating-dtds.md index 3bdf15df249..bc21f907ee1 100644 --- a/docs/site/development/updating-dtds.md +++ b/docs/site/development/updating-dtds.md @@ -22,60 +22,60 @@ If you are only adding new alt values, it is much easier. You still need to chan We augment the DTD structure in various ways. -1. Annotations, included below the !ELEMENT or !ATTLIST line - - \ to indicate that an attribute is not distinguishing, and is treated like an element value. - - \ to indicate that an attribute is a "comment" on the data, like the draft status. - - \ to indicate that an element's children are ordered. - - \ to indicate that an attribute or element is deprecated. - - \ to indicate that an attribute value is deprecated. -2. attributeValueValidity.xml - - For additional validity checks -3. Check\* tests and unit tests +1. Annotations, included below the !ELEMENT or !ATTLIST line + - \ to indicate that an attribute is not distinguishing, and is treated like an element value. + - \ to indicate that an attribute is a "comment" on the data, like the draft status. + - \ to indicate that an element's children are ordered. + - \ to indicate that an attribute or element is deprecated. + - \ to indicate that an attribute value is deprecated. +2. attributeValueValidity.xml + - For additional validity checks +3. Check\* tests and unit tests - There are many consistency tests that are performed on the data that can't be expressed with the above. - + ### Removing Structure -1. We never explicitly remove structure except in very unusual cases, so be sure that the committee is in full agreement before doing that. -2. Normally, we just deprecate it, by adding attributes in the DTD file - 1. 
\ below an !ELEMENT or !ATTLIST item +1. We never explicitly remove structure except in very unusual cases, so be sure that the committee is in full agreement before doing that. +2. Normally, we just deprecate it, by adding attributes in the DTD file + 1. \ below an !ELEMENT or !ATTLIST item 2. \ for specific attribute values - + ### Adding structure (elements, attributes, attribute-values) 1. For each element - 1. add @ORDERED if it is must be ordered. + 1. add @ORDERED if it is must be ordered. 2. read more details below. 2. For each attribute - 1. add @VALUE or @METADATA to an !ATTLIST if the attribute is non-distinguishing. (See the spec for what this means) - 1. **@VALUE should never occur except on leaf nodes!** (There are some cases before we realized this was a mistake.) + 1. add @VALUE or @METADATA to an !ATTLIST if the attribute is non-distinguishing. (See the spec for what this means) + 1. **@VALUE should never occur except on leaf nodes!** (There are some cases before we realized this was a mistake.) 2. If the attribute values are a closed set, you can add them explicitly, like: - \ 3. Otherwise 1. Make it NMTOKEN where only single values are allowed, or NMTOKENS otherwise (CDATA in rare cases, but clear with the committee first) - 2. Add validity information to attributeValueValidity.xml + 2. Add validity information to attributeValueValidity.xml 3. **Never introduce any default DTD attribute values.** (There are some cases before we realized this was a mistake.) 4. For each attribute - 1. add @VALUE or @METADATA to an !ATTLIST if the attribute is non-distinguishing. (See the spec for what this means) + 1. add @VALUE or @METADATA to an !ATTLIST if the attribute is non-distinguishing. (See the spec for what this means) 2. add @ORDERED to an !ELEMENT. - + Add the annotations. ### ldml.dtd 1. **Attribute Value.** - Certain values have special sorting behavior. These are listed in **CLDRFile.getAttributeValueComparator**. 
They look like:: - - attribute.equals("day") - - || attribute.equals("type") && - - element.endsWith("FormatLength") - - || element.endsWith("Width") - - ... + - attribute.equals("day") + - || attribute.equals("type") && + - element.endsWith("FormatLength") + - || element.endsWith("Width") + - ... - Those need to be updated, or an exception will be thrown when the items are processed. *Note that this is different than the sort order used in PathHeader for the survey tool.* - - To fix them, look at the code and find the right comparator, then modify. Example: - - widthOrder = (MapComparator) new MapComparator().add(new String\[\] {"abbreviated", "narrow", "short", "wide"}).freeze(); + - To fix them, look at the code and find the right comparator, then modify. Example: + - widthOrder = (MapComparator) new MapComparator().add(new String\[\] {"abbreviated", "narrow", "short", "wide"}).freeze(); 2. **Survey Tool Data.** Add information so that the Survey Tool can display these properly to translators - 1. PathHeader.txt (tools/java/org/unicode/cldr/util/data/) - provides the information for what section of the Survey Tool this item shows up in, and how it sorts. - 1. Edit as described in [PathHeader](https://cldr.unicode.org/development/updating-dtds). + 1. PathHeader.txt (tools/java/org/unicode/cldr/util/data/) - provides the information for what section of the Survey Tool this item shows up in, and how it sorts. + 1. Edit as described in [PathHeader](https://cldr.unicode.org/development/updating-dtds). 2. PathDescription.txt (tools/java/org/unicode/cldr/util/data/) - provides a description of what the field is, for translators. 1. If it needs more explanation, add a section (or perhaps a whole page) to the translation guide, eg http://cldr.org/translation/plurals. 2. For an example, see [8479](https://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket) @@ -83,26 +83,26 @@ Add the annotations. 1. If the value has placeholders ({0}, {1},...) 
then edit this file as described in [Placeholders](https://cldr.unicode.org/development/updating-dtds). 4. The coverageLevels.xml (common/supplemental/coverageLevels) - sets the coverage level for the path. 1. **\[TBD - John\]** - 5. *Making sure paths are visible.* + 5. *Making sure paths are visible.* 1. There are 3 ways for paths to show up in ST even though there are no values in root. See Visible Paths below. - 2. **Examples:** For any value that has placeholders, or is used in other values that have placeholders, add handling code to the **test/ExampleGenerator** so that survey tool users see examples of your structure in place. + 2. **Examples:** For any value that has placeholders, or is used in other values that have placeholders, add handling code to the **test/ExampleGenerator** so that survey tool users see examples of your structure in place. 3. **Cleaning up input.** If there are things you can do to fix the user data on entry, add to **test/DisplayAndInputProcessor** 3. **Survey Tool Tests.** Add those needed to CheckCLDR 1. In particular, add to CheckNew so that people see it **\[TBD, fix this advice\]** 1. If the user's input could be bad, add a survey test to one or more of the tests subclassed from CheckCLDR, to check for bad user input. - 1. Look at test/**CheckDates** to see how this is done. - 2. Run test/**ConsoleCheckCLDR** with various types of invalid input to make sure that they fail. + 1. Look at test/**CheckDates** to see how this is done. + 2. Run test/**ConsoleCheckCLDR** with various types of invalid input to make sure that they fail. 2. To update the casing files used by CheckConsistentCasing, run org.unicode.cldr.test.CasingInfo -l \ which will update the casing files in common/casing. When you check this in, sanity check the values, because in some cases we have had different rules than just what the heuristics generate. 3. TEST out the **SurveyTool** to verify that you can see/edit the new items.
If users should be able to input data and are not able to, the item has not been properly added to CLDR. See [Running the Survey Tool in Eclipse](https://cldr.unicode.org/development/running-survey-tool). 4. **Data.** - 1. Add necessary data to root and English. + 1. Add necessary data to root and English. 2. (Optional) add additional data for locales (if part of main). If the data is just seed data (that you aren't sure of), make sure that you have draft="unconfirmed" on the leaf nodes. - + ### supplementalData.dtd 1. Add code to util/SupplementalDataInfo to fetch the data. 2. You should develop a chart program that shows your data in http://www.unicode.org/cldr/data/charts/supplemental/index.html - + ### Structure Requirements @@ -114,9 +114,9 @@ We never have "mixed" content. That is, no element values can occur in anything There is a strong distinction between *rule elements and structure elements*. Example: in collations you have \

x\

\

y\

representing x < y. Clearly changing the order would cause problems! There are restrictions on this, however: -1. Rule elements must be written in the same order they are read. -2. They can't inherit. -3. You can't (easily) add to them programmatically. +1. Rule elements must be written in the same order they are read. +2. They can't inherit. +3. You can't (easily) add to them programmatically. 4. You can't mix rule and structure elements under the same parent element. That is, if you can have \\...\\...\\, then either y and z must *both* be rule or *both* be structure elements. 5. In our code, rule elements have their ordering preserved by a fake attribute added when reading, \_q="nnn". 6. The CLDRFile code has a list of these, in the right order, as **orderedElements**. If you ever add a rule element to a DTD, you MUST add it there. Be careful to preserve the above invariants. @@ -127,7 +127,7 @@ In order to write out an XML file correctly, we also have to know the valid orde The subelements of an element will vary between \* and ?. Note however that all leaf nodes MUST allow for the attributes alt=... draft=... and references=.... So that the alt can work, the leaf nodes MUST occur in their parent as \*, not ?, even if logically there can be only one. For example, even though logically there is only a single quotationStart, we see: - \ the type is a distinguishing attribute. The **non-distinguishing** attributes instead carry information, and aren't relevant to the identity of the path, nor are they used in the ordering above. ***Non-distinguishing elements in the ldml DTD cause problems: try to design all future DTD structure to avoid them; put data in element values, not attribute values.*** It is ok to have data in attributes in the other DTDs. The distinction between the distinguishing and non-distinguishing elements is captured in the distinguishingData in CLDRFile. So by default, always put new ldml attributes in this array.
- *(Note: we should change this to be exclusive instead of inclusive, to reduce the possibility for error.)* - + #### Attribute Values We use some default attribute values in our DTD, such as - \ - + This was a mistake, since it makes the interpretation of the file depend on the DTD; we might fix it some day, maybe if we go to Relax, but for now just don't introduce any more of these. It also means that we have a table in CLDRFile with these values: defaultSuppressionMap. When you make a draft attribute on a new element, don't copy the old ones like this: @@ -177,7 +177,7 @@ That is, we *don't* want the deprecated values on new elements. Just make it: The DTD cannot do anything like the level of testing for legitimate values that we need, so supplemental data also has a set of attributeValueValidity.xml data for checking attribute values. For example, we see: - \$\_bcp47\_calendar\ - + This means that whenever you see any matching dtd/element/attribute combination, it can be tested for a list of values that are contained in the variable \$\_bcp47\_calendar. Some of these variables are lists, and some are regex, and some (those with $\_) are generated internally from other information. When you add a new attribute to ldml, you must add a \ element unless it is a closed set. @@ -209,11 +209,11 @@ These are also in the header of PathHeader.txt: - \# Be careful, order matters. It is used to determine the order on the page and in menus. Also, be sure to put longer matches first, unless terminated with $. - \# The quoting of \\\[ is handled automatically, as is alt=X - - \# If you add new paths, change @type="..." => @type="%A" - - \# The syntax &function(data) means that a function generates both the string and the ordering. The functions MUST be supported in PathHeader.java - - \# The only function that can be in Page right now are &metazone and &calendar, and NO functions can be in Section + - \# If you add new paths, change @type="..." 
=> @type="%A" + - \# The syntax &function(data) means that a function generates both the string and the ordering. The functions MUST be supported in PathHeader.java + - \# The only function that can be in Page right now are &metazone and &calendar, and NO functions can be in Section - \# A \* at the front (like \*$1) means to not change the sorting group. - + There are a set of variables at the top of the file. These all are in parens, so the %A, %E, and %E correspond to the $1, $2, and $3 in the \
; \ ; \
; \ The order of the section and page is determined by the enums in the PathHeader.java file. So the \
and \ must correspond to those enum values. @@ -249,14 +249,14 @@ The return value is the appearance to the user. For example, the following chang If a value has placeholders, edit Placeholders.txt: 1. Add 1 item per placeholder, with the form - - \ ; {0}=\ \ ; {1}=\ \ ... - - ^//ldml/units/unit\\\[@type="day%A"\]/unitPattern ; {0}=NUMBER\_OF\_DAYS 3 + - \ ; {0}=\ \ ; {1}=\ \ ... + - ^//ldml/units/unit\\\[@type="day%A"\]/unitPattern ; {0}=NUMBER\_OF\_DAYS 3 2. There is a variable %A that will match attribute value syntax (or substrings). 3. \ may contain spaces, but \ must not. 4. For an example, see [8484](https://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket) 5. Check that the ConsoleCheckCLDR **CheckForExamplars** fails if there are no placeholders in the value 6. Note: we should switch methods so that we don't need to quote \\\[, etc, but we haven't yet. - + ## PathDescription This file provides a description of each kind of path, and a link to a section of https://cldr.unicode.org/translation. Easiest is to take an existing description and modify. @@ -283,7 +283,7 @@ Modify the following files as described in [ldml2icu\_readme.txt](https://home.u 1. ldml2icu\_locale.txt and/or 2. ldml2icu\_supplemental.txt - + Unfortunately, you have to change input parameters to get the different kinds of generated files. Here's an example: \-s {workspace-cldr}/common/supplemental @@ -310,7 +310,7 @@ There are three ways for paths to show up in the Survey Tool (and in other tooli - Check to make sure that all of the special alt values in en.xml are there. 1. **extraPaths.** This is used for algorithmically computed paths *that **do** depend on the locale*. For example, we generate count values based on the plural rules. The 'other' form must be in root, but all other forms are calculated here. This should not be overused, since it is recalculated dynamically, whereas root and code\_fallback are constant over the life of the ST. 
- To modify, look at CLDRFile.getRawExtraPaths(). - + ### Gotchas @@ -318,7 +318,7 @@ There are three ways for paths to show up in the Survey Tool (and in other tooli - **PathHeader:** Special items are suppressed (they all have HIDE on them). This is used for all paths that don't vary by locale. Paths can also be marked as having unmodifiable values. - **Coverage:** If a path has too high a coverage level, then it will be hidden. - **Other stuff?** \[Steven to fill out\]. - + ### OK if Missing @@ -328,7 +328,7 @@ Certain paths don't have to be present in locales. They are not counted as Missi The following is an example of the different files that may need to be modified. It has both count= and a placeholder, so it hits most of the kinds of changes. - https://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket - + ## Modifying English/Root @@ -337,10 +337,9 @@ Whenever you modify values in English or Root, be sure to run GenerateBirth as d ## Validation - **Do the steps on** [**Running Tests**](https://cldr.unicode.org/development/running-tests) - + ## Debugging Regexes - Moved to [**Running Tests**](https://cldr.unicode.org/development/running-tests) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/downloads/brs-copy-en_gb-to-en_001.md b/docs/site/downloads/brs-copy-en_gb-to-en_001.md index bdb4bc2bb4d..9561f97474f 100644 --- a/docs/site/downloads/brs-copy-en_gb-to-en_001.md +++ b/docs/site/downloads/brs-copy-en_gb-to-en_001.md @@ -10,18 +10,17 @@ The program **CompareEn.java** can be used to copy data from en\_GB up to en\_00 Options: - \-u (uplevel) — move elements from en\_GB into en\_001. By default, the output directory is common/main and common/annotations in trunk - - If not present, just write a comparison to Generated/cldr/comparison/en.txt + - If not present, just write a comparison to Generated/cldr/comparison/en.txt - \-v (verbose) — provide verbose output - + 1.
Run with no options first. - 1. That generates a file that indicates what changes would be made. - 2. Put that file in a spreadsheet - 3. Post to the CLDR TC for review. - 4. You'll then want to retract any items that shouldn't be copied. - 5. Change CompareEn.java if there are paths that should be skipped in the future. -2. Once you agree on the results, you'll run -u. - 1. That will modify your local copy of en\_001.xml - 2. Then do a diff with HEAD to make sure it matches expectations + 1. That generates a file that indicates what changes would be made. + 2. Put that file in a spreadsheet + 3. Post to the CLDR TC for review. + 4. You'll then want to retract any items that shouldn't be copied. + 5. Change CompareEn.java if there are paths that should be skipped in the future. +2. Once you agree on the results, you'll run -u. + 1. That will modify your local copy of en\_001.xml + 2. Then do a diff with HEAD to make sure it matches expectations 3. Then check in en\_001.xml and CompareEn.java -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index.md b/docs/site/index.md index ea83ac87700..1f31333fd94 100644 --- a/docs/site/index.md +++ b/docs/site/index.md @@ -6,37 +6,37 @@ title: Unicode CLDR Project ## News -- **2024-05-14 CLDR v46 - [Survey tool open for general submission](https://cldr.unicode.org/translation)** -- **2024-04-17 [CLDR v45](https://cldr.unicode.org/index/downloads/cldr-45) released** -- **2023-12-13 [CLDR v44.1](https://cldr.unicode.org/index/downloads/cldr-44#h.nvqx283jwsx) released (an update to CLDR v44)** +- **2024-05-14 CLDR v46 - [Survey tool open for general submission](https://cldr.unicode.org/translation)** +- **2024-04-17 [CLDR v45](https://cldr.unicode.org/index/downloads/cldr-45) released** +- **2023-12-13 [CLDR v44.1](https://cldr.unicode.org/index/downloads/cldr-44#h.nvqx283jwsx) released (an update to CLDR v44)** - **2023-10-31 [CLDR
v44](https://cldr.unicode.org/index/downloads/cldr-44) released** - + ## What is CLDR? The Unicode Common Locale Data Repository (CLDR) provides key building blocks for software to support the world's languages, with the largest and most extensive standard repository of locale data available. This data is used by a [wide spectrum of companies](https://cldr.unicode.org/index#h.ezpykkomyltl) for their software internationalization and localization, adapting software to the conventions of different languages for such common software tasks. It includes: -- **Locale-specific patterns for formatting and parsing:** dates, times, timezones, numbers and currency values, measurement units,… -- **Translations of names:** languages, scripts, countries and regions, currencies, eras, months, weekdays, day periods, time zones, cities, and time units, emoji characters and sequences (and search keywords),… -- **Language & script information:** characters used; plural cases; gender of lists; capitalization; rules for sorting & searching; writing direction; transliteration rules; rules for spelling out numbers; rules for segmenting text into graphemes, words, and sentences; keyboard layouts… -- **Country information:** language usage, currency information, calendar preference, week conventions,… +- **Locale-specific patterns for formatting and parsing:** dates, times, timezones, numbers and currency values, measurement units,… +- **Translations of names:** languages, scripts, countries and regions, currencies, eras, months, weekdays, day periods, time zones, cities, and time units, emoji characters and sequences (and search keywords),… +- **Language & script information:** characters used; plural cases; gender of lists; capitalization; rules for sorting & searching; writing direction; transliteration rules; rules for spelling out numbers; rules for segmenting text into graphemes, words, and sentences; keyboard layouts… +- **Country information:** language usage, currency information, 
calendar preference, week conventions,… - **Validity:** Definitions, aliases, and validity information for Unicode locales, languages, scripts, regions, and extensions,… - + CLDR uses the XML format provided by [UTS #35: Unicode Locale Data Markup Language (LDML)](http://www.unicode.org/reports/tr35/). LDML is a format used not only for CLDR, but also for general interchange of locale data, such as in Microsoft's .NET. ## Who uses CLDR? Some of the companies and organizations that use CLDR are: -- Apple (macOS, iOS, watchOS, tvOS, and several applications; Apple Mobile Device Support and iTunes for Windows; …) -- Google (Web Search, Chrome, Android, Adwords, Google+, Google Maps, Blogger, Google Analytics, …) -- IBM (DB2, Lotus, Websphere, Tivoli, Rational, AIX, i/OS, z/OS, …) +- Apple (macOS, iOS, watchOS, tvOS, and several applications; Apple Mobile Device Support and iTunes for Windows; …) +- Google (Web Search, Chrome, Android, Adwords, Google+, Google Maps, Blogger, Google Analytics, …) +- IBM (DB2, Lotus, Websphere, Tivoli, Rational, AIX, i/OS, z/OS, …) - Meta (Facebook, Messenger, WhatsApp, …) - Microsoft (Windows, Office, Visual Studio, …) *and many others, including:* - ABAS Software, Adobe, Amazon (Kindle), Amdocs, Apache, Appian, Argonne National Laboratory, Avaya, Babel (Pocoo library), BAE Systems Geospatial eXploitation Products, BEA, BluePhoenix Solutions, BMC Software, Boost, BroadJump, Business Objects, caris, CERN, CLDR Engine, Debian Linux, Dell, Eclipse, eBay, elixir-cldr, EMC Corporation, ESRI, Firebird RDBMS, FreeBSD, Gentoo Linux, GroundWork Open Source, GTK+, Harman/Becker Automotive Systems GmbH, HP, Hyperion, Inktomi, Innodata Isogen, Informatica, Intel, Interlogics, IONA, IXOS, Jikes, jQuery, Library of Congress, Mathworks, Mozilla, Netezza, OpenOffice, Oracle (Solaris, Java), Lawson Software, Leica Geosystems GIS & Mapping LLC, Mandrake Linux, OCLC, Perl, Progress Software, Python, Qt, QNX, Rogue Wave, SAP, Shutterstock, SIL, SPSS, 
Software AG, SuSE, Symantec, Teradata (NCR), ToolAware, Trend Micro, Twitter, Virage, webMethods, Wikimedia Foundation (Wikipedia), Wine, WMS Gaming, XyEnterprise, Yahoo!, Yelp - + There are other projects which consume cldr-json directly, see [here](https://github.com/unicode-org/cldr-json/blob/master/USERS.md#projects) for a list. @@ -81,8 +81,7 @@ The two important periods for translators are: - Submission: translators are asked to flesh out missing data, and check for consistency. - Vetting: translators are asked to review all changed or conflicted values, and reach consensus. - + The details for the current release are found in [Current CLDR Cycle](https://docs.google.com/spreadsheets/d/1N6inI5R84UoYlRwuCNPBOAP7ri4q2CmJmh8DC5g-S6c/edit#gid=1680747936). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/acknowledgments.md b/docs/site/index/acknowledgments.md index 1b7cfc50ae5..30549bd100a 100644 --- a/docs/site/index/acknowledgments.md +++ b/docs/site/index/acknowledgments.md @@ -2068,4 +2068,3 @@ The first CLDR version under the sponsorship of the Unicode Consortium was versi Thanks to the following people for their contributions to the CLDR 1.0 and LDML 1.0: Helena Shih Chapman, Mark Davis, Simon Dean, Deborah Goldsmith, Steven R Loomis, Kentaroh Noji, George Rhoten, Baldev Soor, Michael Twomey, Ram Viswanadha and Vladimir Weinstein. Special thanks to Akio Kido, Hideki Hiura, Tom Garland, and the OpenI18N organization for the sponsorship of this activity, and to the ICU team for hosting the CVS repository and collecting and managing the data for the project. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/bcp47-extension.md b/docs/site/index/bcp47-extension.md index ce9b98107d2..ca83d654500 100644 --- a/docs/site/index/bcp47-extension.md +++ b/docs/site/index/bcp47-extension.md @@ -6,9 +6,9 @@ title: Unicode Extensions for BCP 47 [IETF BCP 47 *Tags for Identifying Languages*](https://www.rfc-editor.org/info/bcp47) defines the language identifiers (tags) used on the Internet and in many standards. It has an extension mechanism that allows additional information to be included. The Unicode Consortium is the maintainer of the extension ‘u’ for Locale Extensions, as described in [rfc6067](https://datatracker.ietf.org/doc/html/rfc6067), and the extension 't' for Transformed Content, as described in [rfc6497](https://datatracker.ietf.org/doc/html/rfc6497). -- The subtags available for use in the 'u' extension provide language tag extensions that provide for additional information needed for identifying locales. The 'u' subtags consist of a set of keys and associated values (types). For example, a locale identifier for British English with numeric collation has the following form: en-GB-**u-kn-true** +- The subtags available for use in the 'u' extension provide language tag extensions that provide for additional information needed for identifying locales. The 'u' subtags consist of a set of keys and associated values (types). For example, a locale identifier for British English with numeric collation has the following form: en-GB-**u-kn-true** - The subtags available for use in the 't' extension provide language tag extensions that provide for additional information needed for identifying transformed content, or a request to transform content in a certain way. For example, the language tag "ja-Kana-t-it" can be used as a content tag indicating Japanese Katakana transformed from Italian. It can also be used as a request for a given transformation.
- + For more details on the valid subtags for these extensions, their syntax, and their meanings, see LDML Section 3.7 [*Unicode BCP 47 Extension Data*](https://www.unicode.org/reports/tr35/#Locale_Extension_Key_and_Type_Data). @@ -42,4 +42,3 @@ Each release has an associated data directory of the form "http://unicode.org/Pu For each release after CLDR 1.8, types introduced in that release are also marked in the data files by the XML attribute "since", such as in the following example: \ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/charts.md b/docs/site/index/charts.md index 6a896612019..df10a4aba4e 100644 --- a/docs/site/index/charts.md +++ b/docs/site/index/charts.md @@ -18,26 +18,25 @@ Most charts have "double links" somewhere in each row. These are links that put ### Version Deltas -- [**Delta Data**](https://www.unicode.org/cldr/charts/latest/delta/index.html) - Data that changed in the current release. +- [**Delta Data**](https://www.unicode.org/cldr/charts/latest/delta/index.html) - Data that changed in the current release. - [**Delta DTDs**](https://www.unicode.org/cldr/charts/latest/supplemental/dtd_deltas.html) - Differences between CLDR DTD's over time. - + ### Locale-Based Data -- [**Verification**](https://www.unicode.org/cldr/charts/latest/verify/index.html) - Constructed data for verification: Dates, Timezones, Numbers -- [**Summary**](https://www.unicode.org/cldr/charts/latest/summary/root.html) - Provides a summary view of the main locale data. Language locales (those with no territory or variant) are presented with fully resolved data; the inherited or aliased data can be hidden if desired. Other locales do not show inherited or aliased data, just the differences from the respective language locale. The English value is provided for comparison (shown as "=" if it is equal to the localized value, and n/a if not available). 
The Sublocales column shows variations across locales. Hovering over each Sublocale value shows a pop-up with the locales that have that value. -- [**By-Type**](https://www.unicode.org/cldr/charts/latest/by_type/index.html) - provides a side-by-side comparison of data from different locales for each field. For example, one can see all the locales that are left-to-right, or all the different translations of the Arabic script across languages. Data that is unconfirmed or provisional is marked by a red-italic locale ID, such as *·bn\_BD·*. -- [**Character Annotations**](https://www.unicode.org/cldr/charts/latest/annotations/index.html) - The CLDR emoji character annotations. -- [**Subdivision Names**](https://www.unicode.org/cldr/charts/latest/subdivisionNames/index.html) - The (draft) CLDR subdivision names (names for states, provinces, cantons, etc.). +- [**Verification**](https://www.unicode.org/cldr/charts/latest/verify/index.html) - Constructed data for verification: Dates, Timezones, Numbers +- [**Summary**](https://www.unicode.org/cldr/charts/latest/summary/root.html) - Provides a summary view of the main locale data. Language locales (those with no territory or variant) are presented with fully resolved data; the inherited or aliased data can be hidden if desired. Other locales do not show inherited or aliased data, just the differences from the respective language locale. The English value is provided for comparison (shown as "=" if it is equal to the localized value, and n/a if not available). The Sublocales column shows variations across locales. Hovering over each Sublocale value shows a pop-up with the locales that have that value. +- [**By-Type**](https://www.unicode.org/cldr/charts/latest/by_type/index.html) - provides a side-by-side comparison of data from different locales for each field. For example, one can see all the locales that are left-to-right, or all the different translations of the Arabic script across languages.
Data that is unconfirmed or provisional is marked by a red-italic locale ID, such as *·bn\_BD·*. +- [**Character Annotations**](https://www.unicode.org/cldr/charts/latest/annotations/index.html) - The CLDR emoji character annotations. +- [**Subdivision Names**](https://www.unicode.org/cldr/charts/latest/subdivisionNames/index.html) - The (draft) CLDR subdivision names (names for states, provinces, cantons, etc.). - [**Collation Tailorings**](https://www.unicode.org/cldr/charts/latest/collation/index.html) - Collation charts (draft) for CLDR locales. - + Other Data -- [**Supplemental Data**](https://www.unicode.org/cldr/charts/latest/supplemental/index.html) - General data that is not part of the locale hierarchy but is still part of CLDR. Includes: *plural rules, day-period rules, language matching, language-script information, territories (countries),* and their *subdivisions, timezones,* and so on. -- **Transform** - (Disabled temporarily) Some of the transforms in CLDR: the transliterations between different scripts. For more on transliterations, see [Transliteration Guidelines](https://cldr.unicode.org/index/cldr-spec/transliteration-guidelines). -- [**Keyboards**](https://www.unicode.org/cldr/charts/latest/keyboards/index.html) - Provides a view of keyboard data: layouts for different locales, mappings from characters to keyboards, and from keyboards to characters. +- [**Supplemental Data**](https://www.unicode.org/cldr/charts/latest/supplemental/index.html) - General data that is not part of the locale hierarchy but is still part of CLDR. Includes: *plural rules, day-period rules, language matching, language-script information, territories (countries),* and their *subdivisions, timezones,* and so on. +- **Transform** - (Disabled temporarily) Some of the transforms in CLDR: the transliterations between different scripts. For more on transliterations, see [Transliteration Guidelines](https://cldr.unicode.org/index/cldr-spec/transliteration-guidelines).
+- [**Keyboards**](https://www.unicode.org/cldr/charts/latest/keyboards/index.html) - Provides a view of keyboard data: layouts for different locales, mappings from characters to keyboards, and from keyboards to characters. For more details on the locale data collection process, please see the [CLDR process](https://cldr.unicode.org/index/process). For filing or viewing bug reports, see [CLDR Bug Reports](https://github.com/unicode-org/cldr/blob/main/docs/requesting_changes.md). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-presentations.md b/docs/site/index/cldr-presentations.md index 13f1d218823..a0cbd73c3d9 100644 --- a/docs/site/index/cldr-presentations.md +++ b/docs/site/index/cldr-presentations.md @@ -11,4 +11,3 @@ October 2021 - [Inflection Points](https://docs.google.com/presentation/d/e/2PACX-1vQLTz0yBlMi7FPBaNLRUiz0VZru5P1rkd3YQev2_VPqM-0ZNoHKQpuF9ll9bO1ynBCraBFQfH8OIfXP/pub?start=false&loop=false&delayms=3000) - [CLDR and Person Names](https://docs.google.com/presentation/d/e/2PACX-1vQ3t_4YIjPzLsKoZLuQjRhiK4QHoKLzjTif9Fabxc0l34chRt9ff7V_8gnvQdCJ1w/pub) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec.md b/docs/site/index/cldr-spec.md index 689df39dc9d..94445ef97f8 100644 --- a/docs/site/index/cldr-spec.md +++ b/docs/site/index/cldr-spec.md @@ -14,4 +14,3 @@ title: CLDR Specifications - [Collation Guidelines](https://cldr.unicode.org/index/cldr-spec/collation-guidelines) - how to construct collation rules - [JSON Bindings for CLDR Data](https://cldr.unicode.org/index/cldr-spec/cldr-json-bindings) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/collation-guidelines.md b/docs/site/index/cldr-spec/collation-guidelines.md index e9c65468728..482037def72 100644 --- a/docs/site/index/cldr-spec/collation-guidelines.md 
+++ b/docs/site/index/cldr-spec/collation-guidelines.md @@ -4,9 +4,9 @@ title: Collation Guidelines # Collation Guidelines -Collation sequences can be quite tricky to specify. +Collation sequences can be quite tricky to specify. -The locale\-based collation rules in Unicode CLDR specify customizations of the standard data for [UTS \#10: Unicode Collation Algorithm](http://www.unicode.org/reports/tr10/#Introduction) (UCA). Requests to change the collation order for a given locale, or to supply additional variants, need to follow the guidelines in this document. +The locale\-based collation rules in Unicode CLDR specify customizations of the standard data for [UTS \#10: Unicode Collation Algorithm](http://www.unicode.org/reports/tr10/#Introduction) (UCA). Requests to change the collation order for a given locale, or to supply additional variants, need to follow the guidelines in this document. ## Filing a Request @@ -90,7 +90,7 @@ Primary Test 1. ...α...Z 2. ...β...A -That is, the words are identical except for α, β, A, and Z, *and* you know that A and Z have a clear primary difference. If we get the above ordering in dictionaries and other sources, you know that the difference between α and β is a primary difference. If we get the opposite ordering than 1,2 above, then you only know that the difference between α and β is *not* a primary difference: it may be secondary or tertiary. +That is, the words are identical except for α, β, A, and Z, *and* you know that A and Z have a clear primary difference. If we get the above ordering in dictionaries and other sources, you know that the difference between α and β is a primary difference. If we get the opposite ordering than 1,2 above, then you only know that the difference between α and β is *not* a primary difference: it may be secondary or tertiary. You now need to distinguish which of the non\-primary level differences you could have. 
So try again, this time seeing if you can find examples of two words that of the following form, where you know that A and Á have a clear secondary difference in the script. @@ -245,4 +245,3 @@ There are a number of pitfalls with collation, so be careful. In some cases, suc 2. **Blocking Contractions.** Contractions can be blocked with CGJ, as described in the Unicode Standard and in the [Characters and Combining Marks FAQ](http://www.unicode.org/faq/char_combmark.html). 3. **Case Combinations.** The lowercase, titlecase, and uppercase variants of contractions need to be supplied, with tertiary differences in that order (regardless of the caseFirst setting). That is, if *ch* is a contraction, then you would have the rules `... ch <<< Ch <<< CH`. Other case variants such as *cH* are excluded because they are unlikely to represent the contraction, for example in *McHugh*. (Therefore, *mchugh* and *McHugh* will be primary different if *ch* adds a primary difference.) \[[\#8248](http://unicode.org/cldr/trac/ticket/8248)] -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/core-data-for-new-locales.md b/docs/site/index/cldr-spec/core-data-for-new-locales.md index b2d1c813989..94b859bce6a 100644 --- a/docs/site/index/cldr-spec/core-data-for-new-locales.md +++ b/docs/site/index/cldr-spec/core-data-for-new-locales.md @@ -30,4 +30,3 @@ Collect and submit the following data, using the [Core Data Submission Form](htt For more information on the other coverage levels refer to [Coverage Levels](https://cldr.unicode.org/index/cldr-spec/coverage-levels)  -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/coverage-levels.md b/docs/site/index/cldr-spec/coverage-levels.md index 4b70ea1228f..95e895f9c27 100644 --- a/docs/site/index/cldr-spec/coverage-levels.md +++ b/docs/site/index/cldr-spec/coverage-levels.md @@ -12,7 +12,7 
@@ You can use the file **common/properties/coverageLevels.txt** (added in v41\) fo The file format is semicolon delimited, with 3 fields per line. - + ```Locale ID ; Coverage Level ; Name``` Each locale ID also covers all the locales that inherit from it. So to get locales at a desired coverage level or above, the following process is used. @@ -105,4 +105,3 @@ For the coverage in the latest released version of CLDR, see [Locale Coverage Ch To see the development version of the rules used to determine coverage, see [coverageLevels.xml](https://github.com/unicode-org/cldr/blob/main/common/supplemental/coverageLevels.xml). For a list of the locales at a given level, see [coverageLevels.txt](https://github.com/unicode-org/cldr/blob/main/common/properties/coverageLevels.txt).  -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/currency-process.md b/docs/site/index/cldr-spec/currency-process.md index c721c516b7c..663c4bb690d 100644 --- a/docs/site/index/cldr-spec/currency-process.md +++ b/docs/site/index/cldr-spec/currency-process.md @@ -15,4 +15,3 @@ There are three stages for new currency symbols (such as the recent Russian, Ind For more information, see [Currency Symbols \& Names](https://cldr.unicode.org/translation/currency-names-and-symbols/currency-names). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/definitions.md b/docs/site/index/cldr-spec/definitions.md index a3b45a895ea..0647a960901 100644 --- a/docs/site/index/cldr-spec/definitions.md +++ b/docs/site/index/cldr-spec/definitions.md @@ -20,4 +20,3 @@ Official languages for a country are not necessarily the same as those with offi ***official minority language*** \- a language that has some official governmental status, but is not an official language of the country or of a substantial region. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/picking-the-right-language-code.md b/docs/site/index/cldr-spec/picking-the-right-language-code.md index 3bb83669538..bcef5c5aa58 100644 --- a/docs/site/index/cldr-spec/picking-the-right-language-code.md +++ b/docs/site/index/cldr-spec/picking-the-right-language-code.md @@ -90,4 +90,3 @@ The Ethnologue is a great source of information, but it must be approached with Wikipedia is also a great source of information, but it must be approached with a certain degree of caution as well. Be sure to follow up on references, not just look at articles. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/plural-rules.md b/docs/site/index/cldr-spec/plural-rules.md index 6e1b96d7eaf..99c8ed7ac50 100644 --- a/docs/site/index/cldr-spec/plural-rules.md +++ b/docs/site/index/cldr-spec/plural-rules.md @@ -332,4 +332,3 @@ other: books 1  ➞ **book**, 2 ➞ **booku**, 3  ➞​ books​ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/cldr-spec/transliteration-guidelines.md b/docs/site/index/cldr-spec/transliteration-guidelines.md index 9cf742bff67..42f7685bfce 100644 --- a/docs/site/index/cldr-spec/transliteration-guidelines.md +++ b/docs/site/index/cldr-spec/transliteration-guidelines.md @@ -351,4 +351,3 @@ For more information, see: - [ISCII\-91](http://www.cdacindia.com/html/gist/down/iscii_d.asp) - [UTS \#35: Locale Data Markup Language (LDML)](http://www.unicode.org/reports/tr35/) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/corrigenda.md b/docs/site/index/corrigenda.md index 7da7241fe63..8feabe5dc5c 100644 --- a/docs/site/index/corrigenda.md +++ b/docs/site/index/corrigenda.md @@ -15,4 +15,4 @@ At this time, there is only one 
Corrigendum: Each release of CLDR is a stable release and may be used as reference material or cited as a normative reference by other specifications. Each version, once published, is absolutely stable and will never change. However, implementations may - and are encouraged to - apply fixes for corrigenda and errata to their use of an appropriate version. For example, an implementation may claim conformance to "CLDR 1.3, as amended by Corrigendum 1". -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/index/downloads.md b/docs/site/index/downloads.md index e1692b0a64c..6041991cd83 100644 --- a/docs/site/index/downloads.md +++ b/docs/site/index/downloads.md @@ -99,7 +99,7 @@ Access to the latest working snapshot of CLDR, and access to data collected for ## JSON Data - The JSON data is available at https://github.com/unicode-cldr/cldr-json - see that page for more information. - + ## Repository Access @@ -116,14 +116,14 @@ For browsing a particular file for a particular version, or revision history of - Go to the latest French LDML file at https://github.com/unicode-org/cldr/blob/master/common/main/fr.xml. - See all the files in a directory structure using https://github.com/unicode-org/cldr. - Find a file using https://github.com/unicode-org/cldr/find/master (click after "cldr /" above the blue box). - + ### Advanced Git Access For more access to the source repository, you can use a git client to check out or export LDML files directly from the repository at https://github.com/unicode-org/cldr.git - You will need "git-lfs" installed to be able to compile the CLDR tools. See https://git-lfs.github.com or use the [Github Desktop](https://desktop.github.com) client.
- + ## Repository Organization @@ -131,42 +131,42 @@ At the top level of each GitHub repository tree, there are a number of special f - *common* — CLDR data corresponding to the release - *annotations* — annotations and TTS names for characters - - *annotationsDerived* — names algorithmically derived based on structure - - *bcp47* — data for unicode locale extensions - - *casing* — intended capitalization for various categories in each language, for use by the Survey Tool - - *collation* — collation LDML files - - *dtd* — the latest XML DTD files for the release - - *main* — the main locale-dependent LDML files - - *properties* — property files in UCD format - - *rbnf* — rule-based number formats - - *segments* — rules for segmenting text - - *subdivisions* — names of region (country) subdivisions. - - *supplemental* — additional files with non-linguistic data. - - *testData* — folders of test data for implementations. - - *transforms* — data for transliteration and other text transforms - - *uca* — customized Unicode collation data + - *annotationsDerived* — names algorithmically derived based on structure + - *bcp47* — data for unicode locale extensions + - *casing* — intended capitalization for various categories in each language, for use by the Survey Tool + - *collation* — collation LDML files + - *dtd* — the latest XML DTD files for the release + - *main* — the main locale-dependent LDML files + - *properties* — property files in UCD format + - *rbnf* — rule-based number formats + - *segments* — rules for segmenting text + - *subdivisions* — names of region (country) subdivisions. + - *supplemental* — additional files with non-linguistic data. + - *testData* — folders of test data for implementations. 
+ - *transforms* — data for transliteration and other text transforms + - *uca* — customized Unicode collation data - *validity* — data for validating BCP47 identifiers -- *docs* — the source of the LDML spec and other documents -- *exemplars/main* — preliminary exemplar character data for locales which do not hav e -- *keyboards* — source files for the CLDR keyboard data - - *seed* — preliminary locales that do not yet have sufficient vetted data. - - *annotations* - - *casing* - - *collation* - - *main* - - *rbnf* - - *transforms* — these folders have the same structure as their counterparts in common. Note that supplemental is not duplicated. +- *docs* — the source of the LDML spec and other documents +- *exemplars/main* — preliminary exemplar character data for locales which do not hav e +- *keyboards* — source files for the CLDR keyboard data + - *seed* — preliminary locales that do not yet have sufficient vetted data. + - *annotations* + - *casing* + - *collation* + - *main* + - *rbnf* + - *transforms* — these folders have the same structure as their counterparts in common. Note that supplemental is not duplicated. - *~~specs~~* — deprecated, with contents moved to docs. -- [*tools*](https://github.com/unicode-org/cldr/tree/master/tools) — source for internal tools for processing CLDR data - - *SurveyConsole* — (not currently deployed) This is a tool providing an operational dashboard for the Survey Tool - - c/*genldml* — The only C language tool, this was used to convert ICU format data into LDML. - - *cldr-apps-watcher* — (not currently deployed) This is a tool which will watch the Survey Tool and ensure that it remains operational. - - [*cldr-apps*](https://github.com/unicode-org/cldr/tree/master/tools/cldr-apps) — Survey Tool source code - - *cldr-unittest* — Unit tests against the CLDR code in the “java” directory. (Not to be confused with CheckCLDR tests.) 
- - [*java*](https://github.com/unicode-org/cldr/tree/master/tools/java) — main source code for the CLDR tooling - - *python* — utility Python code +- [*tools*](https://github.com/unicode-org/cldr/tree/master/tools) — source for internal tools for processing CLDR data + - *SurveyConsole* — (not currently deployed) This is a tool providing an operational dashboard for the Survey Tool + - c/*genldml* — The only C language tool, this was used to convert ICU format data into LDML. + - *cldr-apps-watcher* — (not currently deployed) This is a tool which will watch the Survey Tool and ensure that it remains operational. + - [*cldr-apps*](https://github.com/unicode-org/cldr/tree/master/tools/cldr-apps) — Survey Tool source code + - *cldr-unittest* — Unit tests against the CLDR code in the “java” directory. (Not to be confused with CheckCLDR tests.) + - [*java*](https://github.com/unicode-org/cldr/tree/master/tools/java) — main source code for the CLDR tooling + - *python* — utility Python code - *scripts* — accessory shell scripts, used for CLDR process and Survey Tool deployment - + The common, dtd, and tools folders are in each release. @@ -190,4 +190,3 @@ The 1.0 version of CLDR is described here for historical interest only. It was h | 1.0 DTDs: | http://www.openi18n.org/spec/ldml/1.0/ldml.dtd | | | http://www.openi18n.org/spec/ldml/1.0/ldmlSupplemental.dtd | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/downloads/cldr-43.md b/docs/site/index/downloads/cldr-43.md index e2cd2a50524..a9c7aa8b184 100644 --- a/docs/site/index/downloads/cldr-43.md +++ b/docs/site/index/downloads/cldr-43.md @@ -68,9 +68,9 @@ The following changes are included to allow for better compatibility with certai The only **DTD change** is the additional of alt\="ascii" for time formats: -\ -    \ -\ +\ +    \ +\     \ ## Data Changes @@ -171,7 +171,7 @@ See the Migration section for general data changes. 
## Specification Changes Please see [Modifications](https://www.unicode.org/reports/tr35/tr35-68/tr35.html#Modifications) section in the LDML for full list of items: - + - Removed numbering from sections, to allow for more flexible reorganization of the specification in the future. - [Person Names](https://www.unicode.org/reports/tr35/tr35-68/tr35-personNames.html#Contents) - Brought Person Name Formatting out of tech preview. @@ -256,10 +256,9 @@ None currently. Many people have made significant contributions to CLDR and LDML; see the [Acknowledgments](https://cldr.unicode.org/index/acknowledgments) page for a full listing. - + The Unicode [Terms of Use](https://unicode.org/copyright.html) apply to CLDR data; in particular, see [Exhibit 1](https://unicode.org/copyright.html#Exhibit1). For web pages with different views of CLDR data, see . -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/downloads/cldr-44.md b/docs/site/index/downloads/cldr-44.md index 83d5563fb59..07d7a41a934 100644 --- a/docs/site/index/downloads/cldr-44.md +++ b/docs/site/index/downloads/cldr-44.md @@ -195,11 +195,11 @@ There were generally a relatively small number of additions this cycle; the focu - **PersonNames**: In the process of moving out of Tech Preview, there were structure additions but also changes: - The nameField type prefix was replaced with title, and the nameField type suffix was replaced with two new types generation and credentials. - The sampleName types givenOnly, givenSurnameOnly, given12Surname, full were replaced with new types separating samples for names in the locale from samples for foreign names: nativeG, nativeGS, nativeGGS, nativeFull, foreignG, foreignGS, foreignGGS, foreignFull -- **Redundant values that inherit “sideways” may be removed in production data**: Some data values inherit “sideways” from another element with the same parent, in the same locale. 
For example, consider the following items in the en locale, some added in CLDR 44 to provide clients a way to explicitly select a particular variant across locales (instead of using the default):
+- **Redundant values that inherit “sideways” may be removed in production data**: Some data values inherit “sideways” from another element with the same parent, in the same locale. For example, consider the following items in the en locale, some added in CLDR 44 to provide clients a way to explicitly select a particular variant across locales (instead of using the default):
\British Indian Ocean Territory\ \
-\British Indian Ocean Territory\ \
-\Chagos Archipelago\ \ -Both alt forms inherit sideways from the non\-alt form. Thus in this case, the "biot" variant is redundant and will be removed in production data. Clients that are trying to select the "biot" variant but find it missing should fall back to the non\-alt form. +\British Indian Ocean Territory\ \
+\Chagos Archipelago\ \ +Both alt forms inherit sideways from the non\-alt form. Thus in this case, the "biot" variant is redundant and will be removed in production data. Clients that are trying to select the "biot" variant but find it missing should fall back to the non\-alt form. Similar behavior occurs with plural forms for units, where some plural forms may match and thus fall back to the "other" form. - *Since the last release, Unicode updated its outbound license from the "[Unicode, Inc. License \- Data Files and Software](https://opensource.org/license/unicode-inc-license-agreement-data-files-and-software)" to the "[Unicode License v3](https://opensource.org/license/unicode-license-v3)". All of the substantive terms of the license remain the same. The only changes made were non\-substantive technical edits. The new license is OSI\-approved and has been assigned the SPDX Identifier Unicode\-3\.0\.* @@ -228,4 +228,3 @@ The Unicode [Terms of Use](https://unicode.org/copyright.html) apply to CLDR dat For web pages with different views of CLDR data, see . -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/draft-schedule.md b/docs/site/index/draft-schedule.md index 96b19a3aabc..236f6b36779 100644 --- a/docs/site/index/draft-schedule.md +++ b/docs/site/index/draft-schedule.md @@ -27,13 +27,12 @@ title: Draft Schedule | Feb | Production | 15 | Final candidate tagged | | Mar | Production | 15 | Release | | | | 30 | Survey Tool updated | - + The Spring release is intended to be less data-intensive, with a shorter vetting period because of the December holidays. The 2013 Spring release has just a limited internal data phase, because it needs to be shortened to adjust to the new schedule. 
Rather than having the Survey tool unavailable except during the Main Submission and Vetting periods, it would be available most of the year: It would be taken down during the Resolution phase, and occasionally for a week for updates or structural changes before the Main Submission phase. The two important periods for translators would be: -- Submission: translators are asked to flesh out missing data, and check for consistency. +- Submission: translators are asked to flesh out missing data, and check for consistency. - Vetting: translators are asked to review all changed or conflicted values, and reach consensus. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/json-format-data.md b/docs/site/index/json-format-data.md index 09c315c2c04..7e4b5667e6f 100644 --- a/docs/site/index/json-format-data.md +++ b/docs/site/index/json-format-data.md @@ -18,4 +18,3 @@ The JSON specification for CLDR data can be found [here](https://cldr.unicode.or To access the CLDR 22.1 data in the JSON format, click [HERE](http://www.google.com/url?q=http%3A%2F%2Fwww.unicode.org%2Frepos%2Fcldr-aux%2Fjson%2F22.1%2F&sa=D&sntz=1&usg=AOvVaw3a81SHdsJi_Nf1zWGhF3rs) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/keyboard-workgroup.md b/docs/site/index/keyboard-workgroup.md index 5e1c774fe54..ecea04db721 100644 --- a/docs/site/index/keyboard-workgroup.md +++ b/docs/site/index/keyboard-workgroup.md @@ -48,10 +48,10 @@ Keyboard support is part of a multi-step, often multi-year process of enabling a Three critical parts of initial support for a language in content are: -- Encoding, in [the Unicode Standard](https://www.unicode.org/standard/standard.html) +- Encoding, in [the Unicode Standard](https://www.unicode.org/standard/standard.html) - Display, including fonts and text layout - Input - + Today, the vast majority of the languages of the world are 
already in the Unicode encoding. The open-source Noto font provides a wide range of fonts to support display, and the Unicode character properties play a vital role in display. However, input support often lags many years behind when a script is added to Unicode. @@ -74,9 +74,8 @@ Updates to LDML (UTS#35) Part 7: Keyboards are scheduled to be released as part Implementations - The [SIL Keyman](https://keyman.com/ldml/) project is actively working on an open-source implementation of the LDML format. - + ### How can I get involved? If you want to be engaged in this workgroup, please contact the CLDR Keyboard Subcommittee via the [Unicode contact form](https://corp.unicode.org/reporting/staff-contact.html). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/language-support-levels.md b/docs/site/index/language-support-levels.md index 97bdfde0f61..8669fca9326 100644 --- a/docs/site/index/language-support-levels.md +++ b/docs/site/index/language-support-levels.md @@ -6,13 +6,13 @@ title: Language Support Levels People often ask whether some device or application supports their language. This seems like a simple question: yes or no. But the reality is that there are different levels of support for a language, ranging from allowing the user to read their language on the platform all the way up to having a voice assistant in their language. -This page defines a common set of terminology for language support levels for platforms such as operating systems, browsers, etc. The goal is to have consistent terminology so that people can clearly indicate the level of support for a given language. +This page defines a common set of terminology for language support levels for platforms such as operating systems, browsers, etc. The goal is to have consistent terminology so that people can clearly indicate the level of support for a given language. 
The focus here is on the incremental changes necessary to add a language to a platform that is already a Unicode-Enabled Platform. Note that the term 'language' is used for familiarity, but what needs to be supported are [locales](https://www.google.com/url?q=https://unicode-org.github.io/cldr/ldml/tr35.html%23Unicode_Language_and_Locale_Identifiers&sa=D&source=editors&ust=1717551026933717&usg=AOvVaw3RPCbCtWzpEK4qpEXVzEtJ). ## Support Levels -1. Display - Text in the language can be read by users of the platform. +1. Display - Text in the language can be read by users of the platform. - **Characters** needed for the language are in Unicode. - **Fonts** supporting those characters are installed (or installable) on the platform - **The rendering system** supports the language’s script. @@ -126,4 +126,3 @@ Below are examples of selecting Cherokee on different systems. (Cherokee in Cher Cherokee is not typically a UI language for the OS, meaning the system isn't translated into it. So in practice a user must also select an alternative language such as English that will appear in the UI for any applications that don't support Cherokee. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/locale-coverage.md b/docs/site/index/locale-coverage.md index f7e856cedd3..ded10876d7b 100644 --- a/docs/site/index/locale-coverage.md +++ b/docs/site/index/locale-coverage.md @@ -13,9 +13,9 @@ The following may be listed as Missing features in a Locale Coverage chart, such These are supplied or generated from data supplied in [Core Data for New Locales](https://cldr.unicode.org/index/cldr-spec/core-data-for-new-locales). 1. ***default\_content** — required in supplied core data* -2. ***country\_data** — required in supplied core data* -3. ***time\_cycle** — required in supplied core data* -4. ***likely\_subtags*** — Based on the language population data, a likely subtag mapping is generated. 
For example, from "de" the likely subtags are "de\_Latn\_DE". +2. ***country\_data** — required in supplied core data* +3. ***time\_cycle** — required in supplied core data* +4. ***likely\_subtags*** — Based on the language population data, a likely subtag mapping is generated. For example, from "de" the likely subtags are "de\_Latn\_DE". 5. ***orientation*** — generated from exemplar data ### Moderate @@ -23,16 +23,15 @@ These are supplied or generated from data supplied in [Core Data for New Locales The following are needed at the Moderate level. *The first three should be present before submitting other moderate data* 1. ***casing** — for bicameral scripts, what is the normal casing for different kinds of fields (country names, language names, etc). Used internally in the Survey Tool.* -2. ***plurals** — the number of different plural forms, and the rules for deriving them. See [Plural Rules](https://cldr.unicode.org/index/cldr-spec/plural-rules)* and [Plurals & Units](https://cldr.unicode.org/translation/getting-started/plurals) -3. ***ordinals** — the number of different plural forms, and the rules for deriving them. See [Plural Rules](https://cldr.unicode.org/index/cldr-spec/plural-rules)* and [Plurals & Units](https://cldr.unicode.org/translation/getting-started/plurals) +2. ***plurals** — the number of different plural forms, and the rules for deriving them. See [Plural Rules](https://cldr.unicode.org/index/cldr-spec/plural-rules)* and [Plurals & Units](https://cldr.unicode.org/translation/getting-started/plurals) +3. ***ordinals** — the number of different plural forms, and the rules for deriving them. See [Plural Rules](https://cldr.unicode.org/index/cldr-spec/plural-rules)* and [Plurals & Units](https://cldr.unicode.org/translation/getting-started/plurals) 4. ***collation** — rules for the sorting order for a language.* - + ### Modern The following are needed at the Modern level. 
*The **grammar** should be present before adding grammatical forms (eg for units)* 1. ***grammar** — what are the grammatical forms used in a language, in particular usages.* - + 2. ***romanization** — what are rules for romanizing the language's script.* -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/process.md b/docs/site/index/process.md index eccd63cd543..2ea6937ad01 100644 --- a/docs/site/index/process.md +++ b/docs/site/index/process.md @@ -8,10 +8,10 @@ title: CLDR Process This document describes the Unicode CLDR Technical Committee's process for data collection, resolution, public feedback and release. -- The process is designed to be light-weight; in particular, the meetings are frequent, short, and informal. Most of the work is by email or phone, with a database recording requested changes (See [change request](http://cldr.unicode.org/index/bug-reports)). -- When gathering data for a region and language, it is important to have multiple sources for that data to produce the most commonly used data. The initial versions of the data were based on best available sources, and updates with new and improvements are released twice a year with work by contributors inside and outside of the Unicode Consortium. -- It is important to note that CLDR is a Repository, not a Registration. That is, contributors should NOT expect that their suggestions will simply be adopted into the repository; instead, it will be vetted by other contributors. -- The [CLDR Survey Tool](http://www.unicode.org/cldr/survey_tool.html) is the main channel for collecting data, and bug/feature request are tracked in a database ([CLDR Bug Reports](http://www.unicode.org/cldr/filing_bug_reports.html)). +- The process is designed to be light-weight; in particular, the meetings are frequent, short, and informal. 
Most of the work is by email or phone, with a database recording requested changes (See [change request](http://cldr.unicode.org/index/bug-reports)). +- When gathering data for a region and language, it is important to have multiple sources for that data to produce the most commonly used data. The initial versions of the data were based on best available sources, and updates with new and improvements are released twice a year with work by contributors inside and outside of the Unicode Consortium. +- It is important to note that CLDR is a Repository, not a Registration. That is, contributors should NOT expect that their suggestions will simply be adopted into the repository; instead, it will be vetted by other contributors. +- The [CLDR Survey Tool](http://www.unicode.org/cldr/survey_tool.html) is the main channel for collecting data, and bug/feature request are tracked in a database ([CLDR Bug Reports](http://www.unicode.org/cldr/filing_bug_reports.html)). - The final approval of the release of any version of CLDR is up to the decision of the CLDR Technical Committee. ## Formal Technical Committee Procedures @@ -22,8 +22,8 @@ For more information on the formal procedures for the Unicode CLDR Technical Com The [UTS #35: Locale Data Markup Language (LDML)](http://www.unicode.org/reports/tr35/) specification is kept up to date with each release, with changed or added structure for new data types or other features. -- Requests for changes are entered in the bug/feature request database ([CLDR Bug Reports](http://www.unicode.org/cldr/filing_bug_reports.html)). -- Structural changes are always backwards-compatible.
That is, previous files will continue to work. Deprecated elements remain, although their usage is strongly discouraged. +- Requests for changes are entered in the bug/feature request database ([CLDR Bug Reports](http://www.unicode.org/cldr/filing_bug_reports.html)). +- Structural changes are always backwards-compatible. That is, previous files will continue to work. Deprecated elements remain, although their usage is strongly discouraged. - There is a standing policy for structural changes that require non-trivial code for proper implementation, such as time zone fallback or alias mechanisms. These require design discussions in the Technical Committee that demonstrate correct function according to the proposed specification. ## Data- Submission and Vetting @@ -32,15 +32,15 @@ The contributors of locale data are expected to be language speakers residing in There are two types of data in the repository: -- **Core data** (See [Core data for new locales](http://cldr.unicode.org/index/cldr-spec/minimaldata)): The content is collected from language experts typically with a CLDR Technical Committee member involvement, and is reviewed by the committee. This is required for a new language to be added in CLDR. See also [Exemplar Character Sources](http://www.unicode.org/cldr/filing_bug_reports.html#Exemplar_Characters). +- **Core data** (See [Core data for new locales](http://cldr.unicode.org/index/cldr-spec/minimaldata)): The content is collected from language experts typically with a CLDR Technical Committee member involvement, and is reviewed by the committee. This is required for a new language to be added in CLDR. See also [Exemplar Character Sources](http://www.unicode.org/cldr/filing_bug_reports.html#Exemplar_Characters). - **Common locale data**: This is the bulk of the CLDR data and data collection occurs twice a year using the Survey tool. (See [How to Contribute](http://cldr.unicode.org/#TOC-How-to-Contribute-).) - + The following 4 states are used to differentiate the data contribution levels. The initial data contributions are normally marked as draft; this may be changed once the data is vetted.
-- Level 1: **unconfirmed** -- Level 2: **provisional** -- Level 3: **contributed (= minimally approved)** +- Level 1: **unconfirmed** +- Level 2: **provisional** +- Level 3: **contributed (= minimally approved)** - Level 4: **approved** (equivalent to an absent draft attribute) Implementations may choose the level at which they wish to accept data. They may choose to accept even **unconfirmed** data if having some data is better than no data for their purpose. Approved data are vetted by language speakers; however, this does not mean that the data is guaranteed to be error-free -- this is simply the best judgment of the vetters and the committee according to the process. @@ -61,33 +61,33 @@ There are multiple levels of access and control: These levels are decided by the technical committee and the TC representative for the respective organizations. -- Unicode TC members (full/institutional/supporting) can assign its users to Regular or Guest level, and with approval of the TC, users at the Expert level. -- TC Organizations that are fully engaged in the CLDR Technical Committee are given a higher vote level of 6 votes to reflect their level of expertise and coordination in the working of CLDR and the survey tool as compared to the normal organization vote level of 4 votes -- Liaison or associate members can assign to Guest, or to other levels with approval of the TC. - - The liaison/associate member him/herself gets TC status in order to manage users, but gets a Guest status in terms of voting, unless the committee approves a higher level. +- Unicode TC members (full/institutional/supporting) can assign its users to Regular or Guest level, and with approval of the TC, users at the Expert level. 
+- TC Organizations that are fully engaged in the CLDR Technical Committee are given a higher vote level of 6 votes to reflect their level of expertise and coordination in the working of CLDR and the survey tool as compared to the normal organization vote level of 4 votes +- Liaison or associate members can assign to Guest, or to other levels with approval of the TC. + - The liaison/associate member him/herself gets TC status in order to manage users, but gets a Guest status in terms of voting, unless the committee approves a higher level. - Users assigned to "[unicode.org](http://unicode.org/)" are normally assigned as Guest, but the committee can assign a different level. - + ### Voting Process -- Each user gets a vote on each value, but the strength of the vote varies according to the user level (see table above). -- For each value, each organization gets a vote based on the maximum (not cumulative) strength of the votes of its users who voted on that item. +- Each user gets a vote on each value, but the strength of the vote varies according to the user level (see table above). +- For each value, each organization gets a vote based on the maximum (not cumulative) strength of the votes of its users who voted on that item. - For example, if an organization has 10 Vetters for one locale, if the highest user level who voted has user level of 4 votes, then the vote count attributed to the organization as a whole is 4 for that item. ### Optimal Field Value For each release, there is one optimal field value determined by the following: -- Add up the votes for each value from each organization. -- Sort the possible alternative values for a given field - - by the most votes (descending) - - then by UCA order of the values (ascending) -- The first value is the optimal value (**O**). +- Add up the votes for each value from each organization. 
+- Sort the possible alternative values for a given field + - by the most votes (descending) + - then by UCA order of the values (ascending) +- The first value is the optimal value (**O**). - The second value (if any) is the next best value (**N**). ### Draft Status of Optimal Field Value 1. Let **O** be the optimal value's vote, **N** be the vote of the next best value (or zero if there is none), and G be the number of organizations that voted for the optimal value. Let **oldStatus** be the draft status of the previously released value. - + 2. Assign the draft status according to the first of the conditions below that applies: | **Resulting Draft Status** | **Condition** | @@ -96,27 +96,27 @@ For each release, there is one optimal field value determined by the following: | *contributed* | - O > N and O ≥ 4 and oldstatus < contributed
- O > N and O ≥ 2 and G ≥ 2 | | *provisional* | O ≥ N and O ≥ 2 | | *unconfirmed* | *otherwise* | - -1. *Established* locales are currently found in [coverageLevels.xml](https://github.com/unicode-org/cldr/blob/master/common/supplemental/coverageLevels.xml), with approvalRequirement\[@votes="8"\] + +1. *Established* locales are currently found in [coverageLevels.xml](https://github.com/unicode-org/cldr/blob/master/common/supplemental/coverageLevels.xml), with approvalRequirement\[@votes="8"\] - Some specific items have an even higher threshold. See approvalRequirement elements in [coverageLevels.xml](http://unicode.org/repos/cldr/trunk/common/supplemental/coverageLevels.xml) for details. -2. If the oldStatus is better than the new draft status, then no change is made. Otherwise, the optimal value and its draft status are made part of the new release. +2. If the oldStatus is better than the new draft status, then no change is made. Otherwise, the optimal value and its draft status are made part of the new release. - For example, if the new optimal value does not have the status of **approved**, and the previous release had an **approved** value (one that does not have an error and is not a fallback), then that previously-released value stays **approved** and replaces the optimal value in the following steps. - + It is difficult to develop a formulation that provides for stability, yet allows people to make needed changes. The CLDR committee welcomes suggestions for tuning this mechanism. Such suggestions can be made by filing a [new ticket](https://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket). ## Data- Resolution After the contribution of collecting and vetting data, the data needs to be refined free of errors for the release: -- Collisions errors are resolved by retaining one of the values and removing the other(s). -- The resolution choice is based on the judgment of the committee, typically according to which field is most commonly used. 
- - When an item is removed, an alternate may then become the new optimal value. +- Collisions errors are resolved by retaining one of the values and removing the other(s). +- The resolution choice is based on the judgment of the committee, typically according to which field is most commonly used. + - When an item is removed, an alternate may then become the new optimal value. - All values with errors are removed. -- Non-optimal values are handled as follows - - Those with no votes are removed. +- Non-optimal values are handled as follows + - Those with no votes are removed. - Those with votes are marked with *alt=proposed* and given the draft status: **unconfirmed** - + If a locale does not have minimal data (at least at a provisional level), then it may be excluded from the release. Where this is done, it may be restored to the repository for the next submission cycle. This process can be fine-tuned by the Technical Committee as needed, to resolve any problems that turn up. A committee decision can also override any of the above process for any specific values. @@ -124,20 +124,20 @@ This process can be fine-tuned by the Technical Committee as needed, to resolve For more information see the key links in [CLDR Survey Tool](http://www.unicode.org/cldr/survey_tool.html) (especially the Vetting Phase). **Notes:** -- If data has a formal problem, it can be fixed directly (in CVS) without going through the above process. Examples include: - - syntactic problems in pattern, extra trailing spaces, inconsistent decimals, mechanical sweeps to change attributes, translatable characters not quoted in patterns, changing ' (punctuation mark) to curly apostrophe or s-cedilla to s-comma-below, removing disallowed exemplar characters (non-letter, number, mark, uppercase when there is a lowercase). - - These are changed in-place, without changing the draft status. -- Linguistically-sensitive data should always go through the survey tool. 
Examples include: - - names of months, territories, number formats, changing ASCII apostrophe to U+02BC modifier letter apostrophe or U+02BB modifier letter turned comma, or U+02BD modifier letter reversed comma, adding/removing normal exemplar characters. -- The TC committee can authorize bulk submissions of new data directly (CVS), with all new data marked draft="unconfirmed" (or other status decided by the committee), but only where the data passes the CheckCLDR console tests. -- The survey tool does not currently handle all CLDR data. For data it doesn't cover, the regular bug system is used to submit new data or ask for revisions of this data. In particular: - - Collation, transforms, or text segmentation, which are more complex. - - For collation data, see the comparison charts at [http://www.unicode.org/cldr/comparison\_charts.html](http://www.unicode.org/cldr/comparison_charts.html) or the XML data at [http://unicode.org/cldr/data/common/collation/](http://unicode.org/cldr/data/common/collation/) - - For transforms, see the XML data at [http://unicode.org/cldr/data/common/transforms/](http://unicode.org/cldr/data/common/transforms/) - - Non-linguistic locale data: - - XML data: [http://unicode.org/cldr/data/common/supplemental/](http://unicode.org/cldr/data/common/supplemental/) +- If data has a formal problem, it can be fixed directly (in CVS) without going through the above process. Examples include: + - syntactic problems in pattern, extra trailing spaces, inconsistent decimals, mechanical sweeps to change attributes, translatable characters not quoted in patterns, changing ' (punctuation mark) to curly apostrophe or s-cedilla to s-comma-below, removing disallowed exemplar characters (non-letter, number, mark, uppercase when there is a lowercase). + - These are changed in-place, without changing the draft status. +- Linguistically-sensitive data should always go through the survey tool. 
Examples include: + - names of months, territories, number formats, changing ASCII apostrophe to U+02BC modifier letter apostrophe or U+02BB modifier letter turned comma, or U+02BD modifier letter reversed comma, adding/removing normal exemplar characters. +- The TC committee can authorize bulk submissions of new data directly (CVS), with all new data marked draft="unconfirmed" (or other status decided by the committee), but only where the data passes the CheckCLDR console tests. +- The survey tool does not currently handle all CLDR data. For data it doesn't cover, the regular bug system is used to submit new data or ask for revisions of this data. In particular: + - Collation, transforms, or text segmentation, which are more complex. + - For collation data, see the comparison charts at [http://www.unicode.org/cldr/comparison\_charts.html](http://www.unicode.org/cldr/comparison_charts.html) or the XML data at [http://unicode.org/cldr/data/common/collation/](http://unicode.org/cldr/data/common/collation/) + - For transforms, see the XML data at [http://unicode.org/cldr/data/common/transforms/](http://unicode.org/cldr/data/common/transforms/) + - Non-linguistic locale data: + - XML data: [http://unicode.org/cldr/data/common/supplemental/](http://unicode.org/cldr/data/common/supplemental/) - HTML view: [http://www.unicode.org/cldr/data/diff/supplemental/supplemental.html](http://www.unicode.org/cldr/data/diff/supplemental/supplemental.html) - + ### Prioritization @@ -200,7 +200,6 @@ There is an internal email list for the Unicode CLDR Technical Committee, open t The current Technical Committee Officers are: -- Chair: Mark Davis (Google) +- Chair: Mark Davis (Google) - Vice-Chair: Annemarie Apple (Google) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/process/cldr-data-retention-policy.md b/docs/site/index/process/cldr-data-retention-policy.md index e18d9065c24..22c4520eac6 100644 --- 
a/docs/site/index/process/cldr-data-retention-policy.md +++ b/docs/site/index/process/cldr-data-retention-policy.md @@ -11,4 +11,3 @@ The following guidelines have been discussed by the CLDR technical committee and 1. Territory Names ( //ldml/localeDisplayNames/territories/territory\[@type\="XX"] ) \- Data is to remain in the CLDR for a period of 5 years after the territory code for territory "XX" is deprecated in the IANA Subtag Registry. 2. Metazone Names ( //ldml/dates/timeZoneNames/metazone\[@type\="ZoneName"] \- Data is to remain in the CLDR for a period of 20 years after the metazone becomes "inactive" ( i.e. The zone name is not used in ANY country ). A spreadsheet listing the Inactive Metazones in CLDR and the dates when they became inactive can be found [here](https://docs.google.com/spreadsheets/d/1Oj1IVo2Vg6wtAhk0Xd3HcA04HKZmSPxksIpvduvSYw8/edit#gid=0). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/requesting-additionsupdates-to-cldr-languagepopulation-data.md b/docs/site/index/requesting-additionsupdates-to-cldr-languagepopulation-data.md index 0d21cf1940d..da4a88b7732 100644 --- a/docs/site/index/requesting-additionsupdates-to-cldr-languagepopulation-data.md +++ b/docs/site/index/requesting-additionsupdates-to-cldr-languagepopulation-data.md @@ -16,15 +16,15 @@ For CLDR purposes, the language data focus on the usefulness with computer inter Requests to add or change language/population data must provide the following basic information: -- language name -- 2 or 3-letter language code -- applicable country/region name -- applicable country/region code -- official status (and justification) -- language population in the region -- literacy in the language, where possible +- language name +- 2 or 3-letter language code +- applicable country/region name +- applicable country/region code +- official status (and justification) +- language population in the region +- literacy in the 
language, where possible - links to reliable sources for population/literacy data - + Reliable sources for population data and official status are required for population updates and additions. While [Ethnologue](https://www.google.com/url?q=https%3A%2F%2Fwww.ethnologue.com%2F&sa=D&sntz=1&usg=AOvVaw02Rajsyksb8nOu8MESVtKi) may be a good source for "mother tongue" or native speaker data for more common languages, it is not a sufficient source on its own for population data on most languages. Recent government or NGO-sponsored census data are typically better sources. @@ -42,4 +42,3 @@ http://unicode.org/cldr/trac/ticket/9609 http://unicode.org/cldr/trac/ticket/9601#comment:2 -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/survey-tool.md b/docs/site/index/survey-tool.md index 782d2dbce13..b911d0dfc92 100644 --- a/docs/site/index/survey-tool.md +++ b/docs/site/index/survey-tool.md @@ -34,4 +34,3 @@ To see a summary of the new fields that will be in the next version of CLDR, see For developers, see the [development pages](https://cldr.unicode.org/development/cldr-development-site). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/survey-tool/bulk-data-upload.md b/docs/site/index/survey-tool/bulk-data-upload.md index 3cbfc5c8f7f..9bb6fdfcd07 100644 --- a/docs/site/index/survey-tool/bulk-data-upload.md +++ b/docs/site/index/survey-tool/bulk-data-upload.md @@ -18,14 +18,14 @@ Here are the instructions for a bulk upload (of an XML file in LDML format) to t 6. Click **Choose File** to pick the XML file for that locale on your locale disk 7. Click **Upload as my Submission/Vetting choices** 8. You will see a raw listing of lines in XML, and an error line if the file doesn't validate. - 1. If the file does not validate, fix the file, hit the back button, and go to Step 4. + 1. 
If the file does not validate, fix the file, hit the back button, and go to Step 4. 2. If the file does validate, you'll see a list of XML paths and values. 9. Click **Submit \**. 10. You will see a detailed list of the test results for the items you're submitting. - You can click on an item's path link (left hand side) to view that item in the surveytool - Any items with an error icon  will not be submitted. - If the message is "Item is not writable in the Survey Tool. Please file a ticket." then you will need to [file a ticket](https://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket) instead. These can be filed in a single ticket. Include all the paths and the respective values. - 1. Press "Really Submit As My Vote" to submit all passing items as your vote, or revise the file and start back at Step 4. + 1. Press "Really Submit As My Vote" to submit all passing items as your vote, or revise the file and start back at Step 4. ### Example XML: @@ -70,7 +70,7 @@ Here are the instructions for a bulk upload (of an XML file in LDML format) to t - + @@ -91,4 +91,3 @@ Note: the filename of the XML file doesn't matter ![image](../../images/index/bulkDataUpload0.png) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/survey-tool/coverage.md b/docs/site/index/survey-tool/coverage.md index 4f5b4a8f2b9..539f7eda7cc 100644 --- a/docs/site/index/survey-tool/coverage.md +++ b/docs/site/index/survey-tool/coverage.md @@ -41,4 +41,3 @@ As part of CLDR v35, the coverage level has been refined further. See ticket \#[ 1. *Basic Level* : Cebuano (ceb), Hausa (ha), Igbo (ig), Yoruba (yo) 2. 
*Modern Level*: Somali (so), Javanese (jv) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/survey-tool/faq-and-known-bugs.md b/docs/site/index/survey-tool/faq-and-known-bugs.md index a64956a3471..1c78b8205b8 100644 --- a/docs/site/index/survey-tool/faq-and-known-bugs.md +++ b/docs/site/index/survey-tool/faq-and-known-bugs.md @@ -62,11 +62,10 @@ If you have further questions, or problems with the Survey Tool, send a message ## Known Bugs, Issues, Restrictions -The following are general known bugs and issues. For known issues in the current release, see [Translation Guidelines](https://cldr.unicode.org/translation). +The following are general known bugs and issues. For known issues in the current release, see [Translation Guidelines](https://cldr.unicode.org/translation). 1. The description of bulk uploading (http://cldr.unicode.org/index/survey-tool/upload) has not yet been updated for the new UI. 2. The description of managing users (http://cldr.unicode.org/index/survey-tool/managing-users) has not yet been updated for the new UI. If you find additional problems, please [file a ticket](http://unicode.org/cldr/trac/newticket). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/survey-tool/managing-users.md b/docs/site/index/survey-tool/managing-users.md index 2d216b0f785..35534e632a5 100644 --- a/docs/site/index/survey-tool/managing-users.md +++ b/docs/site/index/survey-tool/managing-users.md @@ -124,4 +124,3 @@ Later on, in the vetting phase, you can send messages with the outstanding dispu 2. OR (unusual case) under languages in http://www.ietf.org/internet\-drafts/draft\-ietf\-ltru\-4645bis\-04\.txt 3. 
However, watch for mistakes, eg someone using "tw" to mean Taiwanese, when it actually means "Twi" -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/index/survey-tool/survey-tool-accounts.md b/docs/site/index/survey-tool/survey-tool-accounts.md index 6487f4459f7..95c9495cd41 100644 --- a/docs/site/index/survey-tool/survey-tool-accounts.md +++ b/docs/site/index/survey-tool/survey-tool-accounts.md @@ -23,4 +23,3 @@ When you request an account, you must list all of the locale(s) that you would l Once you have an account, open the [Unicode CLDR Survey Tool](http://unicode.org/cldr/apps/survey), click the 'Login' button and fill in your email address and password. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/stable-links-info.md b/docs/site/stable-links-info.md index cc44b16c57b..6eec85381cf 100644 --- a/docs/site/stable-links-info.md +++ b/docs/site/stable-links-info.md @@ -37,4 +37,3 @@ This page contains information about updating the various stable links to part o | ~~development version~~ | ~~http://unicode.org/cldr/dev~~ | ~~http://unicode.org/repos/cldr/trunk~~ | ~~nothing needed~~ | | ~~latest version~~ | ~~http://unicode.org/cldr/latest~~ | ~~http://unicode.org/repos/cldr/tags/latest~~ | ~~Need to " svn delete latest " and then " svn copy 24 latest "~~ | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation.md b/docs/site/translation.md index b58091e458a..b0c32429dfa 100644 --- a/docs/site/translation.md +++ b/docs/site/translation.md @@ -10,26 +10,26 @@ During Submission, please read the CLDR Training (if new to the survey tool), pl ### Prerequisites -1. If you're **new to CLDR**, take the CLDR training below. -2. If you're already **experienced with CLDR**, read the Critical reminders section (mandatory). -3. 
Review the **Status and Schedule, New Areas, Survey Tool**, and **Known Issues**. +1. If you're **new to CLDR**, take the CLDR training below. +2. If you're already **experienced with CLDR**, read the Critical reminders section (mandatory). +3. Review the **Status and Schedule, New Areas, Survey Tool**, and **Known Issues**. 4. Once you are ready, go to the [**Survey Tool**](https://st.unicode.org/cldr-apps/) and log in. - + ### Updates -- There are substantial changes to the guidance on dealing with Emoji. Be sure to read the updates in **New Areas**! +- There are substantial changes to the guidance on dealing with Emoji. Be sure to read the updates in **New Areas**! - The Survey Tool Approval column has new icons in the Approval column; see **Survey Tool**. - + When a section below changes, the date will be in the header. ## Status and Schedule The Survey Tool is now open for [Submission](https://cldr.unicode.org/translation/getting-started/survey-tool-phases#h.wqmb27e55b4l) until the start of [Vetting](https://cldr.unicode.org/translation/getting-started/survey-tool-phases#h.ddjb4w32ki37) on June 12th ([schedule](https://docs.google.com/spreadsheets/d/1N6inI5R84UoYlRwuCNPBOAP7ri4q2CmJmh8DC5g-S6c/edit#gid=1680747936)); then the [Vetting phase](https://cldr.unicode.org/translation/getting-started/survey-tool-phases#h.ddjb4w32ki37) lasts until **June 30**. -- **Disconnect error**. If you see a persistent Loading error with a disconnect message or other odd behavior, please [empty your cache](https://cldr.unicode.org/translation). -- Survey Tool email notification may be going to your spam folder. Check your spam folder regularly. +- **Disconnect error**. If you see a persistent Loading error with a disconnect message or other odd behavior, please [empty your cache](https://cldr.unicode.org/translation). +- Survey Tool email notification may be going to your spam folder. Check your spam folder regularly. 
- "**Same as code**" errors - when translating codes for items such as languages, regions, scripts, and keys, it is normally an error to select the code itself as the translated name. If the error appears under Typography, you can ignore it. \[[CLDR-13552](https://unicode-org.atlassian.net/browse/CLDR-13552)\] - + ## New Areas (2024-05-30) ![alt-text](./images/translation-horizontal-emojis.png) @@ -42,63 +42,63 @@ Seven new emoji have been added (images above). These will be released in Unicod ### Emoji search keywords -1. **Important Notes** +1. **Important Notes** 1. **The Additions from WhatsApp are not listed as Missing in the Dashboard.** - 1. **They are listed instead under the Abstained label, and show up with ☑️ in the main window in the A column.** + 1. **They are listed instead under the Abstained label, and show up with ☑️ in the main window in the A column.** 2. **So be sure the Abstained label is checked.** - 3. **If you have too many Abstained items to deal with, handle the emoji first.** + 3. **If you have too many Abstained items to deal with, handle the emoji first.** 2. **The usage model is:** - 1. The user types one or more words in an emoji search field. + 1. The user types one or more words in an emoji search field. 2. Each word successively narrows a number of emoji in a results box. - - heart → 🥰 😘 😻 💌 💘 💝 💖 💗 💓 💞 💕 💟 ❣️ 💔 ❤️‍🔥 ❤️‍🩹 ❤️ 🩷 🧡 💛 💚 💙 🩵 💜 🤎 🖤 🩶 🤍 💋 🫰 🫶 🫀 💏 💑 🏠 🏡 ♥️ 🩺 - - Blue → 🥶 😰 💙 🩵 🫐 👕 👖 📘 🧿 🔵 🟦 🔷 🔹 🏳️‍⚧️ - - heart blue → 💙 🩵 - 3. A word with no hits is ignored - - \[heart | blue | confabulation\] is equivalent to \[heart | blue\] - 4. As the user types a word, each character added to the word narrows the results. - 5. Whenever the list is short enough to scan, the user will mouse-click on the right emoji - so it doesn't have to be narrowed too far. - - In the following, the user would just click on 🎉 if that works for them. - 1. celebrate → 🥳 🥂 🎈 🎉 🎊 🪅 - 6. The order of words doesn't matter; nor does upper- versus lowercase. 
- 3. **The limits on the number of keywords per emoji have been relaxed** in the beginning, but will be decreased to the final limit (20) soon. So please work on reducing duplicates and breaking up multi-word search keywords. - 4. **Don't follow the English emoji names and keywords literally**; they are *just* for comparison. The names and keywords should reflect **your** cultural associations with the emoji images, and should match what **users of your language** are most likely to search in order to find emoji. - 1. English phrases like "give up" = surrender are often translated as single words in other languages. *Don't just translate each word!* For example, in \[**hold** |… | **shut** |… | **tongue** |… | **up** |… | **your**\], the corresponding phrases are "shut up" and "hold your tongue". + - heart → 🥰 😘 😻 💌 💘 💝 💖 💗 💓 💞 💕 💟 ❣️ 💔 ❤️‍🔥 ❤️‍🩹 ❤️ 🩷 🧡 💛 💚 💙 🩵 💜 🤎 🖤 🩶 🤍 💋 🫰 🫶 🫀 💏 💑 🏠 🏡 ♥️ 🩺 + - Blue → 🥶 😰 💙 🩵 🫐 👕 👖 📘 🧿 🔵 🟦 🔷 🔹 🏳️‍⚧️ + - heart blue → 💙 🩵 + 3. A word with no hits is ignored + - \[heart | blue | confabulation\] is equivalent to \[heart | blue\] + 4. As the user types a word, each character added to the word narrows the results. + 5. Whenever the list is short enough to scan, the user will mouse-click on the right emoji - so it doesn't have to be narrowed too far. + - In the following, the user would just click on 🎉 if that works for them. + 1. celebrate → 🥳 🥂 🎈 🎉 🎊 🪅 + 6. The order of words doesn't matter; nor does upper- versus lowercase. + 3. **The limits on the number of keywords per emoji have been relaxed** in the beginning, but will be decreased to the final limit (20) soon. So please work on reducing duplicates and breaking up multi-word search keywords. + 4. **Don't follow the English emoji names and keywords literally**; they are *just* for comparison. The names and keywords should reflect **your** cultural associations with the emoji images, and should match what **users of your language** are most likely to search in order to find emoji. 
+ 1. English phrases like "give up" = surrender are often translated as single words in other languages. *Don't just translate each word!* For example, in \[**hold** |… | **shut** |… | **tongue** |… | **up** |… | **your**\], the corresponding phrases are "shut up" and "hold your tongue". 2. **Steps** 1. **Break up multi-word keywords (see the usage model). For example,** - 1. Where *white flag* (🏳️) has \[white waving flag | white flag\] , it is better to replace that with \[white | waving | flag\]. - 2. Because of the usage model, this works far better. - 3. Reduce or remove "[stopwords](https://www.opinosis-analytics.com/knowledge-base/stop-words-explained/)", except with close associations, such as \[down\] with *thumbs down* (👎) + 1. Where *white flag* (🏳️) has \[white waving flag | white flag\] , it is better to replace that with \[white | waving | flag\]. + 2. Because of the usage model, this works far better. + 3. Reduce or remove "[stopwords](https://www.opinosis-analytics.com/knowledge-base/stop-words-explained/)", except with close associations, such as \[down\] with *thumbs down* (👎) 2. **Reduce duplicates (and uncommon synonyms) in meaning. For example,** - 1. If you see \[jump | jumping | bounding | leaping | prancing\], it is better to replace that with just \[jump\] unless you are confident people will frequently use the other forms. + 1. If you see \[jump | jumping | bounding | leaping | prancing\], it is better to replace that with just \[jump\] unless you are confident people will frequently use the other forms. 2. Because each character narrows the results, \[jumping\] is not necessary if you have \[jump\]. 
- - Favor the prefixes: \[jump\] is better than \[jumping\] - - Keep forms where one character word is not the prefix of another, eg \[race | racing\] and \[ride | riding\] + - Favor the prefixes: \[jump\] is better than \[jumping\] + - Keep forms where one character word is not the prefix of another, eg \[race | racing\] and \[ride | riding\] 3. **Add equivalents among gender alternates. For example,** - 1. If a *man scientist* (👨‍🔬) has \[researcher\], add the equivalent to both *women scientist* (👩‍🔬) and *scientist* (🧑‍🔬). - 2. Those equivalents may have different forms in your language, depending on the gender. For example, Forscher (man) vs Forscherin (woman) in German. + 1. If a *man scientist* (👨‍🔬) has \[researcher\], add the equivalent to both *women scientist* (👩‍🔬) and *scientist* (🧑‍🔬). + 2. Those equivalents may have different forms in your language, depending on the gender. For example, Forscher (man) vs Forscherin (woman) in German. 4. **Avoid:** 1. Names of specific people or places except for close associations, such as \[Japan | Japanese \] with *map of Japan* (🗾) or *sushi* (🍣). - - Fictional characters or places are ok, if first used before 1855. + - Fictional characters or places are ok, if first used before 1855. - Certain other names have been verified to be in the public domain (Pinocchio, Dracula). - - Don't add others (post-1855) without verifying with the TC. - 2. Intellectual Property (IP), such as trademarks or names of products, companies, books or movies + - Don't add others (post-1855) without verifying with the TC. + 2. Intellectual Property (IP), such as trademarks or names of products, companies, books or movies 3. Religious references, except for close associations, such as \[Christian | church | chapel\] with *church* ( ⛪), \[cherub | church\] with *baby angel* (👼), \[islam | Muslim | ramadan\] with *star and crescent* (☪️) - 4. Specific terms for sexuality, unless strongly associated with the emoji, eg \[lgbt|lgbtq |... 
\] for *rainbow* (🌈), *rainbow flag* (🏳️‍🌈), and *transgender flag* (🏳️‍⚧️). + 4. Specific terms for sexuality, unless strongly associated with the emoji, eg \[lgbt|lgbtq |... \] for *rainbow* (🌈), *rainbow flag* (🏳️‍🌈), and *transgender flag* (🏳️‍⚧️). 5. **Note:** The English values have also been reviewed and modified for these rules. - + ### New/expanded units -1. Additional units: - 1. **night**, as in "your hotel reservation is for **3 nights**". - 2. **light-speed**, a special unit used in combination with a duration, such as "[light-second](https://en.wikipedia.org/wiki/Light-second)". Because of that limited usage, typically the "-speed" suffix is dropped, and the "light" typically doesn't change for inflections (incl. plurals) - *but this may vary by language.* - 3. **portion-per-1e9**, which will normally be translated as something like [parts per billion](https://en.wikipedia.org/wiki/Parts-per_notation). +1. Additional units: + 1. **night**, as in "your hotel reservation is for **3 nights**". + 2. **light-speed**, a special unit used in combination with a duration, such as "[light-second](https://en.wikipedia.org/wiki/Light-second)". Because of that limited usage, typically the "-speed" suffix is dropped, and the "light" typically doesn't change for inflections (incl. plurals) - *but this may vary by language.* + 3. **portion-per-1e9**, which will normally be translated as something like [parts per billion](https://en.wikipedia.org/wiki/Parts-per_notation). 2. Additional grammatical forms have been added for a few units. - 1. point - meaning the [typographical measurement](https://en.wikipedia.org/wiki/Point_%28typography%29). - 2. milligram-ofglucose-per-deciliter - used for blood sugar measurement - 3. millimeter-ofhg - used for pressure measurements + 1. point - meaning the [typographical measurement](https://en.wikipedia.org/wiki/Point_%28typography%29). + 2. milligram-ofglucose-per-deciliter - used for blood sugar measurement + 3. 
millimeter-ofhg - used for pressure measurements 4. Beaufort - used for [wind speed](https://en.wikipedia.org/wiki/Beaufort_scale) (only in certain countries) - + ### Language names @@ -114,17 +114,17 @@ Once trained and up to speed on Critical reminders (above), log in to the [Surve ### Survey Tool Changes -1. There has been substantial performance work that will show up for the first time. If there are performance issues, please file a ticket with a row URL and an explanation for what happened. -2. In the Dashboard, you can filter the messages instead of jumping to the first one. In the Dashboard header, each notification category (such as "Missing" or "Abstained") has a checkbox determining whether it is shown or hidden. +1. There has been substantial performance work that will show up for the first time. If there are performance issues, please file a ticket with a row URL and an explanation for what happened. +2. In the Dashboard, you can filter the messages instead of jumping to the first one. In the Dashboard header, each notification category (such as "Missing" or "Abstained") has a checkbox determining whether it is shown or hidden. 3. In each row of the vetting page, there is now a visible icon when there are forum messages at the right side of the English column: - 1. 👁️‍🗨️ if there are any open posts - 2. 💬 if there are posts, but all are closed + 1. 👁️‍🗨️ if there are any open posts + 2. 💬 if there are posts, but all are closed 4. For Units and a few other sections, the Pages have changed to reduce the size on the page to improve performance. 1. Pages may be split, and/or retitled - 2. Rows may move to a different page. + 2. Rows may move to a different page. 5. In the Dashboard, the Abstains items will now only have one entry per page. You can use that entry to go to its page, and then fix Abstains on that page. Once you are done on that page, hit the Dashboard refresh button (↺). 
This fixes a performance problem for people with a large number of Abstains, and reduces clutter in the Dashboard. 6. The symbols in the A column have been changed to be searchable in browsers (with *Find in Page*) and stand out - + more on the page. See below for a table. They override the symbols in [Survey Tool Guide: Icons](https://cldr.unicode.org/translation/getting-started/guide#h.fbq7vldvjuz4). @@ -147,33 +147,33 @@ more on the page. See below for a table. They override the symbols in [Survey To This list will be updated as fixes are made available in Survey Tool Production. If you find a problem, please [file a ticket](https://github.com/unicode-org/cldr/blob/main/docs/requesting_changes.md), but please review this list first to avoid creating duplicate tickets. -1. [CLDR-17694](https://unicode-org.atlassian.net/browse/CLDR-17694) - Back button in browser fails in forum under certain conditions -2. [CLDR-17693](https://unicode-org.atlassian.net/browse/CLDR-17693) SurveyTool fatal in getDBConnection -3. [CLDR-17658](https://unicode-org.atlassian.net/browse/CLDR-17658) - Dashboard slowness -4. Images for the plain symbols. Non-emoji such as [€](https://cldr-smoke.unicode.org/smoketest/v#/fr/Symbols2/47925556fd2904b5), √, », ¹, §, ... do not have images in the info pane. \[[CLDR-13477](https://unicode-org.atlassian.net/browse/CLDR-13477)\]**Workaround**: Look at the Code column; unlike the new emoji, your browser should display them there. +1. [CLDR-17694](https://unicode-org.atlassian.net/browse/CLDR-17694) - Back button in browser fails in forum under certain conditions +2. [CLDR-17693](https://unicode-org.atlassian.net/browse/CLDR-17693) SurveyTool fatal in getDBConnection +3. [CLDR-17658](https://unicode-org.atlassian.net/browse/CLDR-17658) - Dashboard slowness +4. Images for the plain symbols. Non-emoji such as [€](https://cldr-smoke.unicode.org/smoketest/v#/fr/Symbols2/47925556fd2904b5), √, », ¹, §, ... do not have images in the info pane. 
\[[CLDR-13477](https://unicode-org.atlassian.net/browse/CLDR-13477)\]**Workaround**: Look at the Code column; unlike the new emoji, your browser should display them there. 5. [CLDR-17683](https://unicode-org.atlassian.net/browse/CLDR-17683) - Some items are not able to be flagged for TC review. This is being investigated.Meanwhile, Please enter forum posts meanwhile with any comments. - + ## Resolved Issues 1. [CLDR-17465](https://unicode-org.atlassian.net/browse/CLDR-17465) - dashboard download fails 2. [CLDR-17671](https://unicode-org.atlassian.net/browse/CLDR-17671) - survey tool search fails 3. [CLDR-17652](https://unicode-org.atlassian.net/browse/CLDR-17652) - Manual import of votes fails - + ## Recent Changes 1. [*CLDR-17658*](https://unicode-org.atlassian.net/browse/CLDR-17658) - In the Dashboard, the Abstains items will only have one entry per page. You can use that entry to go to its page, and then fix Abstains on that page. Once you are done on that page, hit the Dashboard refresh button (↺). This fixes a performance problem for people with a large number of Abstains, and reduces clutter in the Dashboard. - + ## CLDR training (for new linguists) Before getting started to contribute data in CLDR, and jumping in to using the Survey Tool, it is important that you understand the CLDR process & take the CLDR training. It takes about 2-3 hours to complete the training. 1. **Understand the basics about the CLDR process** read the [Survey Tool Guide](https://cldr.unicode.org/translation/getting-started/guide) and an overview of the [Survey Tool Stages](https://cldr.unicode.org/translation/getting-started/survey-tool-phases). - - New: A [video is available](https://www.youtube.com/watch?v=Wxs0TZl7Ljk) which shows how to login and begin contributing data for your locale. -2. 
**Read the Getting Started topics** on the Information Hub: - - General translation guide - - [Capitalization](https://cldr.unicode.org/translation/translation-guide-general/capitalization) + - New: A [video is available](https://www.youtube.com/watch?v=Wxs0TZl7Ljk) which shows how to login and begin contributing data for your locale. +2. **Read the Getting Started topics** on the Information Hub: + - General translation guide + - [Capitalization](https://cldr.unicode.org/translation/translation-guide-general/capitalization) - [Default Content](https://cldr.unicode.org/translation/translation-guide-general/default-content) - [References](https://cldr.unicode.org/translation/translation-guide-general/references) - [Handling Errors and Warnings](https://cldr.unicode.org/translation/getting-started/errors-and-warnings) @@ -198,13 +198,12 @@ You're already familiar with the CLDR process, but do keep the following in mind 4. **Avoid voting for English** - for items that do not work in your language, don't simply use English. Find a solution that works for your language. For example, if your language doesn't have a concept of calendar "quarters", use a translation that describes the concept "three-month period" rather than "quarter-of-a-year". 5. **Watch out for complex sections** and read the instructions carefully if in doubt: 1. [Date & Time](https://cldr.unicode.org/translation/date-time/date-time-names) - - [Names](https://cldr.unicode.org/translation/date-time/date-time-names) - - [Patterns](https://cldr.unicode.org/translation/date-time) - - [Symbols](https://cldr.unicode.org/translation/date-time/date-time-symbols) - 2. [Time zones](https://cldr.unicode.org/translation/time-zones-and-city-names) + - [Names](https://cldr.unicode.org/translation/date-time/date-time-names) + - [Patterns](https://cldr.unicode.org/translation/date-time) + - [Symbols](https://cldr.unicode.org/translation/date-time/date-time-symbols) + 2. 
[Time zones](https://cldr.unicode.org/translation/time-zones-and-city-names) 3. [Plural forms](https://cldr.unicode.org/translation/getting-started/plurals) - + *Tip: The links in the [Info Panel](https://cldr.unicode.org/translation/getting-started/guide#h.2jch1980f8sy) will point you to relevant instructions for the entry you're editing/vetting. Use it if in doubt.* -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/characters.md b/docs/site/translation/characters.md index 51125032235..20714b589c1 100644 --- a/docs/site/translation/characters.md +++ b/docs/site/translation/characters.md @@ -1,6 +1,6 @@ --- title: Characters ---- +--- # Characters @@ -10,4 +10,3 @@ Characters category in the Survey tool include data that surrounds support for E - [Emoji Names and Keywords](https://cldr.unicode.org/translation/characters/short-names-and-keywords) - [Typographic Names](https://cldr.unicode.org/translation/characters/typographic-names) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/characters/character-labels.md b/docs/site/translation/characters/character-labels.md index 353619ad40f..294f9cd6540 100644 --- a/docs/site/translation/characters/character-labels.md +++ b/docs/site/translation/characters/character-labels.md @@ -10,4 +10,3 @@ CLDR has different types of character labels. - **Annotations** are used for more specific features of characters, such as “cactus”. Annotations do not need to be unique. They can be used in predictive typing, such as when typing “p i z” shows🍕 in a suggestion box. - **TTS Labels** are used for Text-to-Speech support, where a character is read aloud. They are typically a shortened and sometimes reworded version of the formal Unicode name. They may be combined with a Category Label for disambiguation. The names may not be unique, although they should be unique within a category. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/characters/short-names-and-keywords.md b/docs/site/translation/characters/short-names-and-keywords.md index 75ad052993a..6192a0acbfb 100644 --- a/docs/site/translation/characters/short-names-and-keywords.md +++ b/docs/site/translation/characters/short-names-and-keywords.md @@ -4,7 +4,7 @@ title: Emoji Names and Keywords # Emoji Names and Keywords -CLDR collects short character names and keywords for Emoji characters and sequences. +CLDR collects short character names and keywords for Emoji characters and sequences. These are found in Survey Tool under **Characters**, and they are divided into different category types. For example, Smileys, People, Animal & Nature, etc... @@ -47,7 +47,7 @@ Many of the emoji names are constructed, which means that in implementations emo 2. Hover over the ⓔ to see how some sample constructed emoji would look in English. ![image](../../images/Screenshot-2024-06-21-at-7.38.34.png) 3. **Characters\Category** contain terms like “flag” (used in constructing flag names). These 3 terms are also marked with ⓔ, so make sure to review each of the examples in English and your language. -4. **Blond/Bearded.** The people with blond hair or beards need to have names consistent with those used for hair styles (see [dark skin tone examples](https://cldr-smoke.unicode.org/cldr-apps/v#/USER/Component/4da6f737d7901c30)), such as: +4. **Blond/Bearded.** The people with blond hair or beards need to have names consistent with those used for hair styles (see [dark skin tone examples](https://cldr-smoke.unicode.org/cldr-apps/v#/USER/Component/4da6f737d7901c30)), such as: 1. [🧔 — man: beard](https://cldr-smoke.unicode.org/cldr-apps/v#/fr/People/20a49c6ad428d880) 2. [👱 — person: blond hair](https://cldr-smoke.unicode.org/cldr-apps/v#/fr/People/5cae8a8d1de49cd9) 3. 
[👱‍♂️ — man: blond hair](https://cldr-smoke.unicode.org/cldr-apps/v#/fr/People/532f430d6e2a26f) @@ -121,7 +121,7 @@ Other common problem cases that must be distinguished. **NOTE that punctuation ### Gender -There are different ways emoji may have gender. +There are different ways emoji may have gender. - No specific gender - smilies or human-form emoji where the gender is hidden, such as person fencing. @@ -136,7 +136,7 @@ For the full triples, we need three unique names: - X2 (=female only; no males) - X3 (=either male or female) -In some languages it may be tricky to do this, especially for the neutral case. +In some languages it may be tricky to do this, especially for the neutral case. Gender-neutral forms @@ -204,4 +204,3 @@ A: For some animals, there are two different emoji, one of which has a name incl For other animals, there is no such distinction. For example, there is only one wolf: 🐺 U+1F43A. In that case, you don't need to use a term corresponding to “face” in your language, even if the English name has the word face (that is often due to historical accident.) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/characters/typographic-names.md b/docs/site/translation/characters/typographic-names.md index bc50609c7e3..adfdf738171 100644 --- a/docs/site/translation/characters/typographic-names.md +++ b/docs/site/translation/characters/typographic-names.md @@ -13,7 +13,7 @@ CLDR maintains typographic terms for apps like word processors, graphic design a A quick web search or Wikipedia lookup will usually not find the correct terms. Most native speakers do not know the correct terminology unless they work in the graphics industry. For some languages, there’s also special-interest sites on the web that care about correct terminology; for example, [typolexikon.de](http://www.typolexikon.de/) in German. -The most common problem is giving the same name to two *different* fields. 
For example, you must *not* give the same name to [wght-400 (English=“regular”)](http://st.unicode.org/cldr-apps/v#/de/Typography/147d124e18ef76e9) and [wdth-100 (English=“normal”)](http://st.unicode.org/cldr-apps/v#/de/Typography/29a3de4cf27e33c6). +The most common problem is giving the same name to two *different* fields. For example, you must *not* give the same name to [wght-400 (English=“regular”)](http://st.unicode.org/cldr-apps/v#/de/Typography/147d124e18ef76e9) and [wdth-100 (English=“normal”)](http://st.unicode.org/cldr-apps/v#/de/Typography/29a3de4cf27e33c6). **However, there is an important exception for Feature fields that have a suffix after a number ("-heavy"), such as** [**wght-900-heavy (English=“heavy”)**](http://st.unicode.org/cldr-apps/v#/de/Typography/292fe4e98aa53cfe)**. You can** ***(and often should)*** **give the same name in your language to these as you give to the Code without a suffix.** @@ -71,4 +71,3 @@ This style is called **reverse oblique, reverse slanted,** or **back slanted.** The **Optical Size axis** is used to adjust letterforms to different text size, from fine print to large display type. See [here](https://en.wikipedia.org/wiki/Font#Optical_size), [here](https://docs.microsoft.com/en-us/typography/opentype/spec/dvaraxistag_opsz), or [here](http://wwwimages.adobe.com/www.adobe.com/content/dam/acom/en/products/type/pdfs/ArnoPro.pdf) (page 11) for a nice description. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/core-data.md b/docs/site/translation/core-data.md index d32f2858fc4..ff5cd0e3998 100644 --- a/docs/site/translation/core-data.md +++ b/docs/site/translation/core-data.md @@ -10,4 +10,3 @@ These pages describe certain core data needed by CLDR. 
 - [Exemplar Characters](https://cldr.unicode.org/translation/core-data/exemplars)
 - [Numbering Systems](https://cldr.unicode.org/translation/core-data/numbering-systems)
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/translation/core-data/characters.md b/docs/site/translation/core-data/characters.md
index 50ab50f18df..71f8e824fad 100644
--- a/docs/site/translation/core-data/characters.md
+++ b/docs/site/translation/core-data/characters.md
@@ -6,7 +6,7 @@ title: Alphabetic Information
 ## Ellipsis Patterns
-Ellipsis patterns are used in a display when the text is too long to be shown. It will be used in environments where there is very little space, so it should be just one character; where that really can't work, it should be as short as possible.
+Ellipsis patterns are used in a display when the text is too long to be shown. It will be used in environments where there is very little space, so it should be just one character; where that really can't work, it should be as short as possible.
 There are three different possible patterns that need to be translated. Typically the same character is used in all three, but three choices are provided just in case different characters would be appropriate in different contexts, for some languages.
@@ -75,4 +75,3 @@ There are special versions of "Yes" and "No" used in POSIX (Portable Operating S
 | No | no:n |
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/translation/core-data/exemplars.md b/docs/site/translation/core-data/exemplars.md
index da3e9196651..0348425e645 100644
--- a/docs/site/translation/core-data/exemplars.md
+++ b/docs/site/translation/core-data/exemplars.md
@@ -68,7 +68,7 @@ The very last line shows an internal UnicodeSet format. You can normally ignore
 ## Exemplar Characters
-The exemplar character sets contain the commonly used letters for a given modern form of a language.
These are used for testing and for determining the appropriate repertoire of letters for various tasks, like choosing charset converters that can handle a given language. The term “letter” is interpreted broadly, and includes characters used to form words, such as 是 or 가. It should not include presentation forms, like [U+FE90](https://util.unicode.org/UnicodeJsps/character.jsp?a=FE90) ( ‎ﺐ‎ ) ARABIC LETTER BEH FINAL FORM, or isolated Jamo characters (for Hangul). +The exemplar character sets contain the commonly used letters for a given modern form of a language. These are used for testing and for determining the appropriate repertoire of letters for various tasks, like choosing charset converters that can handle a given language. The term “letter” is interpreted broadly, and includes characters used to form words, such as 是 or 가. It should not include presentation forms, like [U+FE90](https://util.unicode.org/UnicodeJsps/character.jsp?a=FE90) ( ‎ﺐ‎ ) ARABIC LETTER BEH FINAL FORM, or isolated Jamo characters (for Hangul). - For charts of the standard (non-CJK) exemplar characters, see a chart of the [standard exemplar characters](https://www.unicode.org/cldr/charts/45/by_type/core_data.alphabetic_information.main.html). - For more information, please see [Section 5.6 Character Elements](http://unicode.org/reports/tr35/tr35-6.html#Character_Elements) in UTS#35: Locale Data Markup Language (LDML). @@ -84,7 +84,7 @@ There are different categories: ## Parse Characters -These are sets of characters that are treated as equivalent in parsing. In the Code column you'll see a description of the characters with a sample in parentheses. For example, the following indicates that in date/time parsing, when someone types any of the characters in the Winning column, they should be treated as equivalent to ":". +These are sets of characters that are treated as equivalent in parsing. In the Code column you'll see a description of the characters with a sample in parentheses. 
For example, the following indicates that in date/time parsing, when someone types any of the characters in the Winning column, they should be treated as equivalent to ":". Note that if your language doesn't use any of these characters in date and times, the value doesn't really matter, and you can simply vote for the default value. For example, if a time is represented by "3.20" instead of "3:20", then it doesn't matter which characters are equivalent to ":". @@ -117,4 +117,3 @@ Three possible solutions: The **standard** characters shouldn't contain punctuation. They also should not contain symbols, unless those symbols are only used with the language's writing system (aka script). For example, the **standard** Bengali currency symbols should contain the Bengali Rupee mark (which is Bengali-only), but should not include the $ Dollar Sign (which is common across all scripts). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/core-data/numbering-systems.md b/docs/site/translation/core-data/numbering-systems.md index bff1d442014..9f1166c44a1 100644 --- a/docs/site/translation/core-data/numbering-systems.md +++ b/docs/site/translation/core-data/numbering-systems.md @@ -41,4 +41,3 @@ Indicate "1" for grouping separator starting at 4 digit-numbers (i.e. 1,000 and Note that this is just the default, and the grouping separator may be retained in lists, or removed in other circumstances. For example, in English the "," is used by default, but not in addresses ("12345 Baker Street"), in 4-digit years (2014, but 12,000 BC), and certain other cases. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/translation/currency-names-and-symbols.md b/docs/site/translation/currency-names-and-symbols.md
index b3da22bce38..a6d6fca5cd4 100644
--- a/docs/site/translation/currency-names-and-symbols.md
+++ b/docs/site/translation/currency-names-and-symbols.md
@@ -6,7 +6,7 @@ title: Currency Names and Symbols
 ## Currency Names must be Unique
-Currency names must be unique; the same name can't be used for two different currency codes.
+Currency names must be unique; the same name can't be used for two different currency codes.
 When a country replaces a currency:
@@ -42,4 +42,3 @@ The following general guidelines are used for currency symbols. These guidelines
 These are only general guidelines, and may need to be overridden in particular cases. Certain symbols like the dollar sign are particularly tricky, because they are used by a great many countries.
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/translation/currency-names-and-symbols/currency-names.md b/docs/site/translation/currency-names-and-symbols/currency-names.md
index b12331f9373..439ba8afc12 100644
--- a/docs/site/translation/currency-names-and-symbols/currency-names.md
+++ b/docs/site/translation/currency-names-and-symbols/currency-names.md
@@ -24,4 +24,3 @@ Note: in some cases, the English currency symbol may appear as box, typically be
 | ![image](../../images/currency-names-and-symbols/u20BD.png) | U+20BD Ruble symbol (Russia...)
| Unicode 7, June 2014 | | ![image](../../images/currency-names-and-symbols/u20BE.png) | U+20BE Lari symbol (Georgia) | Unicode 8, June 2015 | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/currency-names-and-symbols/special-cases.md b/docs/site/translation/currency-names-and-symbols/special-cases.md index 8a5595f71ee..b4e2a26b63a 100644 --- a/docs/site/translation/currency-names-and-symbols/special-cases.md +++ b/docs/site/translation/currency-names-and-symbols/special-cases.md @@ -15,7 +15,7 @@ Thus, this old currency symbol for PTE is specified as "PTE" in pt\_CV to avoid Thus, in pt\_CV, the following currency symbol information should be kept. -\ +\   \\ (Note: this is U+200B ZWSP, not nothing) @@ -23,13 +23,12 @@ Thus, in pt\_CV, the following currency symbol information should be kept.   \ \ -\ +\   \PTE\ -  \,\ +  \,\ -  \ \ +  \ \ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/date-time.md b/docs/site/translation/date-time.md index 959abffdeb1..1acb75c72e9 100644 --- a/docs/site/translation/date-time.md +++ b/docs/site/translation/date-time.md @@ -9,4 +9,3 @@ title: Date & Time - [Date/Time Patterns](https://cldr.unicode.org/translation/date-time/date-time-patterns) - [Date/Time Symbols](https://cldr.unicode.org/translation/date-time/date-time-symbols) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/date-time/date-time-names.md b/docs/site/translation/date-time/date-time-names.md index ec07033ebc0..eb06e802c2e 100644 --- a/docs/site/translation/date-time/date-time-names.md +++ b/docs/site/translation/date-time/date-time-names.md @@ -127,7 +127,7 @@ The time span associated with each code is different for different languages! 
 - It shows the time span (with a 24 hour clock) for the code, and then an example (for the format codes).
 - You can also go to the web page [Day Periods](https://www.unicode.org/cldr/charts/45/supplemental/day_periods.html), and look for your language.
- - For example, for Malayalam, you would go to ...[day\_periods.html#ml](https://www.unicode.org/cldr/charts/45/supplemental/day_periods.html#ml) , and see that **morning2** is the period that extends from **06:00** to **12:00**.
+ - For example, for Malayalam, you would go to ...[day\_periods.html#ml](https://www.unicode.org/cldr/charts/45/supplemental/day_periods.html#ml) , and see that **morning2** is the period that extends from **06:00** to **12:00**.
 | | Code | English | German | Russian |
 |---|---|---|---|---|
@@ -249,4 +249,3 @@ There are a number of patterns like “the week of {0}” used for constructions
 \'week' W 'of' MMM\
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
diff --git a/docs/site/translation/date-time/date-time-patterns.md b/docs/site/translation/date-time/date-time-patterns.md
index b89e8c60967..7c87bfc2766 100644
--- a/docs/site/translation/date-time/date-time-patterns.md
+++ b/docs/site/translation/date-time/date-time-patterns.md
@@ -395,4 +395,3 @@ Different calendars work with the data in Gregorian, and Generic in the followin
 - Because the Generic calendar does not have real names for months, weekdays and eras, the Survey Tool examples generated for this calendar may be confusing.
 - Calendars that do not inherit date formats from the Generic calendar are the **East Asian lunar calendars**: Chinese (lunar) and Dangi (Korean lunar). These have special formats involving cyclic names. The Dangu calendar inherits formats from the Chinese calendar data in the same locale, while the Chinese calendar inherits formats directly from the parent locale; that parent locale may be the root locale or inherit these formats directly from the root locale.
For the lunar calendars, the root locale has formats that should be reasonable for use in most locales where the lunar calendars are not one of the primary calendars. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/date-time/date-time-symbols.md b/docs/site/translation/date-time/date-time-symbols.md index 85907c018f3..f393838ace2 100644 --- a/docs/site/translation/date-time/date-time-symbols.md +++ b/docs/site/translation/date-time/date-time-symbols.md @@ -6,15 +6,15 @@ title: Date/Time Symbols Symbols is a required topic to work in [Date/Time Patterns](https://cldr.unicode.org/translation/date-time/date-time-patterns) -More details on date/time symbols and patterns may be found in the Spec [Date Field Symbol Table](http://www.unicode.org/reports/tr35/tr35-dates.html#Date_Field_Symbol_Table). +More details on date/time symbols and patterns may be found in the Spec [Date Field Symbol Table](http://www.unicode.org/reports/tr35/tr35-dates.html#Date_Field_Symbol_Table). ## About Symbols -Dates and times are formatted using patterns, like "mm-dd". Within these patterns, each field, like the month or the hour, is represented by a sequence of letters (“pattern characters”) in the range A–Z or a–z. For example, sequences consisting of one or more ‘M‘ or ‘L‘ stand for various forms of a month name or number. +Dates and times are formatted using patterns, like "mm-dd". Within these patterns, each field, like the month or the hour, is represented by a sequence of letters (“pattern characters”) in the range A–Z or a–z. For example, sequences consisting of one or more ‘M‘ or ‘L‘ stand for various forms of a month name or number. -When the software formats a date for your language, a value will be substituted for each field, according to the following table. 
Examples of the pattern usage you may see in an every day use may be on the lock screen on a mobile device showing the date or time, or as a date stamp on an email. +When the software formats a date for your language, a value will be substituted for each field, according to the following table. Examples of the pattern usage you may see in an every day use may be on the lock screen on a mobile device showing the date or time, or as a date stamp on an email. -Notice in the table below that there are different pattern characters for standalone and formatting. For example M to indicate the formatting and L to indicate the standalone month names. +Notice in the table below that there are different pattern characters for standalone and formatting. For example M to indicate the formatting and L to indicate the standalone month names. Make sure you understand the difference between standalone and formatting patterns and use the appropriate symbols in patterns. See [when to use standalone vs. formatting](https://cldr.unicode.org/translation/date-time/date-time-patterns) in Date and Time patterns. @@ -52,7 +52,7 @@ Make sure you understand the difference between standalone and formatting patter ## Symbol Length -The number of letters in a field indicates the **format.** +The number of letters in a field indicates the **format.** The number of letters used to indicate the format is the same for all date fields EXCEPT for the year. (See table above for y and yy). @@ -62,7 +62,7 @@ The longer forms are only relevant for the fields that are non-numeric, such as ## Standalone vs. Format Styles - This section is relevant to [When to use standalone vs. Formatting](https://cldr.unicode.org/translation/date-time/date-time-patterns) in date/time patterns. + This section is relevant to [When to use standalone vs. Formatting](https://cldr.unicode.org/translation/date-time/date-time-patterns) in date/time patterns. 
 Some languages use two different forms of strings (*standalone* and *format*) depending on the context. Typically the *standalone* version is the nominative form of the word, and the *format* version is in the genitive (or related form).
@@ -107,5 +107,4 @@ Precede months with de or d’ - coordinate with the formats strings, which can'
 | | 2008-4-14 | abril |
 | d MMMM 'de' y | 2008-1-14 | 14 de gener de 2008 |
 | | 2008-4-14 | 14 d’abril de 2008 |
-
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
+
diff --git a/docs/site/translation/date-time/date-times-terminology.md b/docs/site/translation/date-time/date-times-terminology.md
index 87a4437f72d..d494dbc3994 100644
--- a/docs/site/translation/date-time/date-times-terminology.md
+++ b/docs/site/translation/date-time/date-times-terminology.md
@@ -6,7 +6,7 @@ title: Date & Time terminology
 This topic is **in-progress** and and **not finalized** yet for use.
-Following are terminology and definitions that are used for Date and Time structure and data in CLDR. The terminology used in CLDR have dependency on LDML Spec #35 and names of methods and objects in ICU.
+Following are terminology and definitions that are used for Date and Time structure and data in CLDR. The terminology used in CLDR have dependency on LDML Spec #35 and names of methods and objects in ICU.
 | Terminology | Definition | Examples |
 |---|---|---|
@@ -16,5 +16,4 @@ Following are terminology and definitions that are used for Date and Time struct
 | **calendar fields** | The abstract calendar type that is represented by the letters | era or year.
| | calendar field “ **names**" | The localized names for each type of calendar field | “era” or “year”: | - -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file + diff --git a/docs/site/translation/displaynames.md b/docs/site/translation/displaynames.md index c78718d4a8c..8e89fd61759 100644 --- a/docs/site/translation/displaynames.md +++ b/docs/site/translation/displaynames.md @@ -13,4 +13,3 @@ title: Displaynames Displaynames category include the names of the fundamental data in internationalization: Languages and Country/Regions -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/displaynames/countryregion-territory-names.md b/docs/site/translation/displaynames/countryregion-territory-names.md index 882a13980c1..384ef3696ad 100644 --- a/docs/site/translation/displaynames/countryregion-territory-names.md +++ b/docs/site/translation/displaynames/countryregion-territory-names.md @@ -4,7 +4,7 @@ title: Country/Region (Territory) Names # Country/Region (Territory) Names -Country and region names (referred to as Territories in the Survey Tool) may be used as part of [Language/Locale Names](https://cldr.unicode.org/translation/displaynames/languagelocale-names), or may be used in UI menus and lists to select countries or regions. +Country and region names (referred to as Territories in the Survey Tool) may be used as part of [Language/Locale Names](https://cldr.unicode.org/translation/displaynames/languagelocale-names), or may be used in UI menus and lists to select countries or regions. ## General Guidelines @@ -19,11 +19,11 @@ Please follow these guidelines: *The ISO names and the "official" names are often not necessarily the best ones.* The goal is the most customary name used in your language, even if it is not the official name. 
For example, for the territory name in English you would use "Switzerland" instead of "Swiss Confederation", and use "United Kingdom" instead of "The United Kingdom of Great Britain and Northern Ireland". One of the best sources for customary usage is to look at what common major publications such as newspapers and magazines do, the equivalents of *The Economist, NY Times, BBC, WSJ*, etc. in your language. You can look at style guides if available or at a sampling of pages, but favor publications’ rather than academic style guidelines. For example, to see how "Congo" is used in French, one might search [*for Congo on Le Monde*](http://www.google.com/search?q=Congo+site%3Alemonde.fr) and on other publications. -Also look at frequency data: for example, at the time of this writing, "Côte d’Ivoire" has [117M](https://www.google.com/search?hl=en&q=%22C%C3%B4te%20d%27Ivoire%22) hits on Google in English, while "Ivory Coast" has [99M](https://www.google.com/search?hl=en&q=%22Ivory%20Coast%22). That makes them roughly equal, and other factors come into play. Favor shorter names, all other things being (roughly) equal, and consider carefully politically sensitive names (see below). The most customary name may change over time, but this tends to happen slowly; we do not want changes between versions without good cause. +Also look at frequency data: for example, at the time of this writing, "Côte d’Ivoire" has [117M](https://www.google.com/search?hl=en&q=%22C%C3%B4te%20d%27Ivoire%22) hits on Google in English, while "Ivory Coast" has [99M](https://www.google.com/search?hl=en&q=%22Ivory%20Coast%22). That makes them roughly equal, and other factors come into play. Favor shorter names, all other things being (roughly) equal, and consider carefully politically sensitive names (see below). The most customary name may change over time, but this tends to happen slowly; we do not want changes between versions without good cause. 
 ## Geopolitically Sensitive Names
-Some country/region names need special treatment to avoid geopolitical sensitivity or ambiguity.
+Some country/region names need special treatment to avoid geopolitical sensitivity or ambiguity.
 - In some cases, parentheses are used purely to disambiguate. For example:
 - Cocos (Keeling) Islands
@@ -77,9 +77,9 @@ The following is a summary of these issues for some key regions. Some of these m
 ## Unique Names
-**All names must be unique within a given category:** Names include countries, some parts of countries (such as Hong Kong) with special status, and so-called *macroregions*: continents and subcontinents, as defined by a UN standard.
+**All names must be unique within a given category:** Names include countries, some parts of countries (such as Hong Kong) with special status, and so-called *macroregions*: continents and subcontinents, as defined by a UN standard.
-Therefore, you cannot use the same translated names for different codes. For example:
+Therefore, you cannot use the same translated names for different codes. For example:
 - For the codes CD and CG, *only one can be called "Congo".*
 - For the codes 018 and ZA, you can't give the same name to *South Africa* (the country) and to *Southern Africa* (the southern region of the continent of Africa), even though there may be no distinction in your language between the terms for "*South*" and "*Southern*".
@@ -109,7 +109,7 @@ These override the normal constructions, which would be:
 | 1 | English (United States) | en_US | |
 | 2 | English (US) | en_US | alt=short |
-If a particular language would just use the normal constructions, such as in the following, then the code "en\_US" should be the contents.
+If a particular language would just use the normal constructions, such as in the following, then the code "en\_US" should be the contents.
 | | | | |
 |---|---|---|---|
@@ -147,7 +147,7 @@ The names follow the same basic considerations as for Country/Region names. Ther
 1.
In general, favor making better-known entity be the shorter one. In some cases, it may be necessary to add a category to both of the names. 2. The category may be added in parentheses after the main name; just make sure it would look ok in the form in a list. -**Note:** There are three subdivisions in **Locale Display Names / Territories (Europe):** England, Scotland, and Wales. +**Note:** There are three subdivisions in **Locale Display Names / Territories (Europe):** England, Scotland, and Wales. Tip on translating these, for example, see [French](http://st.unicode.org/cldr-apps/v#/fr/T_Europe/). Distinguish the name for “England” from the name for “United Kingdom”, which includes England, Scotland, Wales, and Northern Ireland @@ -163,4 +163,3 @@ There are two special region names used for Pseudo Locales. These are special lo If there is no good term for "Pseudo" in your language, some options are the equivalent of "Fake" or "Artificial" in your language. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/displaynames/languagelocale-name-patterns.md b/docs/site/translation/displaynames/languagelocale-name-patterns.md index c7f21db84e2..c7ca4c4596d 100644 --- a/docs/site/translation/displaynames/languagelocale-name-patterns.md +++ b/docs/site/translation/displaynames/languagelocale-name-patterns.md @@ -20,4 +20,3 @@ For certain compound language (locale) names, you can also supply specific trans Code patterns are used to format a language, script or locale for display. For example, the language code pattern would be translated from "Language: {0}" in English to "langue : {0}" in French, and would be used to format the language "ouzbek" into "langue : ouzbek". 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/displaynames/languagelocale-names.md b/docs/site/translation/displaynames/languagelocale-names.md index 6f7dfd9bf3f..83ae921be61 100644 --- a/docs/site/translation/displaynames/languagelocale-names.md +++ b/docs/site/translation/displaynames/languagelocale-names.md @@ -39,4 +39,3 @@ If your standard translation of the language name already puts the family name f Some languages may have other variant forms. For example, “ckb” may in English be called “Central Kurdish” or “Sorani Kurdish”; the former is used as the standard name for English, and the latter is the variant. In other languages the equivalent of “Sorani Kurdish” may be used as the standard name; if there is also an equivalent for “Central Kurdish” it may be supplied as the variant. If there is only one form in your language, please use it for both the standard and the variant form. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/displaynames/locale-option-names-key.md b/docs/site/translation/displaynames/locale-option-names-key.md index 0ac2525d400..8ce9320d29e 100644 --- a/docs/site/translation/displaynames/locale-option-names-key.md +++ b/docs/site/translation/displaynames/locale-option-names-key.md @@ -4,7 +4,7 @@ title: Locale Option Names (Key) # Locale Option Names (Key) -Locales can have special variants, to indicate the use of particular calendars, or other features. They be used to select among different options in menus, and also display which options are in effect for the user. +Locales can have special variants, to indicate the use of particular calendars, or other features. They can be used to select among different options in menus, and also to display which options are in effect for the user. 
## Locale Option Names @@ -41,4 +41,3 @@ The following are some examples of Option+Value combinations that need translati For transform names (BGN, Numeric, Tone, UNGEGN, x-Accents, x-Fullwidth, x-Halfwidth, x-Jamo, x-Pinyin, x-Publishing), see [Transforms](https://cldr.unicode.org/translation/transforms). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/displaynames/script-names.md b/docs/site/translation/displaynames/script-names.md index d77e111338f..d0d75072753 100644 --- a/docs/site/translation/displaynames/script-names.md +++ b/docs/site/translation/displaynames/script-names.md @@ -11,7 +11,7 @@ Languages may be written with different scripts (aka writing systems). For examp - Chinese (Simplified) - Chinese (Traditional) -The most important scripts for this purpose are Cyrillic, Arabic, and Latin, plus the special codes Hant (Traditional) and Hans (Simplified). +The most important scripts for this purpose are Cyrillic, Arabic, and Latin, plus the special codes Hant (Traditional) and Hans (Simplified). Scripts may also be listed in a menu by themselves. @@ -26,4 +26,3 @@ For the script names, please follow these guidelines: See also [Language Names](https://cldr.unicode.org/translation/displaynames/languagelocale-names). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/error-codes.md b/docs/site/translation/error-codes.md index 084bb30cf6d..a80f8026b11 100644 --- a/docs/site/translation/error-codes.md +++ b/docs/site/translation/error-codes.md @@ -30,4 +30,3 @@ They allow for, and often need, duplicate placeholders. - For plurals and ordinals, make sure to read [Determining Plural Categories](http://cldr.unicode.org/index/cldr-spec/plural-rules#TOC-Determining-Plural-Categories). - For case and gender, make sure read [Grammatical Inflection](https://cldr.unicode.org/translation/grammatical-inflection). 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started.md b/docs/site/translation/getting-started.md index eb9feaa98b3..934df460918 100644 --- a/docs/site/translation/getting-started.md +++ b/docs/site/translation/getting-started.md @@ -21,4 +21,4 @@ Before getting started to contribute data in CLDR, and jumping in to using the S \*If you (individual or your organization) have not established a connection with the CLDR technical committee, start with [Survey Tool Accounts](https://cldr.unicode.org/index/survey-tool/survey-tool-accounts). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/translation/getting-started/data-stability.md b/docs/site/translation/getting-started/data-stability.md index 2606805804b..cdfc86e4b53 100644 --- a/docs/site/translation/getting-started/data-stability.md +++ b/docs/site/translation/getting-started/data-stability.md @@ -10,7 +10,6 @@ Please follow below tips to help with data stability: 1. Carefully review the previously Approved data before suggesting for a change. 1. When it's clearly incorrect, Add your suggestion and start a forum discussion - 2. Don't change the data when it is already acceptable (even if not optimal)-consider data preference vs. data inaccuracy. + 2. Don't change the data when it is already acceptable (even if not optimal); consider data preference vs. data inaccuracy. 3. Bring evidence of a variant being much better and in customary use than the existing Approved data to the Forum discussions and gain consensus to change the Approved value. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/empty-cache.md b/docs/site/translation/getting-started/empty-cache.md index d0991798b9e..905bf7aedb3 100644 --- a/docs/site/translation/getting-started/empty-cache.md +++ b/docs/site/translation/getting-started/empty-cache.md @@ -55,4 +55,3 @@ For additional information about Browser cache tips, see https://www.getfileclou ~~See~~ https://www.getfilecloud.com/blog/2015/03/tech-tip-how-to-do-hard-refresh-in-browsers/#.XRJaApMzbuM ~~for examples.~~ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/errors-and-warnings.md b/docs/site/translation/getting-started/errors-and-warnings.md index e77776c7eb3..af26559d8ca 100644 --- a/docs/site/translation/getting-started/errors-and-warnings.md +++ b/docs/site/translation/getting-started/errors-and-warnings.md @@ -26,13 +26,13 @@ You can find some guidance under "Unique Names" in the following pages: - [Country/Region Names](https://cldr.unicode.org/translation/displaynames/countryregion-territory-names), - [City Names](https://cldr.unicode.org/translation/timezones#TOC-City-Names), - [Currency Symbols & Names](https://cldr.unicode.org/translation/currency-names-and-symbols/currency-names) - + **The characters ‎\[…\]‎ should not be used (Must fix)** For what to do, see [Characters](https://cldr.unicode.org/translation/-core-data/exemplars#TOC-Handing-Warnings-in-Exemplar-characters), in the section Handling Warnings. _While these are categorized as warnings, every effort should be made to fix them._ **Unquoted special character '.' in pattern (Must fix)** - + Number patterns can only contain an unquoted . when it is the decimal separator. 
@@ -58,4 +58,3 @@ Another common mistake is to copy a code value, such as "cs" for Czech, instead This may not be an error, because it is often perfectly legitimate to have an identical string.For example the script code for "Thai" is "Thai", which matches the English word exactly. So this warning is just to call your attention to the text in case it needs to be changed. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/guide.md b/docs/site/translation/getting-started/guide.md index 62e762264cb..4b371695285 100644 --- a/docs/site/translation/getting-started/guide.md +++ b/docs/site/translation/getting-started/guide.md @@ -18,11 +18,11 @@ Note that the exact appearance in screenshots may change as the tool is enhanced - Please read the home page of the [Translation Guidelines](https://cldr.unicode.org/translation) before starting your data contribution. - If you experience a **Loading...** problem with the Survey Tool, try clearing your browser cache. See [Reloading JavaScript](https://www.filecloud.com/blog/2015/03/tech-tip-how-to-do-hard-refresh-in-browsers/#.XOjGNtMzbuM). -- **Browser support** for Survey Tool includes the latest versions of Edge, Safari, Chrome, and Firefox. +- **Browser support** for Survey Tool includes the latest versions of Edge, Safari, Chrome, and Firefox. - Use [Reports](https://cldr.unicode.org/translation/getting-started/guide#TOC-Reports) at the beginning to review the data in your language in a holistic manner for Date & time, Zones, and Numbers. - Capitalization: Translations should be what is most appropriate for ”middle-of-sentence” use. So, for example, if your language normally doesn't spell languages with a capital letter, then you shouldn’t do that here. Please see [Capitalization](https://cldr.unicode.org/translation/translation-guide-general/capitalization) for more details. 
- Plurals: For important information regarding the use of plural forms for your language, please read [Plurals](https://cldr.unicode.org/translation/getting-started/plurals). - + ### Vetting Phase At a point towards the end of Survey Tool period, the Technical Committee will change the survey tool to "Vetting Mode". In Vetting Mode, submitting new data/translations is no longer possible, but you can still change your votes and participate in the forum. (The exception is that you can submit new data if the currently winning value has generated an error or a warning.) @@ -41,42 +41,42 @@ At a point towards the end of Survey Tool period, the Technical Committee will c ![alt-text](../../images/gettingStartedGuideImportSelectedItems.jpeg) -1. Scroll to the bottom to see the category selection for bulk import. +1. Scroll to the bottom to see the category selection for bulk import. 2. Select the categories that you want to import and click **Import selected items** button at the bottom. 3. Go to the data categories in the Survey tool where you have imported your old votes, these will show up in the Others column with no votes. 4. Review and add your vote. The best practice is to create a forum entry explaining why this is the data that should be changed to and drive to gain consensus with other vetters. - + ### Picking Locales 1. On the left sidebar, you will see the CLDR locale(s). Your default view will be the languages you have permissions for. All the locales that you have permission to contribute submissions to are marked with PENCIL icon. You can view the others but not submit contributions. For example, if you have permissions to the default language Afrikaans (af), you will not have permissions to Afrikaans (Namibia), and vice versa. - + ![alt-text](../../images/gettingStartedGuideLocaleSearch.png) 1. Each language is followed by a list of regions that represent specific locales. The locale that is grayed out and preceded by an × is the default. 
The others are considered “sub-locales”. If you are working on the default locale, select the language name. For example, if you work on Spanish in general (default = Spain), you will see that that Spain is grayed out in the list below: choosing Spanish means that you are working on the default (Spanish for Spain). - + ![alt-text](../../images/gettingStartedGuideSpanish.jpg) 1. Only those of you working on a specific variant language (or "sub-locale") will pick a non-default region. If you work on Mexican Spanish, pick **Mexico**. (This should already be pre-selected for you.) - + _Make sure that you haven't mistakenly turned the Information Panel off! See_ [_**No Information Panel**_](https://cldr.unicode.org/translation/getting-started/guide%23TOC-No-Information-Panel)_**.**_ ### Voting view 1. Once you have selected your locale, more options show up in the left sidebar. (You’ll note that the sidebar only shows if you mouse over the **\>** character on the left.) -2. If the locale is relatively new and very complete, start working on the **Core Data** section and go through the rest of the sections. If the locale is mostly complete, then go to [**Dashboard**](https://cldr.unicode.org/translation/getting-started/guide) below. +2. If the locale is relatively new and very complete, start working on the **Core Data** section and go through the rest of the sections. If the locale is mostly complete, then go to [**Dashboard**](https://cldr.unicode.org/translation/getting-started/guide) below. 3. Once you have selected a section, you'll see a table to enter votes in. The main table has these columns: - - **Code**: the code CLDR uses to identify this data point. - - **English**: the plain English value of the data point (the text you are to translate). - - **Abstain**: the default vote value for you. Only use abstain if you don't know a good value to be used. - - **A**: The value’s current status. A checkmark means it’s approved and is slated to be used. 
A cross means it’s a missing value. (Note, for sub-locales, a cross is not necessarily bad. If the parent locale has a good value, the sub-locale will inherit it. Check the **Winning** column.) - - **Winning**: this is the currently winning value. If the survey tool would close now, this is the value we would publish. If the value has a blue star next to it, that means it’s also the value that was published in the previous version. Normally it takes at least two votes from two different organizations to change value: in some locales the bar is lower, and for some items it is higher. - - **Add**: If the winning value is not correct and is not listed under Others, then use the plus button here to enter the correct value. If you enter a new value, your vote will be applied to it automatically. - - If what you want is a variation of what is in Winning or Others, you can cut & paste, and then modify. + - **Code**: the code CLDR uses to identify this data point. + - **English**: the plain English value of the data point (the text you are to translate). + - **Abstain**: the default vote value for you. Only use abstain if you don't know a good value to be used. + - **A**: The value’s current status. A checkmark means it’s approved and is slated to be used. A cross means it’s a missing value. (Note, for sub-locales, a cross is not necessarily bad. If the parent locale has a good value, the sub-locale will inherit it. Check the **Winning** column.) + - **Winning**: this is the currently winning value. If the Survey Tool closed now, this is the value we would publish. If the value has a blue star next to it, that means it’s also the value that was published in the previous version. Normally it takes at least two votes from two different organizations to change a value: in some locales the bar is lower, and for some items it is higher. + - **Add**: If the winning value is not correct and is not listed under Others, then use the plus button here to enter the correct value. 
If you enter a new value, your vote will be applied to it automatically. + - If what you want is a variation of what is in Winning or Others, you can cut & paste, and then modify. - **Others**: other suggested values, not currently winning, but available to vote for. 4. Click on one of the radio buttons to make your vote. The winning status changes in real-time so depending on vote requirements and existing votes, your vote may move your desired value to the winning column right away. - 1. Look at the Regional Variants to see if any should be changed: see **Information Panel** below. + 1. Look at the Regional Variants to see if any should be changed: see **Information Panel** below. 5. Once you are done with all the sections, go to the [**Dashboard**](https://cldr.unicode.org/translation/getting-started/guide)**.** 6. Under the English column, look for "**i**" for additional information and "**e**" for an example. ![alt-text](../../images/gettingStartedGuideArabic.png) @@ -128,11 +128,11 @@ You can click on the link in the right sidebar to see the original value. Language variants by Region are differentiated as Parent-locale and sub-locales. For example, - **Spanish es** is the parent (or the default) locale for all Spanish locales. Its default content is for Spanish (Spain) es\_ES. - + - **Spanish (Latin America) es\_419** is one of the sub-locales for Spanish. Votes on inheritance will ensure that it will only contain content that is different than what is in Spanish. - + - **Spanish (Argentina) es\_AR** is one of the sub-locales for Spanish (Latin America). Votes on inheritance will ensure that it will only contain content that is different than what is in Spanish (Latin America) - + The regional variants menu for a data point is shown on the right navigation information pane. It will look something like the following (the exact appearance depends on the browser). 
@@ -154,10 +154,10 @@ If you are voting in a sub-locale such as en\_AU, es\_MX, fr\_CA etc., you can v An inheritance vote is useful if there are no differences in spelling conventions and political relations between your locale and the parent locale. Abstaining from voting may have the same effect, but if another vetter votes for something different, your Abstained vote means that it's not opposed by you; thus, your intention is not known to others. By voting for the blue inheritance value you make your opinion known to other vetters. -- Inheritance is important, to prevent data duplication. -- Inheritance is not only limited to “sub locales”. Parent locales (or default language locales) also have inheritance from either other fields or the root. -- By default, all data are inherited if there are no contributions. The data are indicated as Missing or Abstain. Sub-locales have inherited values that are generally from the parent locale (e.g. de\_CH will inherit values from de\_DE). -- The inherited values appear in the **Others** column highlighted in blue box (e.g. “embu” and "inglês"). By clicking the radio button in front of those values, you are voting for inheritance. +- Inheritance is important because it prevents data duplication. +- Inheritance is not limited to “sub locales”. Parent locales (or default language locales) also have inheritance from either other fields or the root. +- By default, all data are inherited if there are no contributions. The data are indicated as Missing or Abstain. Sub-locales have inherited values that are generally from the parent locale (e.g. de\_CH will inherit values from de\_DE). +- The inherited values appear in the **Others** column highlighted in a blue box (e.g. “embu” and "inglês"). By clicking the radio button in front of those values, you are voting for inheritance. 
- If the inherited value is not correct for your locale or it’s likely for your locale to change the data in the future, click the + button, and enter a new suggestion. The vote status column will show an orange-up arrow () if the winning item is inherited and it does not have any votes. @@ -185,7 +185,7 @@ _Progress bar shows progress of items overall for your coverage level._ The Dashboard will show you a list of data items with warnings of different kinds. Some will require action, some may be false positives. (For the veterans, this is the redesigned Priority Viewer.) -![alt-text](../../images/gettingStartedGuideDashboard.png) +![alt-text](../../images/gettingStartedGuideDashboard.png) The goal is that you should work the Dashboard down to show zero items, then review the [**Reports**](https://cldr.unicode.org/index/survey-tool/guide#TOC-Reports), below. @@ -201,14 +201,14 @@ At the top of the Dashboard is a header with a button for each section the title There are six columns in the Dashboard view. -- **Dashboard category**: The first letter of the section name enclosed in a circle. +- **Dashboard category**: The first letter of the section name enclosed in a circle. - **Data Type**: The section that the item belongs to. -- **Code**: this links to the field in survey tool. Click on it to go to the item in the Survey Tool. -- **English**: The English value is highlighted in blue. +- **Code**: this links to the field in survey tool. Click on it to go to the item in the Survey Tool. +- **English**: The English value is highlighted in blue. - **Winning** _**XX**_: The currently winning value is highlighted in green. - **Hide checkbox**: For items that can be hidden a checkbox to hide that option appears on the far right. 
-![alt-text](../../images/gettingStartedGuideDashboardCols.png) +![alt-text](../../images/gettingStartedGuideDashboardCols.png) ### How to handle different categories @@ -217,13 +217,13 @@ Following are guidelines on best practices for handling items under each categor - **Missing** - These are items where there is no localization provided by any contributor. Click on the line to be taken to the item in the Survey Tool where items are highlighted and you can add a translation. When you fix a **Missing** item it will turn to **Changed**. -![alt-text](../../images/gettingStartedGuideMissing.png) +![alt-text](../../images/gettingStartedGuideMissing.png) - **Losing** - These are items that you already voted on. This indicates that your vote is not for the currently winning value. If you can live with the winning item—if it is reasonable, even if you don't think it is optimal—change your vote to be for the winning item. If not, click the **Forum** button in the **Info Panel** and give reasons for people to change their vote to what you have suggested. If not all users have voted yet, these values may still be approved before the end of the cycle. **Engage with others on the Forum discussions**. Make sure to post the reasons why others should change their votes and **respond to others’ posts**. - **Disputed** - This is very similar to **Losing**, except in this case your vote is winning and someone else's is losing. Review all of the items to see if someone else’s item is better and read the forum post, and whether you want to change your vote. Discuss in the forum, then use the Hide button to hide disputes you’ve addressed in the forum. -- **Changed** +- **Changed** - The Changed count is provided in the Dashboard only as a reference. The **Changed** items are either: - Missing items now have a value. - The Winning value of the translation has been changed. 
@@ -232,7 +232,7 @@ Following are guidelines on best practices for handling items under each categor - **Warnings** - These are issues which appear after automatic checks. (For examples, a message could be "_The value is the same as English"_, which is a quite common warning for languages that are close to English in the spelling of languages or territories. If the value is actually ok, then click on the Hide button (crossed eye). If not, then vote for a fix, or post on the Forum for discussion. -![alt-text](../../images/gettingStartedGuideWarning.png) +![alt-text](../../images/gettingStartedGuideWarning.png) ### Dashboard Summary @@ -240,12 +240,12 @@ There are two ways to clear items from the **Dashboard** list: 1. Fix them (such as adding a translation for a missing item) 2. Hide them (such as when the English has changed but the translation doesn’t need to change). - - _**Only**_ _hide items if it really is a false positive,_ _**not**_ _because you gave up on fixing it…_ - - _If you hide an item by mistake:_ - - _Unhide all the lines with the top eye button._ - - _Click on the orange eye button in the line (a “Show" tooltip will appear)._ + - _**Only**_ _hide items if it really is a false positive,_ _**not**_ _because you gave up on fixing it…_ + - _If you hide an item by mistake:_ + - _Unhide all the lines with the top eye button._ + - _Click on the orange eye button in the line (a “Show" tooltip will appear)._ - _Hide all the lines again by clicking the top eye button._ - + ## Reports @@ -255,11 +255,11 @@ Reports are under the left navigation. Reports are a good way to review the data in your language in a wholistic view for the Date and time, Zones, and Numbers. 
-![alt-text](../../images/gettingStartedGuideKorean.png) +![alt-text](../../images/gettingStartedGuideKorean.png) _Example:_ -![alt-text](../../images/gettingStartedGuidePatterns.jpeg) +![alt-text](../../images/gettingStartedGuidePatterns.jpeg) ## Special cases @@ -286,8 +286,8 @@ Some items have change protection in place that will stop vetters from changing To change it, you have to flag the item for committee review: 1. Click on the “**Flag for Review button**”. -2. On the new page, you'll see a message box. -3. Enter the change that you want to make, and add a justification for changing it. +2. On the new page, you'll see a message box. +3. Enter the change that you want to make, and add a justification for changing it. 4. Then click **“Post**”. 5. Towards the end of data collection cycle, the Technical Committee will review the change request and either accept it, or reject it with comments. @@ -304,7 +304,7 @@ It's a best practice to **create a Forum post whenever you propose a change to a While creating New Posts on Forum or participating in discussions please follow these general etiquette guidelines for best productive outcomes: - Be professional. Provide accurate, reasoned answers so that other participants can easily understand what you are talking about. -- Be courteous. Refrain from inappropriate language and derogatory or personal attacks. +- Be courteous. Refrain from inappropriate language and derogatory or personal attacks. - Don’t “SHOUT”; that is don’t use all capitals. - In case of disagreement, focus on the data and provide evidence to support your position. Avoid challenges that may be interpreted as a personal attack. - Be prepared to have your own opinions challenged or questioned, but don’t take answers personally. @@ -315,10 +315,10 @@ While creating New Posts on Forum or participating in discussions please follow Forum posts work with the following workflow: -1. Create a new **Request** +1. Create a new **Request** 2. 
Responses by other vetters in your language with Agree, Decline, or Comment. 3. Once resolved, the creators of the the initial Request or Discuss closes the post. - + ### How to create a new forum post @@ -326,7 +326,7 @@ A forum post can be specific to a particular data point or a general issue. In e - A post that is specific to a particular data point. - A general issue that impacts multiple data points. In a general case that impacts multiple data points, you do not need to post new forum posts for every item impacted. The general issue should be flagged to other vetters and once a consensus is reached, it is expected that vetters update their votes on all impacted items. New forum posts can be used to flag to other vetters if others fail to update their votes on all impacted items. ONLY request if others have missed or have not updated consistently. - + **Create forum posts from the** [**Information pane**](https://cldr.unicode.org/translation/getting-started/guide#TOC-Information-Panel)**l in the voting window.** @@ -335,7 +335,7 @@ A forum post can be specific to a particular data point or a general issue. In e ![alt-text](../../images/gettingStartedGuideVote.png) 2. In the Information panel on the right, there are two buttons to indicate the type of forum posts: - 1. **Request** You have voted on a non-winning item, and you want to Request others to change their votes. + 1. **Request** You have voted on a non-winning item, and you want to Request others to change their votes. 2. **Discuss -** Currently only TC members can make discuss posts. 3. Click **Request** button and fill out the details of your request. (Note: The **Request** button is disabled unless you have voted) @@ -376,7 +376,7 @@ In the **Info Panel**, select the **Comment** button There are two ways to respond to new forum post: - Info Panel (This is the recommended option.) 
- In the Forum view (See [Working in the Forum view](https://cldr.unicode.org/translation/getting-started/guide#TOC-Working-in-the-Forum-view-)) - + **Respond to forum posts from the** [**Info Panel**](https://cldr.unicode.org/translation/getting-started/guide#TOC-Information-Panel) **in the voting window.** 1. In the Info Panel click the **Comment** button and add your input to the open discussion. @@ -420,9 +420,9 @@ If you run into a problem with the Survey Tool functionalities, please see [FAQ **Email notification** -1. Another way to check for posts that may need your attention is to review email notifications to the e-mail account for your locale. You can delete these notifications if they are for changes initiated by you. You can open the post directly from a link in the email. +1. Another way to check for posts that may need your attention is to review email notifications to the e-mail account for your locale. You can delete these notifications if they are for changes initiated by you. You can open the post directly from a link in the email. 2. When you make a forum entry, it will be emailed to all other linguists working on locales with the same language, parent or sub-locale (i.e. **forum is at Language level and not at sub-locale level**). If you are talking about a translation in a sub-locale, be sure that you are clear about that. - + ### Forum posts for CLDR ticket feedback @@ -434,14 +434,13 @@ The goal is to bring it to the attention to all linguists contributing in a part 2. For each ticket assigned to them, the TC member will post a forum topic in each language mentioned in the ticket, asking for vetters to look at the issue and either make the requested change, or explain in a forum post why changes should not be made. 3. A reason for not changing could be for example that it is a reasonable change, but doesn't exceed the 'stability' bar in the translation guidelines. 4. 
TC members will monitor the forum discussion/change during the Submission phase, and will close the JIRA ticket after the forum discussion is concluded. - + ## Advanced Features -1. Users familiar with CLDR XML format can upload votes (and submissions) for multiple items at once. See [**Bulk Data Upload**](https://cldr.unicode.org/index/survey-tool/bulk-data-upload)**.** +1. Users familiar with CLDR XML format can upload votes (and submissions) for multiple items at once. See [**Bulk Data Upload**](https://cldr.unicode.org/index/survey-tool/bulk-data-upload)**.** 2. Organization managers can manage users for their organization (add, remove, send passwords, set locales, etc.) For more information, see [**Managing Users**](https://cldr.unicode.org/index/survey-tool/managing-users)**.** - 1. Some users may want to reset their Coverage Level, with the menu that looks like the image to the right. + 1. Some users may want to reset their Coverage Level, with the menu that looks like the image to the right. 2. The Coverage Level determines the items that you will see for translation: the minimal level has the highest priority items. You normally start with the level marked "Default" (which will vary by your organization and locale). Each successively higher level adds more items, at successively lower priorities. You will not normally go beyond "Modern", unless you have special instructions for your organization. 3. 
_Note that some companies won't use the data until it is complete at a certain coverage level, typically_ _**Modern**._ -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/plurals.md b/docs/site/translation/getting-started/plurals.md index 9309c62f23a..faed65ab660 100644 --- a/docs/site/translation/getting-started/plurals.md +++ b/docs/site/translation/getting-started/plurals.md @@ -31,7 +31,7 @@ As well as being used for durations, like "3.5 hours", they can also be used for 1. **D**entro de {0} horas 2. **h**ace {0} años - + Each unit may have multiple plural forms, one for each category (see below). These are composed with numbers using a _unitPattern_. A formatted number will be substituted in place of the number placeholder. For example, for English if the unit is an hour and the number is 1234, then the number is looked up to get the rule category _other_. The number is then formatted into "1,234" and composed with the unitPattern for _other_ to get the final result. Examples are in the table below for the unit **hour**. @@ -60,7 +60,7 @@ Some techniques for shortening the _narrow_ or _short_ form include: 1. Drop the space between the value and the unit: “{0}km” instead of “{0} km”. 2. Use symbols like km² or / instead of longer terms like “Quadrat” or “ pro ”. 3. Use symbols that would be understood in context: eg “/h” for “ per hour” when the topic is speed, or "Mi" for mile(s) when the topic is distance. -4. Replace the qualifiers "English" or "American" by an abbreviation (UK, US), or drop if most people would understand that the measurement would be an English unit (and not, say, an obsolete German or French one). +4. Replace the qualifiers "English" or "American" by an abbreviation (UK, US), or drop if most people would understand that the measurement would be an English unit (and not, say, an obsolete German or French one). 5. 
Use narrow symbols for CJK languages, such as “/” instead of “/”. Which of these techniques you can use will depend on your language, of course. @@ -79,10 +79,9 @@ Minimal pairs are used to verify the different grammatical features used by a la - **Plurals (cardinals) and Ordinals.** See [Determining Plural Categories](https://cldr.unicode.org/index/cldr-spec/plural-rules#TOC-Determining-Plural-Categories). - **Grammatical Case and Gender.** See [Grammatical Inflection](https://cldr.unicode.org/translation/grammatical-inflection) - + ## Compound Units Units of measurement can be formed from other units and other components. For more information, see [Compound Units](https://cldr.unicode.org/translation/units/unit-names-and-patterns). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/resolving-errors.md b/docs/site/translation/getting-started/resolving-errors.md index 6486f55a7a0..c0a6a7417ff 100644 --- a/docs/site/translation/getting-started/resolving-errors.md +++ b/docs/site/translation/getting-started/resolving-errors.md @@ -19,16 +19,16 @@ There are two Errors or Warnings that you may see in the SurveyTool, and these e 1. **Error type 1: "Incomplete Logical Group"** 1. This is most serious and it means that one or more items in what's considered as a logical group has been added; however, in doing so at least one other is missing (✘). - 1. To fix: Make sure that values for ALL of the items in the logical group are there. + 1. To fix: Make sure that values for ALL of the items in the logical group are there. 2. An example: vote/enter values for **all** of the month names. Once you enter values for all the items in a logical group, this error will disappear. 2. Error type 2: "**Inconsistent Draft Status**" - 1. 
This happens when the voting results would leave one of items in a group having a lower draft status (✔︎ approved, ✔︎ contributed, ✘ provisional, ✘ unconfirmed) than some other item in the group. - 2. **All** of the items have to have the same status. + 1. This happens when the voting results would leave one of items in a group having a lower draft status (✔︎ approved, ✔︎ contributed, ✘ provisional, ✘ unconfirmed) than some other item in the group. + 2. **All** of the items have to have the same status. 3. To fix: Go through all your votes and use the forum to coordinating with other vetters and come to an agreement on all items in the group. 3. Error type 3: "**This item has a lower draft status (in its logical group) than X.**". 1. same as Error type 2. 2. Inherited items can count as errors if they are part of a Logical Group. The easiest way to resolve these are to explicitly vote for the inherited or aliased items. Here is an example, before and after. - + #### Before @@ -38,4 +38,3 @@ There are two Errors or Warnings that you may see in the SurveyTool, and these e ![image](../../images/Screen-Shot-2017-08-22-at-12.15.08.png) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/review-formats.md b/docs/site/translation/getting-started/review-formats.md index 33389327a07..128addcc11f 100644 --- a/docs/site/translation/getting-started/review-formats.md +++ b/docs/site/translation/getting-started/review-formats.md @@ -21,7 +21,7 @@ Once you are done, check the appropriate item in the top window: - **I have reviewed the items below, and they are all acceptable** - **The items are not all acceptable, but I have entered in votes for the right ones and filed at least one forum post to explain the problems.** - **I have not reviewed the items.** - + Sometimes there will be a structural problem that cannot be fixed by votes, even if all the vetters agree. 
For example, Latvian needs [some extra support](https://unicode-org.atlassian.net/browse/CLDR-16231) for formatting person names. In that case, you should file a ticket to report the problem. Don't file a ticket if the problem can be solved by you and the other vetters changing your votes. @@ -31,23 +31,23 @@ To get started, in the Survey tool, open the **Reports** from the left navigatio ### General Tips -1. To correct the data, use the **View** links on the right of each line in Reports to go directly to the field and correct the data. Sometimes the 'view' can't go to the exact line, where there are multiple items involved in the formatting. +1. To correct the data, use the **View** links on the right of each line in Reports to go directly to the field and correct the data. Sometimes the 'view' can't go to the exact line, where there are multiple items involved in the formatting. 2. File at least one forum request where you need others to change their votes. If it is a general problem, such as the capitalization being wrong for all abbreviated months, you can file a forum request for the first one, and state the general problem there. ### Examples of Problems Check for consistency between different forms by looking at them side by side. -1. The casing is inconsistent; (e.g. some months are capitalized and others are lower cased). see. capitalization rule. +1. The casing is inconsistent; (e.g. some months are capitalized and others are lower cased). see. capitalization rule. 2. Spelling consistency. 3. Use of hyphens in some rows/columns, but not in others. 4. Some abbreviations have periods and others don't, or some use native 'periods' and others use ASCII ".". - + ### Person Name tips 1. Please read the [Miscellaneous: Person Name Formats](https://cldr.unicode.org/translation/miscellaneous-person-name-formats) under "**Review Report**". 
- + ### Date & Time Review Tips @@ -58,7 +58,7 @@ Check for consistency between different forms by looking at them side by side. ### Number formats Review tips -1. Each forms should be acceptable for your locale. +1. Each forms should be acceptable for your locale. 2. Review the cells within each row for consistency. 3. Also look for consistency across the rows for consistency. 4. Check that each cell has the correct plural form (if your language has plural forms). @@ -67,11 +67,10 @@ Check for consistency between different forms by looking at them side by side. ### Zones Review tips -1. The first two columns identify the timezone (metazone) +1. The first two columns identify the timezone (metazone) 2. Compare the items in each row for consistency. 3. Compare the items in the same column across different rows. 4. City names that use hyphens do not show the hyphens in patterns because they are constructed from the city name and the pattern {0} Zeit. Consider whether it would be better to always remove the hyphens, or to add them to the pattern {0}-Zeit. - + ![image](../../images/review-zone.PNG) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/survey-tool-phases.md b/docs/site/translation/getting-started/survey-tool-phases.md index 9d3226b326a..019fd776c28 100644 --- a/docs/site/translation/getting-started/survey-tool-phases.md +++ b/docs/site/translation/getting-started/survey-tool-phases.md @@ -35,9 +35,9 @@ Shakedown is on an invitation basis. If you have not received an invitation to p You should know in this stage: -- The survey tool is **live and all data that you enter will be saved and used.** -- You can start work -- Expect a churn: there may be additional Tooling fixes and Data additions during this period.  
+- The survey tool is **live and all data that you enter will be saved and used.** +- You can start work +- Expect a churn: there may be additional Tooling fixes and Data additions during this period.  - Tool may be taken down for updates more frequently during general submission - You are expected to look for issues with the Survey tool and any other problems you encounter as a vetter. Please [file a ticket](https://cldr.unicode.org/index/bug-reports). @@ -49,9 +49,9 @@ For new locales or ones where the goal is to increase the level, it is best to p Then please focus on the [Dashboard](https://cldr.unicode.org/translation/getting-started/guide#h.bmzr9ejnlv1u) view, -1. Get all **Missing**† items entered -2. Vote for all **Provisional** items (where you haven't already voted) -3. Address any remaining **Errors\*** +1. Get all **Missing**† items entered +2. Vote for all **Provisional** items (where you haven't already voted) +3. Address any remaining **Errors\*** 4. Review the **English Changed** (where the English value changed, but your locale's value didn't. These may need adjustment.) - \* Note that if the committee finds systematic errors in data, new tests can be added during the submission period, resulting in new **Errors**. - † Among the _**Missing**_ are are new items for translation. (On the [Dashboard](https://cldr.unicode.org/translation/getting-started/guide#h.bmzr9ejnlv1u), **New** means winning values that have changed since the last release.) @@ -66,10 +66,9 @@ All contributors are encourage to move their focus to the [Dashboard](https://cl 2. Open the Dashboard, and resolve all of the Errors, Provisional Items, Disputed items, and finish Reports 1. Consider other's opinions, by reviewing the **Disputed** and the **Losing**. See guidelines for handling [Disputed](http://cldr.unicode.org/translation/getting-started/guide#TOC-Disputed) and [Losing](http://cldr.unicode.org/translation/getting-started/guide#TOC-Losing). 3. 
Review all open Requests and Discussions in the [Forums](https://cldr.unicode.org/translation/getting-started/guide#h.fx4wl2fl31az), and respond. - + ### Resolution (Closed to vetters) The vetting is done, and further work is being done by the CLDR committee to resolve problems. You should periodically take a couple of minutes to check your [Forums](https://cldr.unicode.org/translation/getting-started/guide#h.fx4wl2fl31az) to see if there are any questions about language-specific items that came up. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/getting-started/vetting-view.md b/docs/site/translation/getting-started/vetting-view.md index 074f582ebb4..6732031f596 100644 --- a/docs/site/translation/getting-started/vetting-view.md +++ b/docs/site/translation/getting-started/vetting-view.md @@ -179,4 +179,4 @@ For the Unsync'd issues, the older English value is listed below the new one, an Once you have completed your items, review the Priority Items again to see that all the changes are as expected. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/translation/grammatical-inflection.md b/docs/site/translation/grammatical-inflection.md index cf9e26825af..f9f6b55cd24 100644 --- a/docs/site/translation/grammatical-inflection.md +++ b/docs/site/translation/grammatical-inflection.md @@ -35,7 +35,7 @@ You do not have to supply the *sample unit (column 4)* in the Survey Tool, but y ***It is absolutely crucial that you make sure that each Pattern is constructed in a way so that none of the samples could also correctly fit in the Composed Value on a different row.*** For example, you couldn't put "Jahr" (a neuter unit from #3) into the pattern in #2, as the result - "Die Jahr ist …" - would not be grammatically correct (feminine article + neuter noun). -In English, all of these would say "The {0} is...", with {0} e.g. 
being a day, a week, or a year (could be other units as well, doesn't have to be duration). +In English, all of these would say "The {0} is...", with {0} e.g. being a day, a week, or a year (could be other units as well, doesn't have to be duration). English has only one gender for all common nouns, and thus the English column on the Survey Tool page is blank. @@ -56,7 +56,7 @@ The equivalent Italian patterns might be: - ***Don't worry about elision or mutation like changing la to l'****;* the phrases **don't have to** work for **all** items of that gender if there are side effects because of certain letters. - Not all possible placeholder values will exhibit differences. For example, in German only masculine noun phrases are different in the accusative case (*den* Mann vs. *der* Mann). Sometimes you may need a combination of two different units (A *and* B), as in the example below. -Here is an example from [German](https://st.unicode.org/cldr-apps/v#/de/MinimalPairs). Notice that this case is more complicated than gender, because we need a noun phrase (with an adjective, "British") and two separate measures, because no single noun-adjective combination has differences in all 4 cases in German. The "British teaspoon" is different in 3 cases, but the genitive is the same as the dative. The "British gallons" is different in 3 cases, but the nominative and accusative are identical. +Here is an example from [German](https://st.unicode.org/cldr-apps/v#/de/MinimalPairs). Notice that this case is more complicated than gender, because we need a noun phrase (with an adjective, "British") and two separate measures, because no single noun-adjective combination has differences in all 4 cases in German. The "British teaspoon" is different in 3 cases, but the genitive is the same as the dative. The "British gallons" is different in 3 cases, but the nominative and accusative are identical. 
***It is absolutely crucial that you make sure that each Pattern is constructed in a way so that none of the samples could also correctly fit in the Composed Value on a different row.*** *For example, in the table below the nominative sample phrase could not be correctly substituted into the accusative pattern.* That would produce "für 1 britisch**er** Teelöffel und 3 britische Gallonen", which would be incorrect. @@ -83,7 +83,7 @@ The Code is now longer, allowing for case and gender: - If your language doesn’t have gender, it will be “dgender”. - That is also used for a selected instance for gendered languages: either *neuter* or *masculine* (if the language has no neuter). -For examples, see [French Compound Units](https://st.unicode.org/cldr-apps/v#/fr/CompoundUnits/) +For examples, see [French Compound Units](https://st.unicode.org/cldr-apps/v#/fr/CompoundUnits/) - **long-one-nominative-dgender** requests the **masculine** case in French - **long-one-nominative-feminine** requests the **feminine** case in French @@ -92,7 +92,7 @@ For examples, see [French Compound Units](https://st.unicode.org/cldr-apps/v#/fr ## Units > ... > gender -For example, see [French gender for Liter](https://st.unicode.org/cldr-apps/v#/de/Volume/1027df24bd31941e) +For example, see [French gender for Liter](https://st.unicode.org/cldr-apps/v#/de/Volume/1027df24bd31941e) If your language supports gender for units of measure, you’ll see a new element for the gender of each relevant unit. For example, a Liter in German is masculine. 
If you try to put in an incompatible value, you’ll get a message that lists the valid values for your locale, such as @@ -138,4 +138,3 @@ Hovering over the winning value will also show an example of how the prefix or p ![image](../images/translation/Screen-Shot-2021-06-10-at-22.07.00.png) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/language-specific.md b/docs/site/translation/language-specific.md index 2ac8296b1a8..f0ba3133f37 100644 --- a/docs/site/translation/language-specific.md +++ b/docs/site/translation/language-specific.md @@ -10,4 +10,3 @@ The following pages have guidance for specific languages: - [Odia](https://cldr.unicode.org/translation/language-specific/odia) - [Persian](https://cldr.unicode.org/translation/language-specific/persian) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/language-specific/lakota.md b/docs/site/translation/language-specific/lakota.md index 00ffdd6126a..fd09c46b17c 100644 --- a/docs/site/translation/language-specific/lakota.md +++ b/docs/site/translation/language-specific/lakota.md @@ -16,4 +16,3 @@ Please use the following forms of the non-A-Z letters when entering data for Lak | Standard digraphs using special letters (you can use these or combine the single letters above) | aŋ čh čʼ iŋ kȟ kʼ pȟ pʼ tȟ tʼ uŋ | | Additional digraphs (you can use these or combine the single letters above) | ȟʼ sʼ šʼ | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/language-specific/odia.md b/docs/site/translation/language-specific/odia.md index bceb78209f0..73def57dade 100644 --- a/docs/site/translation/language-specific/odia.md +++ b/docs/site/translation/language-specific/odia.md @@ -6,7 +6,7 @@ title: Odia **Translation approach - transliteration and diacritics in Odia** -New agreement for Odia 
Translation guide: +New agreement for Odia Translation guide: 1. Avoid the use of diacritics when transliteration is required in Oriya - diacritics can be easily understood by Oriya well-versed users, but plain transliteration (without diacritics) is more common and preferred. @@ -24,4 +24,3 @@ Follow the General Guidelines for Country/region names: 1. That said and given the fact that diacritics change pronunciation in Oriya (for example, ଲଣ୍ଡନ୍ will be pronounced as London but ଲଣ୍ଡନ will be pronounced as Londonaw), a transliteration approach with regular adoption of diacritics could potentially trigger confusion among not-well-versed users. 2. With these considerations in mind and with the goal of achieving consistency across categories and companies, Google linguists are open to the introduction of a general translation guideline in favor of transliteration without adoption of diacritics -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/language-specific/persian.md b/docs/site/translation/language-specific/persian.md index 3eddc135022..d90e4e479b9 100644 --- a/docs/site/translation/language-specific/persian.md +++ b/docs/site/translation/language-specific/persian.md @@ -59,4 +59,3 @@ TBD TBD -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/miscellaneous-displaying-lists.md b/docs/site/translation/miscellaneous-displaying-lists.md index d9294fef99c..28dd2d89ae9 100644 --- a/docs/site/translation/miscellaneous-displaying-lists.md +++ b/docs/site/translation/miscellaneous-displaying-lists.md @@ -23,4 +23,4 @@ There may be some variance within the language. For example, there are two diffe Pick the most neutral formulation you can, so that it works with as many kinds of noun phrases as possible. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/translation/miscellaneous-person-name-formats.md b/docs/site/translation/miscellaneous-person-name-formats.md index 6dc3d96daa6..23e00f9449d 100644 --- a/docs/site/translation/miscellaneous-person-name-formats.md +++ b/docs/site/translation/miscellaneous-person-name-formats.md @@ -18,4 +18,3 @@ The main topics are: - [Sample Name Fields For X](https://docs.google.com/document/d/1mjxIHsb97Og8ub6BKWxOihcHz7zjU4GdFkIxWHGAtes/edit#heading=h.qr8y56vvjgr8) - [Name Patterns](https://docs.google.com/document/d/1mjxIHsb97Og8ub6BKWxOihcHz7zjU4GdFkIxWHGAtes/edit#heading=h.3me9x0ulvxtz) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/number-currency-formats.md b/docs/site/translation/number-currency-formats.md index c8260b1c9a5..e5e7336ba68 100644 --- a/docs/site/translation/number-currency-formats.md +++ b/docs/site/translation/number-currency-formats.md @@ -8,4 +8,3 @@ title: Number & Currency formats - [Number symbols](https://cldr.unicode.org/translation/number-currency-formats/number-symbols) - [Other patterns](https://cldr.unicode.org/translation/number-currency-formats/other-patterns) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/number-currency-formats/number-and-currency-patterns.md b/docs/site/translation/number-currency-formats/number-and-currency-patterns.md index 05ca3e241e4..2909f2e78bf 100644 --- a/docs/site/translation/number-currency-formats/number-and-currency-patterns.md +++ b/docs/site/translation/number-currency-formats/number-and-currency-patterns.md @@ -43,7 +43,7 @@ Whenever any of these symbols are in the English pattern, they **must be retaine - Move the the currency symbol (¤) or percent sign (%), if it is used in a different position. 
- To deal with CLDR's default automatic space handling in place for currency symbol when using a currency code(e.g. USD): - Do NOT add a space ¤#,##0.### for result: $12 and USD 12. - - ADD a manual space ¤ #,##0.### for result: $ 12 and USD 12. + - ADD a manual space ¤ #,##0.### for result: $ 12 and USD 12. - Always verify with examples in the right information pane and see how the data from number symbols are used in formatting numbers. The final results of the number formatting will show the correct symbols for decimal, grouping, etc... from [Number Symbols](https://st.unicode.org/cldr-apps/v#/USER/Symbols/) in your locale. - For bidi scripts (e.g. Arabic and Hebrew) you may need to add directionality markers (U+200E (\ LEFT-TO-RIGHT MARK, U+200F \ RIGHT-TO-LEFT MARK, U+061C \ ARABIC LETTER MARK) - For number formats in bidi scripts, the Survey Tool shows examples in both a right-to-left context and a neutral context (with both positive and negative numeric values). In the future it may show examples in a left-to-right contex as well. @@ -103,7 +103,7 @@ There are also patterns for compact forms of numbers, such as the such as "1M" a | 14-digit-short-one (10000000000000) | **10,000,000,000,000** | 00T | 12T | 00兆 | 12兆 | | 15-digit-short-one (100000000000000) | **100,000,000,000,000** | 000T | 123T | 000兆 | 123兆 | - + When computer programs use CLDR, the number of decimals can be changed by computer programs according to the task its designed for. For example, the pattern for 10,000 in the table below (00K for English, 0万 for Japanese) may be modified to have more or fewer decimals — it could be changed to have 3 digits of accuracy: 00.0K for English, 0.00万 for Japanese. 
💡 **Translation Tips** @@ -128,4 +128,3 @@ X digit-two Xdigit-other -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/number-currency-formats/number-symbols.md b/docs/site/translation/number-currency-formats/number-symbols.md index 136dcdbe155..acd4906eddf 100644 --- a/docs/site/translation/number-currency-formats/number-symbols.md +++ b/docs/site/translation/number-currency-formats/number-symbols.md @@ -31,4 +31,3 @@ For English regional locales (e.g. en\_DE) where English is not the primary lang ![image](../../images/number-currency-formats/number-symbol.JPG) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/number-currency-formats/other-patterns.md b/docs/site/translation/number-currency-formats/other-patterns.md index 4dedb160ac8..ba05d3e13ed 100644 --- a/docs/site/translation/number-currency-formats/other-patterns.md +++ b/docs/site/translation/number-currency-formats/other-patterns.md @@ -10,4 +10,3 @@ The atLeast pattern is used to indicate a number that falls within a range with The **range** pattern is used to indicate a range of numbers. -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/time-zones-and-city-names.md b/docs/site/translation/time-zones-and-city-names.md index 5c80103d9ac..2a08b5ab3b2 100644 --- a/docs/site/translation/time-zones-and-city-names.md +++ b/docs/site/translation/time-zones-and-city-names.md @@ -76,4 +76,3 @@ The city name may also be used in formatted times, such as: City names must be unique. See [Country/Region Names](https://cldr.unicode.org/translation/displaynames/countryregion-territory-names) for techniques. 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/transforms.md b/docs/site/translation/transforms.md index 3bbd0f3f21d..6dc7f9194a6 100644 --- a/docs/site/translation/transforms.md +++ b/docs/site/translation/transforms.md @@ -14,7 +14,7 @@ Transforms describe ways of converting text. Most often these are transliteratio For those, the name of the language or script is used. -There are a few others that have special purposes, listed below. Note that whatever translation is used, it should be short (a few words at most). +There are a few others that have special purposes, listed below. Note that whatever translation is used, it should be short (a few words at most). For the specialized acronyms (marked with \*): @@ -34,4 +34,3 @@ For the specialized acronyms (marked with \*): | Fullwidth | Full-width or "wide" characters, such as A and ォ | | Halfwidth | Half-width or "narrow" characters, such as A and ォ | -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/translation-guide-general.md b/docs/site/translation/translation-guide-general.md index 2fdc19abe20..ccc16026f24 100644 --- a/docs/site/translation/translation-guide-general.md +++ b/docs/site/translation/translation-guide-general.md @@ -8,4 +8,4 @@ title: General translation Guide - [Default Content](https://cldr.unicode.org/translation/translation-guide-general/default-content) - [References](https://cldr.unicode.org/translation/translation-guide-general/references) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) + diff --git a/docs/site/translation/translation-guide-general/capitalization.md b/docs/site/translation/translation-guide-general/capitalization.md index 965777ded99..dfcc1d6ae7f 100644 --- a/docs/site/translation/translation-guide-general/capitalization.md +++ b/docs/site/translation/translation-guide-general/capitalization.md @@ 
-29,4 +29,3 @@ However, it is also important to ensure that there is consistent casing for all To provide warnings when the capitalization of an item differs from what is intended for items in a given category, the Survey Tool now checks capitalization of items against the \ within the \ element; data for this comes from xml files in the CLDR common/casing/ directory. This data cannot be changed using the Survey Tool; if it is incorrect, please file a bug (initial data was created based on the predominant capitalization of items in each category within a locale, and may be wrong). -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/translation-guide-general/default-content.md b/docs/site/translation/translation-guide-general/default-content.md index 07ac69320c0..684fbe8e935 100644 --- a/docs/site/translation/translation-guide-general/default-content.md +++ b/docs/site/translation/translation-guide-general/default-content.md @@ -4,11 +4,11 @@ title: Default Content # Default Content -Locales are primarily identified by their ***base*** language. For example, English \[en], Arabic \[ar] or German \[de]. +Locales are primarily identified by their ***base*** language. For example, English \[en], Arabic \[ar] or German \[de]. We also label scripts explicitly, where a language is typically written in multiple scripts, such as Cyrillic or Latin. For example, Serbian (Cyrillic) \[sr\_Cyrl] and Serbian (Latin) \[sr\_Latn]. -Each language \+ script combination is treated as a unit. (i.e. People do not mix different script in the same data set.) +Each language \+ script combination is treated as a unit. (i.e. People do not mix different script in the same data set.) If a language is ***not*** typically written in multiple scripts, then the script sub\-tag is omitted. For example, en\_US or ko\_KR. 
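
To illustrate the locale identifiers discussed in the `default-content.md` hunk above (for example `sr_Cyrl`, `en_US`, `es_419`), here is a minimal Python sketch that splits an identifier into base language, optional script, and optional region. The names `split_locale` and `LocaleParts` are hypothetical and not part of CLDR. The sketch assumes only the common subtag shapes (a four-letter script, a two-letter or three-digit region) and ignores variants and extensions.

```python
import re
from typing import NamedTuple, Optional


class LocaleParts(NamedTuple):
    """Hypothetical container for the pieces of a locale identifier."""
    language: str
    script: Optional[str]
    region: Optional[str]


# Assumed subtag shapes: language = 2-3 letters, script = 4 letters,
# region = 2 letters or 3 digits. Variants and extensions are not handled.
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:[_-](?P<script>[A-Za-z]{4}))?"
    r"(?:[_-](?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_locale(locale_id: str) -> LocaleParts:
    """Split an identifier such as 'sr_Cyrl' or 'en_US' into its parts."""
    match = _LOCALE_RE.match(locale_id)
    if match is None:
        raise ValueError(f"unsupported locale identifier: {locale_id!r}")
    return LocaleParts(match["language"], match["script"], match["region"])
```

With this sketch, `split_locale("sr_Cyrl")` yields a script of `Cyrl` and no region, while `split_locale("en_US")` yields no script and a region of `US`, matching the rule that the script subtag is omitted when a language is not typically written in multiple scripts.
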
@@ -35,4 +35,3 @@ For example: - Spanish (Mexico) \[es\_MX] differences from Spanish (Latin America) \[es\_419] - Arabic (Egypt) \[ar\_EG] that are different from Arabic (World) \[ar\_001] -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/translation-guide-general/references.md b/docs/site/translation/translation-guide-general/references.md index 78b38bfe7eb..39d114b9ac0 100644 --- a/docs/site/translation/translation-guide-general/references.md +++ b/docs/site/translation/translation-guide-general/references.md @@ -52,7 +52,7 @@ For other languages, there should be similar guides for major publications. ### Collation - http://www.omniglot.com/writing/ - - http://www.alphabets-world.com/ + - http://www.alphabets-world.com/ - https://developer.mimer.com ### Dates and Times @@ -70,4 +70,3 @@ For other languages, there should be similar guides for major publications. - [ISO\-15915 (Kannada)](http://ee.www.ee/transliteration/pdf/Kannada.pdf) - [ISCII\-91](http://www.cdacindia.com/html/gist/down/iscii_d.asp) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/unique-translations.md b/docs/site/translation/unique-translations.md index 1ccb954f78d..154824114ca 100644 --- a/docs/site/translation/unique-translations.md +++ b/docs/site/translation/unique-translations.md @@ -10,4 +10,3 @@ See the following based on the type of translation - [Unique Emoji and Symbol Names](https://cldr.unicode.org/translation/characters/short-names-and-keywords) - [Unique Names](https://cldr.unicode.org/translation/displaynames/countryregion-territory-names#h.xkmg2o42dw29) (This is on the Country/Region page, but is the most general discussion). 
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/units.md b/docs/site/translation/units.md index fe52a6e2b1e..10ee380b396 100644 --- a/docs/site/translation/units.md +++ b/docs/site/translation/units.md @@ -7,4 +7,3 @@ title: Units - [Measurement systems](https://cldr.unicode.org/translation/units/measurement-systems) - [Unit Names and Patterns](https://cldr.unicode.org/translation/units/unit-names-and-patterns) -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file diff --git a/docs/site/translation/units/measurement-systems.md b/docs/site/translation/units/measurement-systems.md index 89f1e6c4cca..9d9c2d50fdc 100644 --- a/docs/site/translation/units/measurement-systems.md +++ b/docs/site/translation/units/measurement-systems.md @@ -14,5 +14,4 @@ There are special versions of "Yes" and "No" used for POSIX. Please supply the w | US | US | The measurement system used in the US, with miles, feet, etc. A gallon is approximately 3.79 liters. | | UK | UK | The measurement system traditionally used in the UK, with miles, feet, etc. A gallon is the imperial gallon, approximately 4.55 liters. | - -![Unicode copyright](https://www.unicode.org/img/hb_notice.gif) \ No newline at end of file + diff --git a/docs/site/translation/units/unit-names-and-patterns.md b/docs/site/translation/units/unit-names-and-patterns.md index 147d27b9f03..6978dd934df 100644 --- a/docs/site/translation/units/unit-names-and-patterns.md +++ b/docs/site/translation/units/unit-names-and-patterns.md @@ -73,11 +73,11 @@ In many languages, the "per Y" part is inflected, and the dividing unit can't be ### Long Power -If your language is inflected for case or gender: +If your language is inflected for case or gender: - **No inflected alternatives.** If it doesn't list inflected alternatives for square or cubic yet, choose the most neutral form inflection. 
For many locales, an abbreviated form may work the best, so that there is no visible inflection.
 - **Inflected alternatives.** If it does list inflected alternatives, you should look at some of the compound units with "square" and "cubic" that are already translated, to see how to translate power2 and power3. For example, for English, we see
- - Length / Kilometer / long-other => {0} kilometers
+ - Length / Kilometer / long-other => {0} kilometers
 - Area / Square-Kilometer / long-other => {0} square kilometers
 - The pattern for power2 should be constructed so that if you take the word for "kilometers" and substituted it into the pattern, you get "square kilometers". So let's take an example from French:
@@ -86,11 +86,11 @@ If your language is inflected for case or gender:
 - So the appropriate pattern for power2 would be:
 3. [https://st.unicode.org/cldr-apps/v#/fr/CompoundUnits/15b049cba8052719](https://st.unicode.org/cldr-apps/v#/fr/CompoundUnits/15b049cba8052719) => **{0} carrés**
-- If we were to substitute "kilomètres" from the pattern in #1 into the pattern in #3, we would get "kilomètres carrés", which appears in pattern #2.
+- If we were to substitute "kilomètres" from the pattern in #1 into the pattern in #3, we would get "kilomètres carrés", which appears in pattern #2.
 ### Fallback Format: two units
-Some units are formed by combining other units. The most common of this is X per Y, such as "miles per hour". There is a "per" pattern that is used for this. For example, "{0} per {1}" might get replaced by "*10 meters* **per** *second*".
+Some units are formed by combining other units. The most common of this is X per Y, such as "miles per hour". There is a "per" pattern that is used for this. For example, "{0} per {1}" might get replaced by "*10 meters* **per** *second*".
 Given an amount, and two units, the process uses the available patterns to put together a result, as described on [perUnitPatterns](http://www.unicode.org/reports/tr35/tr35-general.html#perUnitPatterns). (e.g. "3 kilograms" + "{0} per second" → "3 kilograms per second")
@@ -114,7 +114,7 @@ The measurements for *points, dots,* and *pixels* may be confusing. A *point* is
 If the natural word for both "point" and "dot" is the same, such as *punkt*, then there are a few different options to solve the conflict. Italic will be used for native words.
-**Changing the name for *point*.**
+**Changing the name for *point*.**
 1. Use the equivalent of “*punkt length*” in your language for **point**.
 2. Use the equivalent of “*typographic punkt*” in your language for **point**.
@@ -129,4 +129,3 @@ If the natural word for both "point" and "dot" is the same, such as *punkt*, the
 A few languages have special words for **year, month, week,** or **day** when they are used in context of a person's age. Other languages may simply use the same terms for each one, and do not require separate translation.
-![Unicode copyright](https://www.unicode.org/img/hb_notice.gif)
\ No newline at end of file
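Reviewer note on the unchanged context in `unit-names-and-patterns.md`: the placeholder substitution that the "Long Power" and "Fallback Format" text describes can be sketched in a few lines. This is only an illustration of the idea, not the CLDR or ICU implementation; the helper name `apply_pattern` is hypothetical, and the sketch assumes patterns contain only simple numbered placeholders such as `{0}` and `{1}`.

```python
import re


def apply_pattern(pattern: str, *args: str) -> str:
    """Fill numbered placeholders ({0}, {1}, ...) in a pattern.

    Hypothetical helper for illustration only; real CLDR formatting
    also handles plural categories, case, and gender.
    """
    def substitute(match: re.Match) -> str:
        # The captured digits select which argument to insert.
        return args[int(match.group(1))]

    return re.sub(r"\{(\d+)\}", substitute, pattern)


# power2 check from the French example: the unit word goes into the pattern.
french_square = apply_pattern("{0} carrés", "kilomètres")

# "per" pattern with an amount-plus-unit and a second unit.
per_result = apply_pattern("{0} per {1}", "10 meters", "second")

# perUnitPattern-style example quoted in the text.
per_second = apply_pattern("{0} per second", "3 kilograms")
```

Substituting through a single regular-expression pass, rather than chained string replacements, avoids accidentally re-expanding a placeholder that happens to appear inside an already inserted argument.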