Commit

CLDR-17423 Placeholder for MessageFormat 2.0 spec, part 9
- final spec will be poured in as CLDR-17424
srl295 committed Feb 28, 2024
1 parent 8ec0988 commit cf96b52
Showing 9 changed files with 84 additions and 7 deletions.
1 change: 1 addition & 0 deletions docs/ldml/tr35-collation.md
@@ -44,6 +44,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 5, Collation</a>

1 change: 1 addition & 0 deletions docs/ldml/tr35-dates.md
@@ -39,6 +39,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 4, Dates</a>

3 changes: 2 additions & 1 deletion docs/ldml/tr35-general.md
@@ -44,6 +44,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 2, General</a>

@@ -2643,7 +2644,7 @@ Many emoji are represented by sequences of characters. When there are no `annota
1. If **sequence** is an **emoji flag sequence**, look up the territory name in CLDR for the corresponding ASCII characters and return as the short name. For example, the regional indicator symbols P+F would map to “Französisch-Polynesien” in German.
2. If **sequence** is an **emoji tag sequence**, look up the subdivision name in CLDR for the corresponding ASCII characters and return as the short name. For example, the TAG characters gbsct would map to “Schottland” in German.
3. If **sequence** is a keycap sequence or 🔟, use the characterLabel for "keycap" as the **prefixName** and set the **suffix** to be the sequence (or "10" in the case of 🔟), then go to step 8.
4. If the **sequence** ends with the string ZWJ + ➡️, look up the name of that sequence with that string removed. Embed that name into the "facing-right" characterLabelPattern and return it.
4. If the **sequence** ends with the string ZWJ + ➡️, look up the name of that sequence with that string removed. Embed that name into the "facing-right" characterLabelPattern and return it.
5. Let **suffix** and **prefixName** be "".
6. If **sequence** contains any emoji modifiers, move them (in order) into **suffix**, removing them from **sequence**.
7. If **sequence** is a "KISS", "HEART", "FAMILY", or "HOLDING HANDS" emoji ZWJ sequence, move the characters in **sequence** to the front of **suffix**, and set the **sequence** to be "💏", "💑", or "👪" respectively, and go to step 7.
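Steps 1 and 2 above both depend on recovering "the corresponding ASCII characters" from an emoji sequence. A minimal Python sketch of that mapping follows, assuming the code point layout defined by Unicode (regional indicators starting at U+1F1E6, TAG characters mirroring ASCII at an offset of U+E0000). The function names `flag_to_region` and `tag_sequence_to_subdivision` are hypothetical, and the CLDR name lookup itself is left out.

```python
from typing import Optional

REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
BLACK_FLAG = 0x1F3F4            # WAVING BLACK FLAG, base of an emoji tag sequence
TAG_OFFSET = 0xE0000            # TAG characters mirror ASCII at this offset
CANCEL_TAG = 0xE007F            # CANCEL TAG, terminates an emoji tag sequence


def flag_to_region(sequence: str) -> Optional[str]:
    """Return the two-letter region code for an emoji flag sequence, else None."""
    cps = [ord(ch) for ch in sequence]
    if len(cps) != 2:
        return None
    if not all(REGIONAL_INDICATOR_A <= cp <= REGIONAL_INDICATOR_A + 25 for cp in cps):
        return None
    return "".join(chr(cp - REGIONAL_INDICATOR_A + ord("A")) for cp in cps)


def tag_sequence_to_subdivision(sequence: str) -> Optional[str]:
    """Return the ASCII subdivision code for an emoji tag sequence, else None."""
    cps = [ord(ch) for ch in sequence]
    if len(cps) < 3 or cps[0] != BLACK_FLAG or cps[-1] != CANCEL_TAG:
        return None
    tags = cps[1:-1]
    # Only TAG SPACE..TAG TILDE correspond to printable ASCII characters.
    if not all(0xE0020 <= cp <= 0xE007E for cp in tags):
        return None
    return "".join(chr(cp - TAG_OFFSET) for cp in tags)
```

With these helpers, the regional indicator pair P+F yields `PF` and the TAG characters for Scotland yield `gbsct`. Those codes would then be used as keys into the territory and subdivision names of the target locale.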
1 change: 1 addition & 0 deletions docs/ldml/tr35-info.md
@@ -44,6 +44,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 6, Supplemental</a>

1 change: 1 addition & 0 deletions docs/ldml/tr35-keyboards.md
@@ -46,6 +46,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 7, Keyboards</a>

69 changes: 69 additions & 0 deletions docs/ldml/tr35-messageFormat.md
@@ -0,0 +1,69 @@
## Unicode Technical Standard #35

# Unicode Locale Data Markup Language (LDML)<br/>Part 9: Message Format

|Version|45 (draft) |
|-------|------------------------|
|Editors|Addison Phillips and [other CLDR committee members](tr35.md#Acknowledgments)|

For the full header, summary, and status, see [Part 1: Core](tr35.md).

### _Summary_

This specification defines the data model, syntax, processing, and conformance requirements for the next generation of dynamic messages.

This is a partial document, describing only those parts of the LDML that are relevant for message format. For the other parts of the LDML see the [main LDML document](tr35.md) and the links above.

### _Status_

_This is a draft document which may be updated, replaced, or superseded by other documents at any time.
Publication does not imply endorsement by the Unicode Consortium.
This is not a stable document; it is inappropriate to cite this document as other than a work in progress._

<!-- _This document has been reviewed by Unicode members and other interested parties, and has been approved for publication by the Unicode Consortium.
This is a stable document and may be used as reference material or cited as a normative reference by other specifications._ -->

> _**A Unicode Technical Standard (UTS)** is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS._
_Please submit corrigenda and other comments with the CLDR bug reporting form [[Bugs](tr35.md#Bugs)]. Related information that is useful in understanding this document is found in the [References](tr35.md#References). For the latest version of the Unicode Standard see [[Unicode](tr35.md#Unicode)]. For a list of current Unicode Technical Reports see [[Reports](tr35.md#Reports)]. For more information about versions of the Unicode Standard, see [[Versions](tr35.md#Versions)]._

## Parts

The LDML specification is divided into the following parts:

* Part 1: [Core](tr35.md#Contents) (languages, locales, basic structure)
* Part 2: [General](tr35-general.md#Contents) (display names & transforms, etc.)
* Part 3: [Numbers](tr35-numbers.md#Contents) (number & currency formatting)
* Part 4: [Dates](tr35-dates.md#Contents) (date, time, time zone formatting)
* Part 5: [Collation](tr35-collation.md#Contents) (sorting, searching, grouping)
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents">Contents of Part 9, Message Format</a>

* [CLDR Message Format](#cldr-message-format)
* [Introduction](#introduction)
* [Status](#status)

## CLDR Message Format

### Introduction

This specification defines the data model, syntax, processing, and conformance requirements for the next generation of dynamic messages. It is intended for adoption by programming languages and APIs. This will enable the integration of existing internationalization APIs (such as the date and number formats shown above), grammatical matching (such as plurals or genders), as well as user-defined formats and message selectors.

### Status

The Message Format 2.0 Specification has been approved by the CLDR-TC for inclusion in CLDR version 45.
The specification will be included in this page prior to release.

In the interim, the current draft specification may be accessed at:

<https://github.com/unicode-org/message-format-wg/blob/LDML45-alpha/spec/README.md>

* * *

Copyright © 2001–2024 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information or programs contained or accompanying this technical report. The Unicode [Terms of Use](https://www.unicode.org/copyright.html) apply.

Unicode and the Unicode logo are trademarks of Unicode, Inc., and are registered in some jurisdictions.
1 change: 1 addition & 0 deletions docs/ldml/tr35-numbers.md
@@ -39,6 +39,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 3, Numbers</a>

1 change: 1 addition & 0 deletions docs/ldml/tr35-personNames.md
@@ -39,6 +39,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents">Contents of Part 8, Person Names</a>

13 changes: 7 additions & 6 deletions docs/ldml/tr35.md
@@ -51,6 +51,7 @@ The LDML specification is divided into the following parts:
* Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
* Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)
* Part 8: [Person Names](tr35-personNames.md#Contents) (person names)
* Part 9: [Message Format](tr35-messageFormat.md#Contents) (message format)

## <a name="Contents" href="#Contents">Contents of Part 1, Core</a>

@@ -1771,7 +1772,7 @@ _Examples:_
When the component does not occur, that is referred to as the ‘main’ component.
Otherwise the component value typically corresponds to an element and its children, such as ‘collations’ or ‘plurals’.

The basic inheritance model for locales of the form <lang>_<script>_<region>_<variant1>_…<variantN> is to truncate from the end. That is,
The basic inheritance model for locales of the form <lang>_<script>_<region>_<variant1>_…<variantN> is to truncate from the end. That is,
remove the _u and _t extensions, then remove the last _ and following tag, then restore the extensions.
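The truncation rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function name `truncation_parent` is hypothetical, extensions are assumed to begin at the first `_u_` or `_t_` singleton and run to the end of the identifier, and a bare language is assumed to fall back to `root`.

```python
import re
from typing import Optional

# Assumption: the extension part starts at the first "_u_" or "_t_" singleton.
_EXTENSION_START = re.compile(r"_[ut]_", re.IGNORECASE)


def truncation_parent(locale: str) -> Optional[str]:
    """Return the parent of `locale` by truncation, or None for root."""
    if locale == "root":
        return None

    # Step 1: remove the _u and _t extensions.
    match = _EXTENSION_START.search(locale)
    if match:
        base = locale[:match.start()]
        extensions = locale[match.start():]
    else:
        base = locale
        extensions = ""

    # Assumption: once only the language is left, the parent is root.
    if "_" not in base:
        return "root"

    # Step 2: remove the last "_" and the tag that follows it.
    truncated = base.rsplit("_", 1)[0]

    # Step 3: restore the extensions.
    return truncated + extensions
```

For instance, `truncation_parent("sr_Cyrl_RS")` gives `sr_Cyrl`, and `truncation_parent("de_DE_u_co_phonebk")` gives `de_u_co_phonebk`, since the extension is set aside before truncating and reattached afterwards.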

For example
@@ -1797,15 +1798,15 @@ The `parentLocale` element is used to override the normal inheritance when acces
For case 1, there is a special attribute and value, `localeRules="nonlikelyScript"`,
which specifies **all locales** of the form <lang>_<script>, wherever the <script> is **not** the likely script for <lang>.
For migration, the previous short list of locales (a subset of the nonlikelyScript locales) is retained,
but those locales are slated for removal in the future.
but those locales are slated for removal in the future.
For example, `ru_Latn` is not included in the short list but is included (programmatically) in the rule.

```xml
<parentLocale parent="root" localeRules="nonlikelyScript" locales="az_Arab az_Cyrl bal_Latn … yue_Hans zh_Hant"/>
```

The `localeRules` attribute is used, for example, for the main component.
It is not used for components where text is not mixed,
It is not used for components where text is not mixed,
such as the collations component or the plurals component.

For case 2, the children and parent share the same primary language, but the region is changed. For example:
@@ -1814,7 +1815,7 @@ For case 2, the children and parent share the same primary language, but the reg
<parentLocale parent="es_419" locales="es_AR es_BO … es_UY es_VE"/>
```

There are certain components that require addenda to the common parent fallback rules.
There are certain components that require addenda to the common parent fallback rules.
For a locale like `zh_Hant` in the example above, the `parentLocale` element would dictate the parent as `root` when referring to main locale data,
but for collation data, the parent locale should still be `zh`,
even though the `parentLocale` element is present for that locale.
@@ -1827,9 +1828,9 @@ To address this, components can have their own fallback rules that inherit from
```

Note: When components were first introduced, the component-specific parent locales were merged with the main parent locales.
This was determined to be an error, and the component-specific parent locales are now not merged, but are treated as stand-alone.
This was determined to be an error, and the component-specific parent locales are now not merged, but are treated as stand-alone.
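The stand-alone treatment of component-specific parent locales can be sketched as a lookup that consults exactly one override table and otherwise truncates. This is only an illustration: `parent_locale`, `MAIN_PARENTS`, and `COLLATION_PARENTS` are hypothetical names, the main table holds just a few entries taken from the examples above, and the collation table is assumed to contain no override for `zh_Hant`.

```python
from typing import Dict, Optional

# Illustrative excerpts only; the real data lives in CLDR supplemental data.
MAIN_PARENTS: Dict[str, str] = {
    "es_AR": "es_419",
    "es_BO": "es_419",
    "zh_Hant": "root",
}

# Assumption: the collation component declares no override for zh_Hant.
COLLATION_PARENTS: Dict[str, str] = {}


def parent_locale(locale: str, overrides: Dict[str, str]) -> Optional[str]:
    """Return the parent of `locale` using one stand-alone override table.

    The caller passes either the main table or a component's own table.
    The two are never merged.
    """
    if locale == "root":
        return None
    if locale in overrides:
        return overrides[locale]
    # No override: fall back to plain truncation (extensions ignored here).
    if "_" in locale:
        return locale.rsplit("_", 1)[0]
    return "root"
```

Under these assumptions, `parent_locale("zh_Hant", MAIN_PARENTS)` returns `root`, while `parent_locale("zh_Hant", COLLATION_PARENTS)` returns `zh`, matching the behavior described for collation data.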

Since parentLocale information is not localizable on a per-locale basis,
Since parentLocale information is not localizable on a per-locale basis,
the parentLocale information is contained in CLDR’s [supplemental data](tr35-info.md).

When a `parentLocale` element is used to override normal inheritance, the following guidelines apply in most cases:
