A name-based invariant about Lowercase (#738)
* An invariant about Lowercase

* Test the modifier letters too

* Use the decompositions

* bad parser

* that is just wrong.

* Typo

Co-authored-by: Markus Scherer <[email protected]>

eggrobin and markusicu authored Mar 13, 2024
1 parent 6f0c77d commit 2cbaf67
Showing 1 changed file with 10 additions and 0 deletions.
@@ -586,12 +586,22 @@ Let $nonAlphabeticAvagrahas = [\N{TIBETAN MARK PALUTA}] # A punctuation mark.
[\p{InSC=Avagraha} - $nonAlphabeticAvagrahas] ⊆ \p{Alphabetic}

# Name-based checks.
Let $nonLowercaseSmallLetters = [ \p{name=/^LIMBU SMALL LETTER/} \N{TURNED GREEK SMALL LETTER IOTA} \p{name=/^(SQUARED|PARENTHESIZED|TAG) LATIN SMALL LETTER/} ]
Let $nonLowercaseSmallModifierLetters = [ \p{gc=Lm} & \p{name=/^ARABIC SMALL/} ]
[ \p{name=/\bSMALL LETTER\b/}-\p{gc=Mn}-\p{gc=Lt} - $nonLowercaseSmallLetters ] ⊆ \p{Lowercase}
[ [\p{gc=Lm} & \p{name=/SMALL/}] - $nonLowercaseSmallModifierLetters ] ⊆ \p{Lowercase}

# Combining letters are often alphabetic (medievalist abbreviations).
# The others are diacritic (cantillation marks, phonetics).
# See 177-C52.
\p{name=/COMBINING .* LETTER/} ⊆ [\p{Alphabetic}\p{Diacritic}]

## Consistency of Lowercase with decompositions.
# Note that the same is not true of Uppercase.
# A non-lowercase character has non-lowercase characters in its decomposition,
# or its decomposition is <square> (㋍ etc.).
In [\P{Lowercase} - \p{dt=square}], \p{Lowercase} * toNFKD ≠ toNFKD

## Joining_Type and Joining_Group
# Where defined, the Joining_Group refines the Joining_Type.
OnPairsOf \P{Joining_Group=No_Joining_Group}, EqualityOf Joining_Group ⇒ EqualityOf Joining_Type
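
The two Lowercase invariants added in this hunk can be approximated outside the invariants test harness. The following is a rough Python sketch, not the project's implementation. It assumes that CPython's `str.islower()` on a single character reflects the Lowercase derived property of the Unicode version bundled with the interpreter; the helper names (`is_lowercase`, `small_letter_violations`, `decomposition_violations`) are hypothetical. The second `Let` (modifier letters) is not covered.

```python
import re
import sys
import unicodedata

SMALL_LETTER = re.compile(r"\bSMALL LETTER\b")
# Exceptions mirroring $nonLowercaseSmallLetters in the diff.
EXCEPTION_NAME = re.compile(
    r"^LIMBU SMALL LETTER|^(SQUARED|PARENTHESIZED|TAG) LATIN SMALL LETTER"
)
EXCEPTION_CHARS = {"\N{TURNED GREEK SMALL LETTER IOTA}"}


def is_lowercase(ch):
    # Assumption: for a single character, str.islower() follows the
    # Lowercase derived property of the interpreter's Unicode data.
    return ch.islower()


def small_letter_violations():
    """Characters named '... SMALL LETTER ...' that are not Lowercase,
    after removing gc=Mn, gc=Lt and the listed exceptions."""
    violations = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, "")
        if not SMALL_LETTER.search(name):
            continue
        if unicodedata.category(ch) in ("Mn", "Lt"):
            continue
        if ch in EXCEPTION_CHARS or EXCEPTION_NAME.search(name):
            continue
        if not is_lowercase(ch):
            violations.append(ch)
    return violations


def decomposition_violations():
    """Non-lowercase characters (other than <square> decompositions)
    whose NFKD form consists only of Lowercase characters."""
    violations = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if is_lowercase(ch):
            continue
        if unicodedata.decomposition(ch).startswith("<square>"):
            continue
        nfkd = unicodedata.normalize("NFKD", ch)
        if all(is_lowercase(c) for c in nfkd):
            violations.append(ch)
    return violations


if __name__ == "__main__":
    print("Unicode version:", unicodedata.unidata_version)
    print("small-letter violations:", len(small_letter_violations()))
    print("decomposition violations:", len(decomposition_violations()))
```

The counts depend on the Unicode version that ships with the interpreter, so an older Python may report characters that the property data at this commit already handles.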