make 113CF TULU-TIGALARI SIGN LOOPED VIRAMA a tertiary variant of 113CE TULU-TIGALARI SIGN VIRAMA

For PAG issue 192 "UCA 16: virama variants primary vs. tertiary"

From Ken:

UCA 16.0 delta 16

I've made the single point fix to DUCET to make 113CF TULU-TIGALARI SIGN
LOOPED VIRAMA a tertiary variant of 113CE TULU-TIGALARI SIGN VIRAMA.

The bad news is that I went looking for consistency in the other virama
characters in other scripts, and it turns out that will be very hard to
make completely consistent if the criterion is based on same or
different functions in IndicSyllabicCategory.txt.

Problem areas for Robin to muse over:

Malayalam: 0D4D is a virama, 0D3B, 0D3C are pure_killer. Right now 0D3B
and 0D3C are tertiary variants of 0D4D, which would not be consistent if
we are trying to give separate primary weights to viramas versus
pure_killer.

Tagalog has two pure killers: 1714, 1715. Currently they have separate
primary weights, which is inconsistent with the principle, but it seems
wrong to treat these two as presentation variants of each other.

Batak also has two pure killers: 1BF2, 1BF3. I can't figure out from the
documentation whether these are presentation variants of each other, or not.

Thai has two pure killers: 0E3A, 0E4E. The second is yamakkan, but the
corresponding Lao yamakkan is treated as a syllable_modifier, not a
pure_killer, so I'm not sure which way to go on that one.

The Kirat Rai virama and saat sound like they should be separate, but
both are classified as pure_killer, so it isn't clear what the right
answer is there, either.

Basically, I think looking for complete consistency here is going to
hurt everybody's brains, with minimal ROI. I suggest we just sweep the
rest under the rug, take the Tulu-Tigalari case as a win, and go home.
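
The consistency criterion discussed above can be sketched as a small check. This is only an illustration, not part of the sifter tooling: the `Pair` class, `expected_relation`, and `inconsistent` are hypothetical names, the category labels and current weight relations are copied from the message for the two scripts where both are stated (Malayalam and Tagalog), and the rule "same category means variant, different category means separate primary" is one reading of the principle described there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    """One (base, other) pair of killer-like marks, as described in the message."""
    script: str
    base: str        # code point of the reference character
    other: str       # code point compared against the base
    base_insc: str   # category label as written in the message
    other_insc: str
    relation: str    # current DUCET relation per the message: "primary" or "tertiary"


def expected_relation(pair: Pair) -> str:
    # Assumed reading of the principle: same category -> tertiary variant,
    # different category -> separate primary weight.
    return "tertiary" if pair.base_insc == pair.other_insc else "primary"


def inconsistent(pairs):
    """Return the pairs whose current relation differs from the assumed rule."""
    return [p for p in pairs if p.relation != expected_relation(p)]


# Only cases where the message states both the categories and the current relation.
CASES = [
    Pair("Malayalam", "0D4D", "0D3B", "virama", "pure_killer", "tertiary"),
    Pair("Malayalam", "0D4D", "0D3C", "virama", "pure_killer", "tertiary"),
    Pair("Tagalog", "1714", "1715", "pure_killer", "pure_killer", "primary"),
]

if __name__ == "__main__":
    for p in inconsistent(CASES):
        print(f"{p.script}: {p.other} vs {p.base} is {p.relation}, "
              f"rule would suggest {expected_relation(p)}")
```

Under that reading, all three listed pairs are flagged, which matches the message's point that applying the criterion strictly would force changes that do not obviously improve the ordering.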
markusicu committed Mar 29, 2024
1 parent 5fbc535 commit cd1c4de
Showing 2 changed files with 18,387 additions and 18,387 deletions.
4 changes: 2 additions & 2 deletions c/uca/sifter/unidata.txt
@@ -9,7 +9,7 @@
# Default Unicode Collation Element Table (DUCET) for
# the Unicode Collation Algorithm.
#
-# Version 16.0.0 draft 14 (Unicode Version: 16.0.0)
+# Version 16.0.0 draft 16 (Unicode Version: 16.0.0)
# based on Unicode data file UnicodeData-16.0.0d13.txt
# Ordering for Unicode 16.0
#
@@ -23119,7 +23119,7 @@ CONTRACTION
DEFAULT

113CE;TULU-TIGALARI SIGN VIRAMA;Mn;;;;;;
-113CF;TULU-TIGALARI SIGN LOOPED VIRAMA;Mc;;;;;;
+113CF;TULU-TIGALARI SIGN LOOPED VIRAMA;Mc;<sort> 113CE;;;;;
113D0;TULU-TIGALARI CONJOINER;Mn;;;;;;
113C9;TULU-TIGALARI AU LENGTH MARK;Mc;;;;;;
113D3;TULU-TIGALARI SIGN PLUTA;Lo;;;;;;
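
To make the one-line change easier to spot-check, here is a minimal sketch that pulls `<sort>` mappings out of lines shaped like the ones in this hunk. It assumes semicolon-separated fields with the tag in the fourth field, as seen above; the function name `sort_variants` is hypothetical, and the interpretation of `<sort>` as "sorts as a variant of the listed code point" is inferred from the commit message rather than from the sifter documentation.

```python
def sort_variants(lines):
    """Map code point -> list of code points named in a <sort> tag.

    Assumes the field layout visible in the hunk: code;name;gc;tag;...
    Comment lines, blank lines, and section keywords are skipped.
    """
    variants = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(";")
        if len(fields) < 4:
            continue  # e.g. section keywords such as CONTRACTION or DEFAULT
        tag = fields[3].strip()
        if tag.startswith("<sort>"):
            variants[fields[0]] = tag[len("<sort>"):].split()
    return variants


# Lines copied from the new side of the hunk.
SAMPLE = [
    "DEFAULT",
    "",
    "113CE;TULU-TIGALARI SIGN VIRAMA;Mn;;;;;;",
    "113CF;TULU-TIGALARI SIGN LOOPED VIRAMA;Mc;<sort> 113CE;;;;;",
    "113D0;TULU-TIGALARI CONJOINER;Mn;;;;;;",
]

if __name__ == "__main__":
    print(sort_variants(SAMPLE))
```

Run against the old side of the hunk, the same function would return an empty mapping, since 113CF had no tag before this commit.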