Commit

Merge remote-tracking branch 'la-vache/main' into unihan-17
eggrobin committed Nov 16, 2024
2 parents b8d3063 + 57a3085 commit 825713f
Showing 55 changed files with 3,333 additions and 1,586 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/cache_retain.yml
@@ -30,6 +30,10 @@ jobs:
  retain-maven-cache:
    name: Run all tests with Maven
    runs-on: ubuntu-latest
    # Only run this on the upstream repo. Otherwise, running in a personal fork will cause
    # Github to disable the personal fork copy of the workflow
    # (Github complains about running a scheduled workflow on a repo with > 60 days of inactivity)
    if: github.ref == 'refs/heads/main' && github.repository == 'unicode-org/unicodetools'
    steps:
      - name: Checkout and setup
        uses: actions/checkout@v2
3 changes: 3 additions & 0 deletions unicodetools/data/ucd/dev/ArabicShaping.txt
@@ -482,6 +482,7 @@
088C; TAH WITH 3 DOTS BELOW; D; TAH
088D; KEHEH WITH VERTICAL 2 DOTS BELOW; D; GAF
088E; VERTICAL TAIL; R; VERTICAL TAIL
088F; DOTLESS NOON WITH SEPARATE RING ABOVE; D; NOON
0890; ARABIC POUND MARK ABOVE; U; No_Joining_Group
0891; ARABIC PIASTRE MARK ABOVE; U; No_Joining_Group

@@ -850,6 +851,8 @@ A873; PHAGS-PA CANDRABINDU; U; No_Joining_Group
10EC2; DAL WITH VERTICAL 2 DOTS BELOW; R; DAL
10EC3; TAH WITH VERTICAL 2 DOTS BELOW; D; TAH
10EC4; KAF WITH VERTICAL 2 DOTS BELOW; D; KAF
10EC6; THIN NOON; D; THIN NOON
10EC7; DOTLESS YEH WITH 4 DOTS BELOW; D; YEH

# Sogdian Characters

8 changes: 8 additions & 0 deletions unicodetools/data/ucd/dev/Blocks.txt
@@ -228,6 +228,7 @@ FFF0..FFFF; Specials
108E0..108FF; Hatran
10900..1091F; Phoenician
10920..1093F; Lydian
10940..1095C; Sidetic
10980..1099F; Meroitic Hieroglyphs
109A0..109FF; Meroitic Cursive
10A00..10A5F; Kharoshthi
@@ -279,11 +280,13 @@ FFF0..FFFF; Specials
11AB0..11ABF; Unified Canadian Aboriginal Syllabics Extended-A
11AC0..11AFF; Pau Cin Hau
11B00..11B5F; Devanagari Extended-A
11B60..11B7F; Sharada Supplement
11BC0..11BFF; Sunuwar
11C00..11C6F; Bhaiksuki
11C70..11CBF; Marchen
11D00..11D5F; Masaram Gondi
11D60..11DAF; Gunjala Gondi
11DB0..11DEF; Tolong Siki
11EE0..11EFF; Makasar
11F00..11F5F; Kawi
11FB0..11FBF; Lisu Supplement
@@ -302,14 +305,17 @@ FFF0..FFFF; Specials
16A70..16ACF; Tangsa
16AD0..16AFF; Bassa Vah
16B00..16B8F; Pahawh Hmong
16EA0..16EDF; Beria Erfe
16D40..16D7F; Kirat Rai
16D80..16DAF; Chisoi
16E40..16E9F; Medefaidrin
16F00..16F9F; Miao
16FE0..16FFF; Ideographic Symbols and Punctuation
17000..187FF; Tangut
18800..18AFF; Tangut Components
18B00..18CFF; Khitan Small Script
18D00..18D7F; Tangut Supplement
18D80..18DFF; Tangut Components Supplement
1AFF0..1AFFF; Kana Extended-B
1B000..1B0FF; Kana Supplement
1B100..1B12F; Kana Extended-A
@@ -318,6 +324,7 @@ FFF0..FFFF; Specials
1BC00..1BC9F; Duployan
1BCA0..1BCAF; Shorthand Format Controls
1CC00..1CEBF; Symbols for Legacy Computing Supplement
1CEC0..1CEFF; Miscellaneous Symbols Supplement
1CF00..1CFCF; Znamenny Musical Notation
1D000..1D0FF; Byzantine Musical Symbols
1D100..1D1FF; Musical Symbols
@@ -336,6 +343,7 @@ FFF0..FFFF; Specials
1E2C0..1E2FF; Wancho
1E4D0..1E4FF; Nag Mundari
1E5D0..1E5FF; Ol Onal
1E6C0..1E6FF; Tai Yo
1E7E0..1E7FF; Ethiopic Extended-B
1E800..1E8DF; Mende Kikakui
1E900..1E95F; Adlam
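
The `start..end; Block Name` lines above are straightforward to consume programmatically. The following is a minimal sketch, not part of unicodetools: the helper names `parse_blocks`, `block_of`, and `out_of_order` are hypothetical, and the sample uses five lines copied from the hunks above in the order they appear there. It looks up the block for a code point and reports where the ascending order of ranges breaks.

```python
def parse_blocks(lines):
    """Parse 'START..END; Name' lines into (start, end, name) tuples."""
    blocks = []
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not data:
            continue
        range_part, name = (part.strip() for part in data.split(";", 1))
        start, end = (int(cp, 16) for cp in range_part.split(".."))
        blocks.append((start, end, name))
    return blocks


def block_of(cp, blocks):
    """Return the block name for a code point, or 'No_Block' if unassigned."""
    for start, end, name in blocks:
        if start <= cp <= end:
            return name
    return "No_Block"


def out_of_order(blocks):
    """Return names of blocks that start at or before the end of the previous entry."""
    return [cur[2] for prev, cur in zip(blocks, blocks[1:]) if cur[0] <= prev[1]]


# Sample lines in the same order as the hunk above.
SAMPLE = [
    "16B00..16B8F; Pahawh Hmong",
    "16EA0..16EDF; Beria Erfe",
    "16D40..16D7F; Kirat Rai",
    "16D80..16DAF; Chisoi",
    "16E40..16E9F; Medefaidrin",
]

blocks = parse_blocks(SAMPLE)
print(block_of(0x16EA0, blocks))
print(out_of_order(blocks))
```

Each name returned by `out_of_order` marks a position where a range begins before the preceding entry ends, which is one way to spot an entry listed ahead of lower ranges.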
32 changes: 30 additions & 2 deletions unicodetools/data/ucd/dev/CaseFolding.txt
@@ -1,5 +1,5 @@
# CaseFolding-16.0.0.txt
# Date: 2024-04-30, 21:48:11 GMT
# CaseFolding-17.0.0.txt
# Date: 2024-11-14, 20:19:39 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1243,7 +1243,10 @@ A7C7; C; A7C8; # LATIN CAPITAL LETTER D WITH SHORT STROKE OVERLAY
A7C9; C; A7CA; # LATIN CAPITAL LETTER S WITH SHORT STROKE OVERLAY
A7CB; C; 0264; # LATIN CAPITAL LETTER RAMS HORN
A7CC; C; A7CD; # LATIN CAPITAL LETTER S WITH DIAGONAL STROKE
A7CE; C; A7CF; # LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE
A7D0; C; A7D1; # LATIN CAPITAL LETTER CLOSED INSULAR G
A7D2; C; A7D3; # LATIN CAPITAL LETTER DOUBLE THORN
A7D4; C; A7D5; # LATIN CAPITAL LETTER DOUBLE WYNN
A7D6; C; A7D7; # LATIN CAPITAL LETTER MIDDLE SCOTS S
A7D8; C; A7D9; # LATIN CAPITAL LETTER SIGMOID S
A7DA; C; A7DB; # LATIN CAPITAL LETTER LAMBDA
@@ -1616,6 +1619,31 @@ FF3A; C; FF5A; # FULLWIDTH LATIN CAPITAL LETTER Z
16E5D; C; 16E7D; # MEDEFAIDRIN CAPITAL LETTER O
16E5E; C; 16E7E; # MEDEFAIDRIN CAPITAL LETTER AI
16E5F; C; 16E7F; # MEDEFAIDRIN CAPITAL LETTER Y
16EA0; C; 16EBB; # BERIA ERFE CAPITAL LETTER ARKAB
16EA1; C; 16EBC; # BERIA ERFE CAPITAL LETTER BASIGNA
16EA2; C; 16EBD; # BERIA ERFE CAPITAL LETTER DARBAI
16EA3; C; 16EBE; # BERIA ERFE CAPITAL LETTER EH
16EA4; C; 16EBF; # BERIA ERFE CAPITAL LETTER FITKO
16EA5; C; 16EC0; # BERIA ERFE CAPITAL LETTER GOWAY
16EA6; C; 16EC1; # BERIA ERFE CAPITAL LETTER HIRDEABO
16EA7; C; 16EC2; # BERIA ERFE CAPITAL LETTER I
16EA8; C; 16EC3; # BERIA ERFE CAPITAL LETTER DJAI
16EA9; C; 16EC4; # BERIA ERFE CAPITAL LETTER KOBO
16EAA; C; 16EC5; # BERIA ERFE CAPITAL LETTER LAKKO
16EAB; C; 16EC6; # BERIA ERFE CAPITAL LETTER MERI
16EAC; C; 16EC7; # BERIA ERFE CAPITAL LETTER NINI
16EAD; C; 16EC8; # BERIA ERFE CAPITAL LETTER GNA
16EAE; C; 16EC9; # BERIA ERFE CAPITAL LETTER NGAY
16EAF; C; 16ECA; # BERIA ERFE CAPITAL LETTER OI
16EB0; C; 16ECB; # BERIA ERFE CAPITAL LETTER PI
16EB1; C; 16ECC; # BERIA ERFE CAPITAL LETTER ERIGO
16EB2; C; 16ECD; # BERIA ERFE CAPITAL LETTER ERIGO TAMURA
16EB3; C; 16ECE; # BERIA ERFE CAPITAL LETTER SERI
16EB4; C; 16ECF; # BERIA ERFE CAPITAL LETTER SHEP
16EB5; C; 16ED0; # BERIA ERFE CAPITAL LETTER TATASOUE
16EB6; C; 16ED1; # BERIA ERFE CAPITAL LETTER UI
16EB7; C; 16ED2; # BERIA ERFE CAPITAL LETTER WASSE
16EB8; C; 16ED3; # BERIA ERFE CAPITAL LETTER AY
1E900; C; 1E922; # ADLAM CAPITAL LETTER ALIF
1E901; C; 1E923; # ADLAM CAPITAL LETTER DAALI
1E902; C; 1E924; # ADLAM CAPITAL LETTER LAAM
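
Each line above has the form `code; status; mapping; # name`, and simple case folding uses the entries with status `C` or `S`. The following is a minimal sketch of reading such lines, not the unicodetools implementation: `parse_case_folding` and `simple_fold` are hypothetical helper names, and the sample contains only three lines copied from the hunks above.

```python
def parse_case_folding(lines):
    """Parse 'code; status; mapping; # name' lines into {code: (status, [code points])}."""
    folds = {}
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop the trailing name comment
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        code, status, mapping = fields[0], fields[1], fields[2]
        folds[int(code, 16)] = (status, [int(cp, 16) for cp in mapping.split()])
    return folds


def simple_fold(cp, folds):
    """Apply simple case folding: use C and S entries, leave everything else unchanged."""
    entry = folds.get(cp)
    if entry is not None and entry[0] in ("C", "S"):
        return entry[1][0]
    return cp


SAMPLE = [
    "A7CE; C; A7CF; # LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE",
    "16EA0; C; 16EBB; # BERIA ERFE CAPITAL LETTER ARKAB",
    "16EB8; C; 16ED3; # BERIA ERFE CAPITAL LETTER AY",
]

folds = parse_case_folding(SAMPLE)
print(hex(simple_fold(0x16EA0, folds)))
# In these sample lines the Beria Erfe capitals fold to the code point 0x1B higher.
print(hex(simple_fold(0x16EB8, folds) - 0x16EB8))
```

Code points without an entry fold to themselves, so the helper can be applied to arbitrary input.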
56 changes: 51 additions & 5 deletions unicodetools/data/ucd/dev/DerivedAge.txt
@@ -1,5 +1,5 @@
# DerivedAge-17.0.0.txt

Check failure on line 1 in unicodetools/data/ucd/dev/DerivedAge.txt

GitHub Actions / Check UCD consistency, invariants, smoke-test generators

File must be regenerated

Run org.unicode.text.UCD.Main build MakeUnicodeFiles and copy any changed files to unicodetools/data/ucd/dev.

# Date: 2024-11-14, 15:47:44 GMT
# Date: 2024-11-16, 02:24:30 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2065,9 +2065,55 @@ A7DA..A7DC ; 16.0 # [3] LATIN CAPITAL LETTER LAMBDA..LATIN CAPITAL LETTER L

# Newly assigned in Unicode 17.0.0 (September, 2025)

2B73A..2B73E ; 17.0 # [5] CJK UNIFIED IDEOGRAPH-2B73A..CJK UNIFIED IDEOGRAPH-2B73E
323B0..33479 ; 17.0 # [4298] CJK UNIFIED IDEOGRAPH-323B0..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 4303
088F ; 17.0 # ARABIC LETTER NOON WITH RING ABOVE
09FF ; 17.0 # BENGALI LETTER SANSKRIT BA
0B53..0B54 ; 17.0 # [2] ORIYA SIGN DOT ABOVE..ORIYA SIGN DOUBLE DOT ABOVE
0C5C ; 17.0 # TELUGU ARCHAIC SHRII
0CDC ; 17.0 # KANNADA ARCHAIC SHRII
1ACF..1ADD ; 17.0 # [15] COMBINING DOUBLE CARON..COMBINING DOT-AND-RING BELOW
1AE0..1AEB ; 17.0 # [12] COMBINING LEFT TACK ABOVE..COMBINING DOUBLE RIGHTWARDS ARROW ABOVE
2B96 ; 17.0 # EQUALS SIGN WITH INFINITY ABOVE
A7CE..A7CF ; 17.0 # [2] LATIN CAPITAL LETTER PHARYNGEAL VOICED FRICATIVE..LATIN SMALL LETTER PHARYNGEAL VOICED FRICATIVE
A7D2 ; 17.0 # LATIN CAPITAL LETTER DOUBLE THORN
A7D4 ; 17.0 # LATIN CAPITAL LETTER DOUBLE WYNN
A7F1 ; 17.0 # MODIFIER LETTER CAPITAL S
FBC3..FBD2 ; 17.0 # [16] ARABIC LIGATURE JALLA WA-ALAA..ARABIC LIGATURE ALAYHI AR-RAHMAH
FD90..FD91 ; 17.0 # [2] ARABIC LIGATURE RAHMATU ALLAAHI ALAYH..ARABIC LIGATURE RAHMATU ALLAAHI ALAYHAA
FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIGATURE KARRAMA ALLAAHU WAJHAH
10940..1095C ; 17.0 # [29] SIDETIC LETTER N01..SIDETIC LETTER N29
10EC5..10EC7 ; 17.0 # [3] ARABIC SMALL YEH BARREE WITH TWO DOTS BELOW..ARABIC LETTER YEH WITH FOUR DOTS BELOW
10ED0..10ED8 ; 17.0 # [9] ARABIC BIBLICAL END OF VERSE..ARABIC LIGATURE NAWWARA ALLAAHU MARQADAH
10EFA..10EFB ; 17.0 # [2] ARABIC DOUBLE VERTICAL BAR BELOW..ARABIC SMALL LOW NOON
11B60..11B67 ; 17.0 # [8] SHARADA VOWEL SIGN OE..SHARADA VOWEL SIGN CANDRA O
11DB0..11DDB ; 17.0 # [44] TOLONG SIKI LETTER I..TOLONG SIKI UNGGA
11DE0..11DE9 ; 17.0 # [10] TOLONG SIKI DIGIT ZERO..TOLONG SIKI DIGIT NINE
16D80..16D9D ; 17.0 # [30] CHISOI LETTER A..CHISOI SIGN SISO
16DA0..16DA9 ; 17.0 # [10] CHISOI DIGIT ZERO..CHISOI DIGIT NINE
16EA0..16EB8 ; 17.0 # [25] BERIA ERFE CAPITAL LETTER ARKAB..BERIA ERFE CAPITAL LETTER AY
16EBB..16ED3 ; 17.0 # [25] BERIA ERFE SMALL LETTER ARKAB..BERIA ERFE SMALL LETTER AY
16FF2..16FF6 ; 17.0 # [5] CHINESE SMALL SIMPLIFIED ER..YANGQIN SIGN SLOW TWO BEATS
187F8..187FF ; 17.0 # [8] TANGUT IDEOGRAPH-187F8..TANGUT IDEOGRAPH-187FF
18D09..18D1E ; 17.0 # [22] TANGUT IDEOGRAPH-18D09..TANGUT IDEOGRAPH-18D1E
18D80..18DF2 ; 17.0 # [115] TANGUT COMPONENT-769..TANGUT COMPONENT-883
1CCFA..1CCFC ; 17.0 # [3] SNAKE SYMBOL..NOSE SYMBOL
1CEBA..1CED0 ; 17.0 # [23] FRAGILE SYMBOL..LEUKOTHEA
1CEE0..1CEF0 ; 17.0 # [17] GEOMANTIC FIGURE POPULUS..MEDIUM SMALL WHITE CIRCLE WITH HORIZONTAL BAR
1E6C0..1E6DE ; 17.0 # [31] TAI YO LETTER LOW KO..TAI YO LETTER HIGH KVO
1E6E0..1E6F5 ; 17.0 # [22] TAI YO LETTER AA..TAI YO SIGN OM
1E6FE..1E6FF ; 17.0 # [2] TAI YO SYMBOL MUEANG..TAI YO XAM LAI
1F6D8 ; 17.0 # LANDSLIDE
1F777..1F77A ; 17.0 # [4] VESTA FORM TWO..PARTHENOPE FORM TWO
1F8D0..1F8D8 ; 17.0 # [9] LONG RIGHTWARDS ARROW OVER LONG LEFTWARDS ARROW..LONG LEFT RIGHT ARROW WITH DEPENDENT LOBE
1FA54..1FA57 ; 17.0 # [4] WHITE CHESS FERZ..BLACK CHESS ALFIL
1FA8A ; 17.0 # TROMBONE
1FA8E ; 17.0 # TREASURE CHEST
1FAC8 ; 17.0 # HAIRY CREATURE
1FACD ; 17.0 # ORCA
1FADD ; 17.0 # APPLE CORE
1FAEA ; 17.0 # DISTORTED FACE
1FAEF ; 17.0 # FIGHT CLOUD
1FBFA ; 17.0 # ALARM BELL SYMBOL

# Total code points: 533

# EOF
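
The bracketed counts and the `# Total code points` line can be cross-checked against the ranges themselves. The following is a minimal sketch under stated assumptions: `count_code_points` and `LINE_RE` are hypothetical names, not part of unicodetools, and the sample holds only three lines copied from the hunk above, so its total is the sum for those three lines rather than the file total.

```python
import re

# Matches 'XXXX ; age # name' and 'XXXX..YYYY ; age # [N] name..name'.
LINE_RE = re.compile(
    r"^(?P<start>[0-9A-F]{4,6})(?:\.\.(?P<end>[0-9A-F]{4,6}))?"
    r"\s*;\s*(?P<age>[\d.]+)\s*#\s*(?:\[(?P<count>\d+)\])?"
)


def count_code_points(lines, age):
    """Sum the sizes of all ranges with the given age, checking any bracketed count."""
    total = 0
    for line in lines:
        match = LINE_RE.match(line)
        if match is None or match.group("age") != age:
            continue  # comments, blank lines, or other ages
        start = int(match.group("start"), 16)
        end = int(match.group("end") or match.group("start"), 16)
        size = end - start + 1
        declared = match.group("count")
        if declared is not None and int(declared) != size:
            raise ValueError(f"declared count {declared} != computed {size}: {line}")
        total += size
    return total


SAMPLE = [
    "088F          ; 17.0 #       ARABIC LETTER NOON WITH RING ABOVE",
    "1ACF..1ADD    ; 17.0 #  [15] COMBINING DOUBLE CARON..COMBINING DOT-AND-RING BELOW",
    "18D80..18DF2  ; 17.0 # [115] TANGUT COMPONENT-769..TANGUT COMPONENT-883",
    "# Total code points: 533",
]

print(count_code_points(SAMPLE, "17.0"))
```

Run over the full list of 17.0 lines, the same function gives a number to compare with the `# Total code points` comment.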