unicode-data-names provides Haskell APIs to efficiently access the Unicode
character names and aliases from the Unicode character database.
There are three APIs:
- String API: enabled by default.
- ByteString API: enabled via the package flag has-bytestring.
- Text API: enabled via the package flag has-text.
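As a rough illustration of the default String API, the sketch below looks up the name of a character. It assumes the module `Unicode.Char.General.Names` and its function `name :: Char -> Maybe String`; please check the Haddock documentation of the version you use, as the exact exports may differ. The helper `describe` and the placeholder `<no name>` are hypothetical and not part of the library.

```haskell
module Main (main) where

import Data.Char (ord)
import Data.Maybe (fromMaybe)
import Text.Printf (printf)
-- Assumed module name for the String API; verify against the Haddock docs.
import qualified Unicode.Char.General.Names as Names

-- | Hypothetical helper: render a character as its code point followed by
-- its Unicode name. Characters for which the lookup returns Nothing get a
-- placeholder instead.
describe :: Char -> String
describe c = printf "U+%04X %s" (ord c) (fromMaybe "<no name>" (Names.name c))

main :: IO ()
main = mapM_ (putStrLn . describe) "a\955"
```

The lookup returns a `Maybe` because not every code point has a name, so callers decide how to handle the missing case rather than receiving an empty string.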
The Haskell data structures are generated programmatically from the
Unicode character database (UCD) files. The latest Unicode version
supported by this library is 15.1.0.
Please see the Haddock documentation for the API reference.
We can compare the implementation against ICU. This requires working with the
source repository, as we need the internal package icu.
Warning: An ICU version with the exact same Unicode version is required.
cabal run -O2 --flag dev-has-icu unicode-data-names:tests -- -m ICU
To check the Haskell implementation of Unicode, we compare its results with those obtained with Python.
Warning: A Python version with the exact same Unicode version is required.
cabal run -O2 -f "export-all-chars" -v0 export-all-chars > ./test/all_chars.csv
python3 ./test/check.py -v ./test/all_chars.csv
unicode-data-names is an open source project available under a liberal
Apache-2.0 license. As an open project we welcome contributions.