Can we replace unicode_norm.rs with the unicode_norm crate?
#14
Comments
Yeah HarfBuzz needs the 1:2 decomposition, which some libraries don't expose. It would be easier to add it to the
My plan here is to just use icu4x which already has the low level composition functions (seemingly added in anticipation of supporting HarfBuzz :)
I think having an alternative to ICU would be nice, since that's a YUGE crate IIUC.
No disagreement from me. One thing I’ve considered is adding a build script that pulls in the icu4x crates and extracts the necessary properties into a compact data structure. This would be a nice option for a standalone shaper for users who are not already consuming the icu4x crates.
Or do what everyone else does and roll your own Python code to read the UCD data and spew out code. Given HB uses this: https://github.com/harfbuzz/harfbuzz/blob/main/src/gen-ucd-table.py and that mostly uses packTab to pack tables, and I've started adding Rust output to it: looks like you might get a replacement for free.
We already have that, no? 😄 https://github.com/harfbuzz/harfruzz/blob/main/scripts/gen-unicode-norm-table.py Although this one is not using packTab yet.
My primary concern is that I’d like to avoid pulling in a bunch of arbitrary I’m 100% on board with bundling our own UCD data and I don’t have strong feelings on whether this is generated with Rust or Python. However, since Chrome (and the various Linebender projects) are planning on using icu4x for other things, it would be nice to feature gate our bundled blobs and allow external implementations to avoid duplication. I suppose we just need HB style unicode funcs :)
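To make the "HB style unicode funcs" idea concrete, here is a rough sketch of what such an abstraction might look like in Rust. Everything in it is an assumption for illustration: the trait name `UnicodeFuncs`, the `BundledFuncs` type, the feature name mentioned in the comment, and the one-entry table are all hypothetical and are not existing harfruzz API. It only mirrors the shape of HarfBuzz's compose/decompose callbacks.

```rust
/// Hypothetical trait mirroring the shape of HarfBuzz's
/// compose/decompose unicode callbacks. Not an existing harfruzz API.
pub trait UnicodeFuncs {
    /// Compose `a` and `b` into a single character, if a composition exists.
    fn compose(&self, a: char, b: char) -> Option<char>;
    /// Decompose `ab` into exactly two characters (one step), if possible.
    fn decompose(&self, ab: char) -> Option<(char, char)>;
}

/// Stand-in for an implementation backed by bundled UCD tables.
/// In a real crate this could sit behind a Cargo feature, for example
/// `#[cfg(feature = "bundled-ucd")]` (hypothetical feature name), so that
/// users who already ship icu4x can supply their own implementation.
pub struct BundledFuncs;

impl UnicodeFuncs for BundledFuncs {
    fn compose(&self, a: char, b: char) -> Option<char> {
        // Toy table with a single entry: a + U+0302 -> U+00E2.
        match (a, b) {
            ('a', '\u{0302}') => Some('\u{00E2}'),
            _ => None,
        }
    }

    fn decompose(&self, ab: char) -> Option<(char, char)> {
        // Inverse of the toy entry above.
        match ab {
            '\u{00E2}' => Some(('a', '\u{0302}')),
            _ => None,
        }
    }
}

/// Shaper-side code depends only on the trait, not on where the data lives.
/// Returns true if `c` decomposes and the pair composes back to `c`.
pub fn roundtrips<F: UnicodeFuncs>(funcs: &F, c: char) -> bool {
    match funcs.decompose(c) {
        Some((a, b)) => funcs.compose(a, b) == Some(c),
        None => false,
    }
}
```

With this shape, `roundtrips(&BundledFuncs, '\u{00E2}')` goes through the bundled table, while an icu4x-backed type implementing the same trait could be passed in instead without duplicating data.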
I've attempted to do this in rustybuzz before, and the reason why I didn't end up pursuing this idea further is that, from what I gathered, the unicode_norm crate always decomposes a character as much as possible, while in harfbuzz (and currently in rustybuzz), we have a decomposition table that always decomposes it into exactly two components.
Not sure if that makes any difference in the end, but since rustybuzz should stay as similar to harfbuzz as possible, I didn't actually try it. Maybe we can try it for harfruzz, though?
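To illustrate the difference between the two behaviours, here is a minimal sketch using a hand-written two-entry table. The function names `decompose_pair` and `decompose_full` are hypothetical and are not taken from harfbuzz, rustybuzz, or any crate. The two mappings are the canonical decompositions of U+1EA5 and U+00E2.

```rust
/// One-step (1:2) decomposition: returns exactly two components or nothing.
/// Toy table with two entries only, for illustration.
fn decompose_pair(c: char) -> Option<(char, char)> {
    match c {
        // U+1EA5 (a with circumflex and acute) -> U+00E2 + combining acute
        '\u{1EA5}' => Some(('\u{00E2}', '\u{0301}')),
        // U+00E2 (a with circumflex) -> 'a' + combining circumflex
        '\u{00E2}' => Some(('a', '\u{0302}')),
        _ => None,
    }
}

/// Full decomposition: apply the one-step table recursively until
/// nothing decomposes any further, appending the result to `out`.
fn decompose_full(c: char, out: &mut Vec<char>) {
    match decompose_pair(c) {
        Some((a, b)) => {
            decompose_full(a, out);
            decompose_full(b, out);
        }
        None => out.push(c),
    }
}
```

For U+1EA5, the one-step table yields the pair (U+00E2, U+0301), whereas full decomposition yields three code points: 'a', U+0302, U+0301. A full decomposition can be derived from the 1:2 table by recursion as above, but the 1:2 pairs cannot be read back directly from a fully decomposed sequence.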