Implementation considerations #4

nciric · 2024-02-28T19:39:27Z

While we aim for a unified API solution in #3, we should also recognize that different companies may have partial and proprietary solutions they would like to reuse.

ICU already has prior art solving the problem, e.g. transliterator and break iterator.

There are two cases we should consider:

1. The user is fine with the defaults and uses the code and lexicons provided by the inflection project.
2. The user has a better implementation for a set of languages and overrides the defaults for those languages only. The rest fall back on the inflection defaults. User solutions can range from pure lexicon lookup and heuristics to ML models for more complex cases.

Inflection code shouldn't depend on user libraries, but it should provide registration APIs where users can hook up their implementations to be used with our APIs.
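To make the second case concrete, here is a minimal sketch of a per-language registration with fallback to defaults. It is an illustration only, not the project's API: `InflectorRegistry`, `register`, `unregister`, `default_inflector`, and the `(word, features)` signature are all hypothetical names chosen for this example, and the "default" is a toy English plural rule.

```python
from typing import Callable, Dict

# Hypothetical signature: (word, target features) -> inflected word.
Inflector = Callable[[str, Dict[str, str]], str]


def default_inflector(word: str, features: Dict[str, str]) -> str:
    """Stand-in for the project's built-in behavior (toy rule: append 's')."""
    if features.get("number") == "plural":
        return word + "s"
    return word


class InflectorRegistry:
    """Maps language codes to user-supplied inflectors, with a default fallback."""

    def __init__(self, default: Inflector) -> None:
        self._default = default
        self._overrides: Dict[str, Inflector] = {}

    def register(self, language: str, inflector: Inflector) -> None:
        # The user implementation is only referenced here; the registry
        # itself never imports or depends on user libraries.
        self._overrides[language] = inflector

    def unregister(self, language: str) -> None:
        self._overrides.pop(language, None)

    def inflect(self, language: str, word: str, features: Dict[str, str]) -> str:
        # Use the override for this language if one exists, else the default.
        inflector = self._overrides.get(language, self._default)
        return inflector(word, features)


def lexicon_inflector(word: str, features: Dict[str, str]) -> str:
    """Hypothetical user override: a pure lexicon lookup for Serbian."""
    lexicon = {("knjiga", "plural"): "knjige"}
    return lexicon.get((word, features.get("number", "")), word)


registry = InflectorRegistry(default_inflector)
registry.register("sr", lexicon_inflector)

print(registry.inflect("sr", "knjiga", {"number": "plural"}))  # user override
print(registry.inflect("en", "book", {"number": "plural"}))    # default fallback
```

The override could just as well wrap heuristics or an ML model; the registry only needs a callable with the agreed signature.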

grhoten · 2024-12-10T18:36:41Z

The SemanticFeatureModel of the code in pull request #35 does allow replacing an inflection engine on a per-instance basis. It doesn't do a global registration. Replacing the default globally would require consideration of memory ownership, framework reuse, data reloading, thread safety, and competing implementations in the same process space.

From experience, the transliterator model is nice when you own everything in a single process space. As soon as you try global registration, this model becomes hard to coordinate with other frameworks that want neither your default behavior nor for it to stick around after a language change.
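As a rough illustration of the per-instance approach, the sketch below injects an engine into each model instance instead of registering it globally. This is not the actual SemanticFeatureModel API from pull request #35: `FeatureModel`, `builtin_engine`, and `shouting_engine` are hypothetical names, and the engines are placeholders.

```python
from typing import Callable, Dict, Optional

# Hypothetical engine signature: (word, target features) -> inflected word.
Engine = Callable[[str, Dict[str, str]], str]


def builtin_engine(word: str, features: Dict[str, str]) -> str:
    """Placeholder for the default engine; returns the word unchanged."""
    return word


class FeatureModel:
    """Hypothetical model that owns its engine instead of using a global one."""

    def __init__(self, language: str, engine: Optional[Engine] = None) -> None:
        self.language = language
        # Each instance keeps its own engine; nothing is shared process-wide.
        self._engine = engine if engine is not None else builtin_engine

    def inflect(self, word: str, features: Dict[str, str]) -> str:
        return self._engine(word, features)


def shouting_engine(word: str, features: Dict[str, str]) -> str:
    """Toy replacement engine, used only to show the override taking effect."""
    return word.upper()


# Two frameworks in the same process, each with its own model instance.
framework_a = FeatureModel("en", engine=shouting_engine)
framework_b = FeatureModel("en")  # keeps the default engine

print(framework_a.inflect("book", {}))
print(framework_b.inflect("book", {}))
```

Because the replacement lives and dies with the instance, there is no shared state to lock, and nothing lingers after a language change once the instance is dropped. A global registry would have to solve both of those explicitly.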

When implementing these considerations, security around who is able to intercept this content should be considered too. There may be sensitive data that goes through the inflection process, so how to handle logging of that data, or even a person-in-the-middle attack, should be considered.

Typically, simple customizations should be handled with a SemanticConcept. That would address case 1, which is the common case. Case 2 is a more complex consideration that requires much more thought.

nciric added the discuss Discussion item label Feb 28, 2024
