Diacritic folding for find bars #28

Open
1ec5 opened this issue Jan 28, 2014 · 7 comments
Owner

1ec5 commented Jan 28, 2014

Firefox introduced event-based extension hooks to the find bar so that pdf.js can search PDFs. It would be really neat if AVIM could customize in-page find to ignore diacritics until diacritics are added to the search terms. The find engine would probably involve querying for text nodes that match a certain regular expression.

Owner Author

1ec5 commented Feb 9, 2014

The extension hooks assume you’ll handle everything (such as highlighting) yourself, so the code would be rather involved. I need to look at whether XUL/Migemo does something simpler.

Owner Author

1ec5 commented Aug 30, 2014

piroor/xulmigemo isn’t exactly simple, but perhaps I can distill it to just the functionality needed for Vietnamese.

1ec5 self-assigned this Jan 2, 2016

Owner Author

1ec5 commented Jan 3, 2016

I’m getting closer to a working implementation using a NodeIterator and a function that replaces each vowel with a character class that represents all its precomposed variants. Naturally, NodeIterator isn’t quite as fast as the native nsIFind implementation, but the performance hit is much less noticeable in e10s windows. Once this is done, I’d like to contribute something along these lines to FindBar Tweak to fix Quicksaver/FindBar-Tweak#56.
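
A rough sketch of the approach described above, for illustration only. The names `TONES`, `SHAPES`, `vowelClass`, `foldedPattern`, and `findMatches` are hypothetical and not taken from AVIM's source; the vowel table is built with `String.prototype.normalize` instead of being listed by hand, and `findMatches` assumes `root` is an element such as `document.body` in a browser context.

```javascript
// Sketch only: names and tables here are illustrative, not AVIM's actual code.
// Tone marks: none, grave, acute, hook above, tilde, dot below.
const TONES = ["", "\u0300", "\u0301", "\u0309", "\u0303", "\u0323"];
// Vowel shapes: breve (U+0306), circumflex (U+0302), horn (U+031B).
const SHAPES = {
  a: ["", "\u0306", "\u0302"],
  e: ["", "\u0302"],
  i: [""],
  o: ["", "\u0302", "\u031B"],
  u: ["", "\u031B"],
  y: [""],
};

// Build a character class of all precomposed variants of one plain vowel.
function vowelClass(base) {
  let variants = "";
  for (const shape of SHAPES[base]) {
    for (const tone of TONES) {
      variants += (base + shape + tone).normalize("NFC");
    }
  }
  return "[" + variants + "]";
}

// Escape regex metacharacters, then widen each plain vowel to its class.
// Vowels that already carry diacritics stay literal.
function foldedPattern(query) {
  const escaped = query.toLowerCase().replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(escaped.replace(/[aeiouy]/g, (v) => vowelClass(v)), "gi");
}

// Browser-only: collect [textNode, index, length] for every match under root.
function findMatches(root, pattern) {
  const matches = [];
  const iterator = root.ownerDocument.createNodeIterator(root, NodeFilter.SHOW_TEXT);
  let node;
  while ((node = iterator.nextNode())) {
    pattern.lastIndex = 0;
    let match;
    while ((match = pattern.exec(node.data))) {
      if (match[0] === "") { pattern.lastIndex++; continue; } // avoid looping on empty matches
      matches.push([node, match.index, match[0].length]);
    }
  }
  return matches;
}
```

This version only matches within a single text node, so it would not find a term that spans an inline tag boundary.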

1ec5 added a commit that referenced this issue Jan 3, 2016
Find and find next/previous are missing. Highlighting doesn’t work inside frames and iframes, and selection can get out of sync inside editors after editing.

Working towards #28.
Owner Author

1ec5 commented Jan 3, 2016

Work is continuing on the find-fold-28 branch. Highlighting is now diacritic-folded, but find and find previous/next are still unimplemented. There will also need to be UI to disable this feature, in case the user doesn’t want diacritic folding or is using a potentially incompatible extension like Migemo or FindBar Tweak.

Owner Author

1ec5 commented Jan 4, 2016

Current status:

  • Find (selection + scrolling)
  • Find previous/next
  • Case sensitivity
  • Whole word (hidden preference – does it matter?)
  • Count matches
  • Highlight all
  • Editor observers, for removing highlights when text is edited
  • Quick Find (both / and ')
  • Match across inline tag boundaries (for example when part of the candidate string is in a <b> tag)
  • Make pdf.js fold diacritics too
  • Add UI for toggling diacritic folding
  • Conflate traditional and reformed diacritics (xóa versus xoá)
  • Conflate NFC and NFD (precomposed characters versus combining diacritics; ễ versus e + U+0302 + U+0303)
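
The last two items could be approached roughly as follows. This is a sketch under assumptions, not the code on the branch: `foldForm` and `toneKey` are hypothetical names, and `toneKey` assumes it is given a single syllable carrying at most one tone mark.

```javascript
// Sketch only: helper names are illustrative.
// Vietnamese tone marks: grave, acute, tilde, hook above, dot below.
const TONE_MARKS = /[\u0300\u0301\u0303\u0309\u0323]/g;

// Conflate NFC and NFD by normalizing both page text and query to NFC.
function foldForm(text) {
  return text.normalize("NFC");
}

// Conflate tone placement: pull the tone mark out of the syllable and
// append it at the end, so the vowel it sat on no longer matters.
function toneKey(syllable) {
  const decomposed = syllable.normalize("NFD");
  const tones = decomposed.match(TONE_MARKS) || [];
  const toneless = decomposed.replace(TONE_MARKS, "").normalize("NFC");
  return toneless + tones.join("");
}
```

With these helpers, `toneKey("xóa")` and `toneKey("xoá")` produce the same key, while `toneKey("xoa")` differs. Normalizing changes string offsets, so a real implementation would still need to map match positions back to the original text nodes.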

And of course, I just realized that there’s been movement on bug 202251 within the past few months. If Firefox gains built-in diacritic folding, all this work could be moot, and I can move on to #56 and maybe reuse the finder script for a regular expression find extension. 😂

But it all depends on whether it handles the case where a search for ă should match ă but not a. WebKit and Chromium get this wrong, because they strip all diacritics from both source and query strings for comparison purposes, as the current patch in 202251 does. This comment indicates that Mozilla is at least aware of the need for more nuanced folding.
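
To make the distinction concrete, here is a hedged sketch contrasting the strip-everything comparison with one-directional folding. It is not the patch in bug 202251 or AVIM's code; `stripAll`, `nuancedClass`, and `nuancedPattern` are hypothetical names, and the rule assumed here is that a query character matches any Vietnamese variant of the same base letter that carries at least the marks the user typed.

```javascript
// Sketch only: names and the matching rule are assumptions for illustration.
const TONES = ["", "\u0300", "\u0301", "\u0309", "\u0303", "\u0323"];
const SHAPES = new Map([
  ["a", ["", "\u0306", "\u0302"]],
  ["e", ["", "\u0302"]],
  ["i", [""]],
  ["o", ["", "\u0302", "\u031B"]],
  ["u", ["", "\u031B"]],
  ["y", [""]],
]);

// Strip-everything folding: both sides lose all combining marks.
function stripAll(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Character class of variants that keep every mark present in the query character.
function nuancedClass(ch) {
  const nfd = ch.toLowerCase().normalize("NFD");
  const shapes = SHAPES.get(nfd[0]);
  if (!shapes) return null;
  const required = [...nfd.slice(1)];
  let variants = "";
  for (const shape of shapes) {
    for (const tone of TONES) {
      const marks = shape + tone;
      if (required.every((m) => marks.includes(m))) {
        variants += (nfd[0] + marks).normalize("NFC");
      }
    }
  }
  return variants ? "[" + variants + "]" : null;
}

function nuancedPattern(query) {
  let source = "";
  for (const ch of query.normalize("NFC")) {
    // Fall back to the escaped literal for anything outside the vowel table.
    source += nuancedClass(ch) || ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  }
  return new RegExp(source, "i");
}
```

Under `stripAll`, ă and a compare equal in both directions. With `nuancedPattern`, a query of a still matches ă, but a query of ă matches only ă and its toned forms, not a or â.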

Owner Author

1ec5 commented Jul 22, 2017

Bug 1353790 would provide a formal way for a WebExtensions-based addon to provide synonyms for searches instead of having to reinvent the find bar wheel.
