Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial support for Unicode emojis #963

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

palant
Copy link

@palant palant commented Dec 2, 2024

This is not meant to be a complete fix for #131 and #400, I didn’t want this PR to get too complex. This is only a first step in the right direction.

This makes sure that data from the emojibase-data package is preprocessed and placed into the assets/emojis directory. There is one file for each locale, with some files being identical to English because emojibase-data doesn’t have translations for Czech for example.

The getCustomEmojis function has been modified to load both custom emojis and the locale-specific Unicode ones. The callers can then handle Unicode emojis with minimal changes.

Issues explicitly not addressed here:

  • Both code and UI presentation still speak about “custom” emojis. This can be addressed in a subsequent cleanup.
  • Quality of emojibase-data translation varies. For example, the gender neutral “scientist” is translated as “Wissenschaftler(in)” to German, the female “woman scientist” as “Wissenschaftlerin”. Both get the shortcode “wissenschaftlerin” – and Phanpy is using that shortcode as key, so React will complain about duplicate keys. Probably best to address this upstream, though some solution for duplicate shortcodes might also be in order.
  • Also given the varying quality of translations, searching both translated and English emoji descriptions/shortcodes might be a good idea.
  • I see tracking of recent emojis in the code but I don’t see it working in the UI. This code probably needs adapting as well.
  • Emoji variants with different skin tones are present in the data but cannot currently be selected in the UI.
  • Emoji subgroups are present in the data, not sure whether these need to be supported.
  • Depending on OS support, some emojis currently won’t display. It’s possible to detect unsupported emojis, a better idea might be using a custom emoji font however.

Copy link

socket-security bot commented Dec 2, 2024

New dependencies detected. Learn more about Socket for GitHub ↗︎

Package New capabilities Transitives Size Publisher
npm/[email protected] None 0 44.3 MB milesj

View full report↗︎

@palant
Copy link
Author

palant commented Dec 2, 2024

I tried running Prettier and got tons of changes, none of which are related to the code I touched. I guess I ignore this test failure then?

@cheeaun
Copy link
Owner

cheeaun commented Dec 3, 2024

@palant your prettier is probably running without following the repo's config? You can run it on specific files too.

Few notes:

  • Are there screenshots or some deployed preview? Would be interesting to see how this works. (I've tried to enable automatic PR/branch deployments from Cloudflare Pages on my side, but doesn't seem to work on PRs opened by others)
  • I prefer this to be separate from Custom Emojis. It's confusing enough to combine both of them, and their short codes could clash.
  • Recent emojis only work for clicked/selected ones.

To be honest, I'm not a fan of adding an emoji picker — I personally use the OS's native emoji picker (and 3rd-party native emoji picker apps) that works across all web sites and apps. Also, if I wanted to add one, probably could use this https://github.com/nolanlawson/emoji-picker-element but the concerns are the huge file sizes (images, shortcode mappings, locale files) and maintenance work required to update it every year as the folks from Unicode Consortium keep adding more.

But I'm interested to see how this works. Preprocessed data is nicely done. If you try npm run build, could check out the bundle sizes via Sonda's report (generated sonda-report.html).

@palant
Copy link
Author

palant commented Dec 3, 2024

@palant your prettier is probably running without following the repo's config?

It clearly does respect the repo’s config, it corrects double quotes into single quotes for example. I’m running the same command as that test, yet it finds way more issues for some reason.

  • Are there screenshots or some deployed preview? Would be interesting to see how this works. (I've tried to enable automatic PR/branch deployments from Cloudflare Pages on my side, but doesn't seem to work on PRs opened by others)

These are screenshots made on Linux (KDE):

Emoji selection pop-up with the title “Custom emojis · infosec.exchange” and the groups Blobcat and Smileys & Emotion
Emoji selection pop-up with the title “Custom emojis · infosec.exchange” slightly scrolled down. The groups Blobcat, Smileys & Emotion and People & Body are visible. The text “68 more…” follows the Smiley & Emotion group.
Emoji selection pop-up with the search text “person” entered. A number of emojis with their respective shortcut are listed.
A post composition field with the text “:person” entered. A pop-up lists five emojis with their respective shortcuts as autocomplete options.

Note that I explicitly put the categories of the server’s custom emojis on top (it’s also what Mastodon does). This doesn’t quite work for the “other” category which is always placed at the bottom, but this could be addressed.

  • I prefer this to be separate from Custom Emojis. It's confusing enough to combine both of them, and their short codes could clash.

Quite frankly, I’d consider it more confusing if they were separate. As far as users are concerned, there is no difference between custom emojis and Unicode ones.

Short code clashes don’t really seem to be an issue (other than for React which can be addressed). It’s a bit confusing if multiple icons show up with the same shortcut in the search, but the users can just select the icon they like. And Unicode emojis aren’t inserted as their shortcuts but rather as the actual Unicode symbol.

  • Recent emojis only work for clicked/selected ones.

I am aware of that. Still, they don’t show up for me despite clicking. But that’s a side issue, I can look into it if the general direction is approved.

To be honest, I'm not a fan of adding an emoji picker — I personally use the OS's native emoji picker (and 3rd-party native emoji picker apps) that works across all web sites and apps.

Well, I didn’t add one – there is one already, and it doesn’t do too badly. It could do with a category selector at the top, otherwise I don’t see big issues however.

Linux doesn’t have a native emoji picker, and native emoji pickers aren’t always too convenient. Besides, they won’t let you pick the server’s custom emojis, and providing a unified experience for all emojis no matter the source would be beneficial.

Also, if I wanted to add one, probably could use this https://github.com/nolanlawson/emoji-picker-element but the concerns are the huge file sizes (images, shortcode mappings, locale files) and maintenance work required to update it every year as the folks from Unicode Consortium keep adding more.

I did look into this one first. I then decided that adding the necessary functionality to the existing emoji picker would be simpler. It works with the same data, but its preprocessing won’t reduce file size quite as much. The only benefit I see would be detection of broken emojis – something that would be fairly easy to address.

An emoji font would be quite heavy. The decision whether to use one is independent of an emoji picker however. It would be a way to provide a unified experience across all platforms, including the ones with less than optimal emoji support. There is already #476 on that at the very least.

If you try npm run build, could check out the bundle sizes via Sonda's report (generated sonda-report.html).

The bundle sizes aren’t affected, these JSON files aren’t compiled into the bundles. I considered compiling them into the bundles and using dynamic loading to make sure only the required locales are being loaded, but this turned out unnecessarily complex. So it’s static files and fetch now which is much simpler.

The file sizes vary between 260 KiB (Chinese) and 375 KiB (Ukrainian). Altogether it’s 6.5 MiB which could be reduced to 4.6 MiB by removing data duplicating the en locale.

@palant
Copy link
Author

palant commented Dec 3, 2024

I’m running the same command as that test, yet it finds way more issues for some reason.

Ok, it’s definitely a versions mismatch. I’ve been running prettier 3.4.1, the test is running some version that is considerably older. It isn’t version 2.8.0 which is listed in package-lock.json but that version is closer. I guess the test somehow cached a version from when it was run originally?

If I may suggest, it would be a good idea to add prettier as an explicit devDependency, probably with a script to run it. This will make certain that everyone is on the same version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants