"When" has a wrong IPA #1

Open
zohnannor opened this issue Sep 30, 2024 · 4 comments

Comments

@zohnannor

zohnannor commented Sep 30, 2024

It translates to hwen instead of when.

image

I somehow didn't notice it when making this meme post and now it looks even more dumb lol. (I only noticed this because of the comment someone left)

@zohnannor
Author

With this issue opened, I wonder if there are other incorrect IPA entries in the dictionary.

@aryanpingle
Owner

I was aware of this quirk and I've been thinking of a possible solution. Let me break it down:

I'm using the IPA dictionary to convert a word from English to its corresponding phonemes. Here's the entry for "when":

"when": ["h w ɛ n", "h w ɪ n", "w ɛ n", "w ɪ n"]

There are approximately 120,000 words in the dictionary, so it's futile to try to choose the "correct" pronunciation for each of them by hand. Moreover, there is no single correct pronunciation, because dialects and accents differ. A simple example is the word "metabolism": some pronounce the a as æ (like in "apple"), while others use ɑɹ (like in "far"). There's no way to pick the most appropriate variant for every word, so I simply took the first pronunciation in the array:

"when": "hwɛn"
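To make the "first pronunciation" rule concrete, here is a minimal sketch. The names `IpaDictionary`, `ipaDictionary`, and `firstPronunciation` are made up for illustration and are not taken from the project; the only real data is the "when" entry quoted above.

```typescript
// Assumed shape: each word maps to a list of space-separated phoneme strings,
// as in the "when" entry quoted in this thread.
type IpaDictionary = Record<string, string[]>;

const ipaDictionary: IpaDictionary = {
  when: ["h w ɛ n", "h w ɪ n", "w ɛ n", "w ɪ n"],
};

// Take the first listed variant and drop the spaces between phonemes.
// Returns undefined when the word has no entry.
function firstPronunciation(dict: IpaDictionary, word: string): string | undefined {
  const first = dict[word.toLowerCase()]?.[0];
  return first === undefined ? undefined : first.replace(/\s+/g, "");
}

console.log(firstPronunciation(ipaDictionary, "when")); // hwɛn
```

Because the rule only ever looks at index 0, the result depends entirely on the order in which the dictionary happens to list its variants.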

A possible solution

That said, I agree that some words like "when" are better off with a community-agreed pronunciation. How about we update the IPA dictionary on the client side with a custom dictionary that maps a word to its desired phonetic translation? Or perhaps a build step that inserts the desired phonetic translation directly into the IPA dictionary.
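A rough sketch of how such an override could work, assuming the dictionary keeps the word-to-variants shape shown earlier. Everything here (`PronunciationMap`, `Overrides`, `applyOverrides`, and the contents of `communityOverrides`) is hypothetical; the same function could run on the client or inside a build step.

```typescript
type PronunciationMap = Record<string, string[]>;
type Overrides = Record<string, string>;

// Entry as quoted in this thread.
const baseDictionary: PronunciationMap = {
  when: ["h w ɛ n", "h w ɪ n", "w ɛ n", "w ɪ n"],
};

// Hypothetical community-maintained list: word -> preferred variant.
const communityOverrides: Overrides = {
  when: "w ɛ n",
};

// Return a copy of the dictionary in which each overridden word lists its
// preferred variant first. The other variants are kept, without duplicates.
function applyOverrides(dict: PronunciationMap, fixes: Overrides): PronunciationMap {
  const patched: PronunciationMap = { ...dict };
  for (const [word, preferred] of Object.entries(fixes)) {
    const existing = dict[word] ?? [];
    patched[word] = [preferred, ...existing.filter((v) => v !== preferred)];
  }
  return patched;
}

const patchedDictionary = applyOverrides(baseDictionary, communityOverrides);
console.log(patchedDictionary["when"]?.[0]); // w ɛ n
```

Moving the preferred variant to the front, instead of replacing the whole entry, means any code that takes the first pronunciation keeps working unchanged and no alternative is lost.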

@zohnannor
Author

It's unfortunate that the dictionary contains such mistakes (or are they? I'm not a native English speaker, and I honestly have no idea where the "hwɛn" for "when" is even coming from). I agree that there should be a correction step before image generation, but with 120k words, it doesn't seem feasible to cover all of them. We would have to add an entry to the correction list for each fix, as we discover it. We don't know all the "obvious errors" this dictionary might have beforehand, so each one would require a manual fix.

That said, I can't suggest anything specific. I can't think of a proper way to handle this while still maintaining the IPA dictionary. In an ideal world, the dictionary wouldn't have errors in the first place, or there would be a clear pattern for filtering them out.
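One pattern that could at least surface candidates for review is the one "when" itself shows: the first variant starts with "h w" and the same pronunciation without the leading "h" is also listed. The sketch below is only an assumption about what a filter might look like; `IpaEntries`, `sampleEntries`, and `findHwCandidates` are made-up names, and the "cat" entry is illustrative rather than copied from the real dictionary.

```typescript
type IpaEntries = Record<string, string[]>;

const sampleEntries: IpaEntries = {
  when: ["h w ɛ n", "h w ɪ n", "w ɛ n", "w ɪ n"], // entry quoted in this thread
  cat: ["k æ t"], // illustrative entry, not taken from the real dictionary
};

// List words whose first variant starts with "h w" while the same variant
// without the leading "h" also appears in the entry.
function findHwCandidates(entries: IpaEntries): string[] {
  const candidates: string[] = [];
  for (const [word, variants] of Object.entries(entries)) {
    const first = variants[0];
    if (first === undefined || !first.startsWith("h w ")) continue;
    const withoutH = first.slice(2); // drop the leading "h "
    if (variants.includes(withoutH)) candidates.push(word);
  }
  return candidates;
}

console.log(findHwCandidates(sampleEntries)); // logs an array containing only "when"
```

This only produces a list for a human to look through. It does not decide which variant is right, and it says nothing about other kinds of questionable entries.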

@aryanpingle
Owner

Just to clarify, the IPA dictionary isn't technically wrong. In fact, it's great that they provide a list of possible pronunciations. But yes, they should have at least given us a way of choosing which dialect we prefer (American would've been best).
