translations are out of sync (and always fallbacks to English) #177

Closed
verdy-p opened this issue Mar 1, 2022 · 7 comments

@verdy-p

verdy-p commented Mar 1, 2022

Many translations that have long been available on translatewiki.net (TWN) are NOT loaded into the repository, so the interface still shows English almost everywhere.
It seems that you do not import these translations, for one of these reasons:

  • you don't extract them all (only a few selected languages)
  • or you selectively apply some strings but not others (or you forgot them)
  • or you've made local changes to translations that were not reflected by reimporting them into translatewiki.net (with a fuzzy state, awaiting revalidation); for example, you've manually removed some trailing full stops in various strings, even though they were present in the English source (and the English source in TWN still contains those full stops, so all validated commits in TWN continue to apply them).
  • or... the language selector does not work and only displays English (meaning that translations are actually not used at all: if you disabled them, then this project should be deprecated on translatewiki.net, because it is actually not supported)
  • or... the application has a bug in its resource loader (e.g. some file fails to load and raises an exception that you do not track properly; looking at your existing JSON files, I do not see anything that should trip up your resource loader, such as mismatched quotation marks or braces, or broken embedded markup), and then it always falls back to English.

Apparently you have problems managing the synchronization (a project repository management problem, affecting all languages) as well as the language fallbacks and message loaders (an undetected programming bug, affecting some languages for an unknown reason).

If you don't want to pollute your local curated version, you should IMHO at least create a code branch in GitHub for incoming translations (and make sure it remains constantly in sync with TWN), so that you can still use the branch comparing tools of GitHub to see the differences with your main branch and have a way to trace the work you need.

As a consequence, the online version of the tool displays English in static places.

Example here in French: no French is displayed anywhere, even though the translation is complete in TWN (it is partial/out of sync in your GitHub repository and on the deployed web site of your tool).
image

@verdy-p verdy-p changed the title from "translations are out of sync" to "translations are out of sync (and always fallbacks to English)" Mar 1, 2022
@eggpi
Owner

eggpi commented Mar 1, 2022

This is odd, I definitely see the French strings here:

image

If you can reproduce this consistently in the browser, can you please send me the details of the request you're making?

If you're familiar with the browser developer tools, you can right-click the request and select "Copy > Copy as cURL" to get something like:

curl 'https://citationhunt.toolforge.org/fr' ... -H 'Accept-Language: en-US,en;q=0.5' ...

That would make it easier for me to see the request and hopefully reproduce the bug. I'm especially interested in the Accept-Language header.

@verdy-p
Author

verdy-p commented Mar 7, 2022

You are using a Mac (as seen on your screenshot).

I am using Google Chrome (latest version) on Windows 11 Pro.
There's nothing specific in my configuration.


General

Request URL: https://citationhunt.toolforge.org/fr?id=462d9837
Request Method: GET
Status Code: 200
Remote Address: 185.15.56.11:443
Referrer Policy: strict-origin-when-cross-origin

Response Headers

cache-control: public, max-age=30
content-encoding: gzip
content-length: 4751
content-security-policy-report-only: default-src 'self' 'unsafe-eval' 'unsafe-inline' blob: data: filesystem: mediastream: *.toolforge.org wikibooks.org *.wikibooks.org wikidata.org *.wikidata.org wikimedia.org *.wikimedia.org wikinews.org *.wikinews.org wikipedia.org *.wikipedia.org wikiquote.org *.wikiquote.org wikisource.org *.wikisource.org wikiversity.org *.wikiversity.org wikivoyage.org *.wikivoyage.org wiktionary.org *.wiktionary.org *.wmcloud.org *.wmflabs.org wikimediafoundation.org mediawiki.org *.mediawiki.org wss://citationhunt.toolforge.org; report-uri https://csp-report.toolforge.org/collect;
content-type: text/html; charset=utf-8
date: Mon, 07 Mar 2022 02:33:14 GMT
permissions-policy: interest-cohort=()
server: nginx/1.14.2
strict-transport-security: max-age=31622400
vary: Accept-Encoding
x-clacks-overhead: GNU Terry Pratchett

Request headers

:authority: citationhunt.toolforge.org
:method: GET
:path: /fr?id=462d9837
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7
cache-control: no-cache
dnt: 1
pragma: no-cache
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="99", "Google Chrome";v="99"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: none
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36

@verdy-p
Author

verdy-p commented Mar 7, 2022

As you see there's "accept-language: fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7"

In summary, the browser's language list is: "fr-FR, fr, en-US, en"
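
To make that ordering explicit, here is a minimal sketch of how such a header can be turned into an ordered list of tags. The function name `parse_accept_language` is hypothetical and is not taken from the Citation Hunt code base; it only illustrates the q-value ordering described above.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first.

    Hypothetical helper for illustration; not the project's own parser.
    """
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0  # q defaults to 1 when it is omitted
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as lowest priority
        entries.append((-quality, position, tag))
    # Sort by descending quality; header order breaks ties.
    return [tag for _, _, tag in sorted(entries)]


print(parse_accept_language("fr-FR,fr;q=0.9,en-US;q=0.8,en;q=0.7"))
# prints ['fr-FR', 'fr', 'en-US', 'en']
```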

It seems that you do not honor "fr-FR" and stop there (without trying "fr", which comes right after) and fall back instantly to English, EVEN though the URL requested the French page (https://citationhunt.toolforge.org/fr); the only effect of that URL is to select the language of the citation sets taken from Wikimedia pages (it has NO effect on the UI of your webapp).

I can use the language selector at the top to select German, and I see the German UI (with the same "Accept-Language:" header); when I use that selector again to select French, I only see English.

You seem to confuse the UI language with the language used for selecting sets of citations from Wikimedia pages, assuming that the Wikimedia language matches the first language in Accept-Language. Here you detect "fr-FR", which partially matches the "fr" of the selected Wikimedia citation sets. But then the code does not load the "fr" resources: it attempts to locate "fr-FR" resources (taken from the first partial match in Accept-Language), does not find them, and instantly falls back to the resources for the English locale.

And I can easily reproduce it: if I now set up my browser for Swiss German with "Accept-Language: de-CH, de, en-US, en", I cannot see the German UI, only the English UI (but with Wikimedia citations in German).

Your language selector therefore does not work for any selected language that partially matches an Accept-Language entry with a country extension (and this case is VERY frequent in ALL modern browsers that have user settings for Accept-Language).

I have the same issue not just in Google Chrome, but also in MS Edge and Firefox; and not just on Windows, but also on Linux and Android (I can't test on macOS or iOS).

This is clearly an issue in your webapp code, which is unable to select the correct locale for its UI: if you have a partial match on "fr-FR" for a known UI language like "fr" that you support, you should use "fr" (not "fr-FR") to load the UI locale data; and you should never fall back to English, but to the language of the Wikimedia citation set (as given by the "/fr" page URL set by the language selector).
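
A rough sketch of the selection I have in mind, under stated assumptions: `pick_ui_locale` and the `available` set are hypothetical names, not the project's actual API. Accept-Language entries whose primary subtag matches the URL language are tried first (so a real regional translation can still win), and the URL language itself is the last candidate.

```python
def pick_ui_locale(url_lang, accepted_tags, available):
    """Pick the UI locale for a page served under /<url_lang>.

    Hypothetical sketch: `available` is the set of locale codes that
    actually have translation resources. Returns None when nothing fits,
    so the caller decides on a last-resort default.
    """
    url_lang = url_lang.strip().lower()
    by_lower = {code.lower(): code for code in available}

    # Regional variants of the URL language, in the browser's order.
    candidates = [
        tag.strip().lower()
        for tag in accepted_tags
        if tag.strip().lower().split("-")[0] == url_lang
    ]
    # Always end with the bare URL language, e.g. "fr" for "fr-FR".
    candidates.append(url_lang)

    for candidate in candidates:
        if candidate in by_lower:
            return by_lower[candidate]
    return None


supported = {"en", "fr", "de"}  # assumed set of UI translations
print(pick_ui_locale("fr", ["fr-FR", "fr", "en-US", "en"], supported))  # prints fr
print(pick_ui_locale("de", ["de-CH", "de", "en-US", "en"], supported))  # prints de
```

The point of the sketch is only that "fr-FR" is reduced to "fr" when no "fr-FR" resources exist, instead of being passed to the loader unchanged.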

@verdy-p
Author

verdy-p commented Mar 7, 2022

Apparently the bug is in

https://github.com/eggpi/citationhunt/blob/master/handlers/common.py

in the function starting on line 41: it finds a partial match for "fr" when parsing Accept-Language, but then returns the header value exactly as is (including extensions like "-FR"), i.e. it returns "fr-FR" instead of "fr" (the code that you support and that was partially matched).

Later, during initialisation, the code can't locate resources for "fr-FR" and falls back to loading English only.

@eggpi
Owner

eggpi commented Mar 7, 2022

Thank you! I can reproduce it with your Accept-Language and this looks like a regression caused by 6bc5d6b.

I think this is the relevant chunk of the code. Prior to 6bc5d6b, chstrings.get_localized_strings would return {} if it couldn't find a translation file, but now it silently returns the English strings.

So, before, we would:

  1. Try to load 'fr-FR.json' and fail.
  2. Try to load 'fr.json' and succeed.

But now, we try to load 'fr-FR.json' and get back the fallback English strings as if nothing happened.
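
A simplified model of that difference, as a sketch only: the two loaders below are stand-ins written for this comment, not the real `chstrings.get_localized_strings`, and the `TRANSLATIONS` dict replaces the JSON files on disk.

```python
# Stand-in for the translation files; contents are made up for illustration.
TRANSLATIONS = {
    "en": {"greeting": "Hello"},
    "fr": {"greeting": "Bonjour"},
}


def load_strict(code):
    """Old behaviour (simplified): empty dict when there is no file."""
    return dict(TRANSLATIONS.get(code, {}))


def load_with_silent_fallback(code):
    """New behaviour (simplified): English strings when there is no file."""
    return dict(TRANSLATIONS.get(code, TRANSLATIONS["en"]))


def first_available(loader, candidates):
    """Return the first non-empty result, trying the candidates in order."""
    for code in candidates:
        strings = loader(code)
        if strings:
            return strings
    return dict(TRANSLATIONS["en"])  # explicit last-resort fallback


print(first_available(load_strict, ["fr-FR", "fr"]))
# prints {'greeting': 'Bonjour'}
print(first_available(load_with_silent_fallback, ["fr-FR", "fr"]))
# prints {'greeting': 'Hello'}
```

With the silent fallback, the first candidate already looks like a success, so 'fr' is never tried. A regression test could assert the first result for an Accept-Language such as the one reported above.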

I'll take another look at this in the next couple of days, but it should be an easy fix. We should make sure to add a test for this.

@verdy-p
Author

verdy-p commented Mar 7, 2022

Note: you don't display any logo for your app.
I've uploaded a basic proposal for it, see:

https://translatewiki.net/wiki/File:CitationHunt.png
image

https://translatewiki.net/wiki/Translating:CitationHunt

This is because, until now, all we had was a screenshot, which did not allow clear identification in "Babel" templates (which are limited to 45x45px), such as this:
https://translatewiki.net/wiki/Template:User_CitationHunt
(this is what appears on user pages)

Maybe you have a better concept or design. I just made it as a very simplified derivation of your UI. It can probably be enhanced, or I could change it if you prefer other colors, or if you intend to change/improve the UI.

I could also reduce the screenshot size displayed in the portal (eliminating large white margins, so that the essential parts are visible, even on a narrow screen):
https://translatewiki.net/wiki/File:Citation_Hunt_English_screenshot.png (2880×1586px!)
Any thumbnail would be unreadable: the screenshot should be made smaller by first reducing the browser window size to at most 800x600px and properly setting the zoom level for font sizes (if you take it on a large or HiDPI display) before taking it. It should also be precisely cropped and centered, keeping small but sufficient and balanced margins. A little work in a painting application like Paint++ or GIMP can then finalize it so that everyone can see the screenshot correctly. The icon version needs extra work to simplify it further.

@eggpi eggpi closed this as completed in 79c2bea Mar 7, 2022
@eggpi eggpi mentioned this issue Mar 7, 2022
@eggpi
Owner

eggpi commented Mar 7, 2022

79c2bea should have fixed this, can you please give it another try?

And thanks for the logo, I like it! I've also filed #178 to incorporate it as a favicon :)
