Praise in other languages #33

Open
gaborcsardi opened this issue Apr 15, 2016 · 37 comments

@gaborcsardi
Collaborator

Specifically Chinese first, via @Avatoo. \o/

We need to work out some simple architecture first.

@maelle

maelle commented Apr 18, 2016

I can help for praise in French 😄

@gaborcsardi
Collaborator Author

@masalmon Cool, I'll soon update the code to handle multiple languages.

@maelle

maelle commented Apr 18, 2016

Merveilleux ! Fantastique ! Superbe !

@gaborcsardi
Collaborator Author

I am thinking about a good way to do this. The goal would be to be able to write

praise(gettext("Your tests are ${adjective}!"))

or something like this, and then get praise in multiple languages. Two things are required for this:

  1. We need to add some translations to testthat or whatever package we want to add international praise to. This would use the usual NLS system.
  2. We need to add the parts of speech in other languages. E.g. adjectif for French, etc.

E.g. gettext translates the string above to "Vos tests sont ${adjectif}", and then we use this template just as we do now.

Does this make sense? Or do we want to try automatic translation via the google translate API? I guess that could be error prone, so maybe the NLS way is better?
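A minimal sketch of the translate-then-template idea in base R. The lookup table only stands in for a real NLS catalog, and `translations`, `words`, `fill_template()` and `praise_sketch()` are hypothetical names for illustration, not part of praise.

```r
# Hypothetical translation table standing in for an NLS message catalog.
translations <- list(
  fr = c("Your tests are ${adjective}!" = "Vos tests sont ${adjectif} !")
)

# Hypothetical per-language word lists, keyed by part-of-speech name.
words <- list(
  en = list(adjective = c("great", "superb")),
  fr = list(adjectif = c("superbes", "fantastiques"))
)

# Replace each ${name} placeholder with a random word from the matching list.
fill_template <- function(template, parts) {
  for (name in names(parts)) {
    placeholder <- paste0("${", name, "}")
    template <- sub(placeholder, sample(parts[[name]], 1), template, fixed = TRUE)
  }
  template
}

# Translate the template first, then fill it in with words of that language.
praise_sketch <- function(template, lang = "en") {
  translated <- translations[[lang]][template]
  if (is.null(translated) || is.na(translated)) {
    # No translation available: fall back to the English template and words.
    lang <- "en"
    translated <- template
  }
  fill_template(unname(translated), words[[lang]])
}

praise_sketch("Your tests are ${adjective}!", lang = "fr")
```

The fallback keeps untranslated templates working in English, which matches the current behaviour for packages that never add translations.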

@maelle

maelle commented Apr 21, 2016

Would it be a lot of work to test the google translate API on a few examples to see how bad the results are?

@gaborcsardi
Collaborator Author

The thing is, even if we can use google translate, we also want a way that lets people have more control. So why not start with that?

@maelle

maelle commented Dec 9, 2016

Hi @gaborcsardi following up on this -- a bit late sorry. What exactly could I do to help make praise work for French too (apart from contributing words)?

@chucheria would like to contribute for Spanish.

@gaborcsardi
Collaborator Author

@masalmon Thanks!

I guess we would need to decide what "work" means. I.e. consider testthat praise. It is implemented like this:

praise::praise("Your tests are ${adjective}!")
praise::praise("${EXCLAMATION} - ${adjective} code.")

So how would (hypothetical) user Hadley add support for other languages? Or would all this be automatic? The two obvious solutions are:

  1. We translate all non-template words via Google translate or something similar in praise, if we detect a French locale. And then just substitute in the templated words, i.e. the nice adjectives to get a sentence in French.
  2. We require user Hadley to supply templates in various languages. We might help user Hadley with translation tips via an automatic translation service.

The first solution is nice if it works well, and maybe it works well for simple templates. Maybe we can implement both solutions.

What do you think?

@maelle

maelle commented Dec 9, 2016

I guess the first solution is easier? Or in the case of a package like testthat, I could translate all the templates, because there are not many anyway?

Also, the idea would be to have people contribute the nice adjectives (because then you only need to know your language and a few git commands), but I guess that part is easy.

@gaborcsardi
Collaborator Author

> I guess the first solution is easier? Or in the case of a package like testthat, I could translate all the templates, because there are not many anyway?

Anyway, maybe we can implement both? Let's implement the automatic way, and see how it works. Btw. Google translate is not free any more, but maybe this works: http://www.r-pkg.org/pkg/RYandexTranslate

> Also, the idea would be to have people contribute the nice adjectives (because then you only need to know your language and a few git commands), but I guess that part is easy.

Agreed.

gaborcsardi added a commit that referenced this issue Dec 13, 2016
@maelle

maelle commented Dec 17, 2016

I've just installed RYandexTranslate & registered for the free service (at last!). They seem to use two-letter language codes.

I've also looked at your commit regarding language detection, is there a particular reason you use Sys.getlocale() instead of Sys.getlocale(category = "LC_COLLATE")?

@maelle

maelle commented Dec 17, 2016

One last small thing for today: I looked at the praise code in testthat, and the praising and encouraging sentences are hard-coded. Should praise have categories for this (an "english_congratulation.R" and "english_encouragement.R"), and can we hope to have them replaced in testthat?

The Yandex API works well for the unique sentence to be translated in testthat:

> translate(api_key, text = "Your tests are", lang = "en-fr")
$lang
[1] "en-fr"

$text
[1] "Vos tests sont"

@maelle

maelle commented Dec 17, 2016

I've just realized that in languages like French ${adjective} will need to be ${singular-adjective} and ${plural-adjective}.
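To illustrate the agreement problem, here is a small base R sketch where each French adjective is stored with both forms and the caller picks the one that matches the noun. The table layout and the `pick_adjective()` helper are assumptions made for this example, not how praise stores its words.

```r
# Hypothetical word table: one row per adjective, one column per grammatical number.
french_adjectives <- data.frame(
  singular = c("superbe", "fantastique", "merveilleux"),
  plural   = c("superbes", "fantastiques", "merveilleux"),
  stringsAsFactors = FALSE
)

# Pick a random adjective in the requested form; defaults to "singular".
pick_adjective <- function(number = c("singular", "plural")) {
  number <- match.arg(number)
  sample(french_adjectives[[number]], 1)
}

# "tests" is plural, "code" is singular, so each sentence needs a different form.
paste("Vos tests sont", pick_adjective("plural"), "!")
paste("Votre code est", pick_adjective("singular"), "!")
```

Languages with more agreement categories would need more columns, which is one reason a single `${adjective}` placeholder does not carry over directly.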

@gaborcsardi
Collaborator Author

> I've also looked at your commit regarding language detection, is there a particular reason you use Sys.getlocale() instead of Sys.getlocale(category = "LC_COLLATE")?

Don't remember. Looks like this is what I am doing: 0ca9979#diff-951791f1fb37d9e5b0f0cf852ce38d83R30

I suppose we can add LC_COLLATE here as well, I don't really see why you would have that set up and the others not, but I don't know much about locales.

> Should praise have categories for this (an "english_congratulation.R" and "english_encouragement.R"), and can we hope to have them replaced in testthat?

Maybe, but in general I would leave writing sentences up to package authors depending on praise.

> I've just realized that in languages like French ${adjective} will need to be ${singular-adjective} and ${plural-adjective}.

Hmmm, yeah, that's a problem, and more "complicated" languages will be even worse.

So I would keep it simple and use the auto-translation for suggestions only. Maybe the manual praise translation is even better, then people speaking various languages can just contribute translations to testthat and other praising packages. How about this?

@maelle

maelle commented Dec 19, 2016

  • On my PC (Windows)
> Sys.getlocale()
[1] "LC_COLLATE=Spanish_Spain.1252;LC_CTYPE=Spanish_Spain.1252;LC_MONETARY=Spanish_Spain.1252;LC_NUMERIC=C;LC_TIME=Spanish_Spain.1252"
> Sys.getlocale("LC_COLLATE")
[1] "Spanish_Spain.1252"

so the substr wouldn't work with Sys.getlocale()?

  • I don't understand what you mean by manual praise translation? How would this work for the praising packages?

@gaborcsardi
Collaborator Author

gaborcsardi commented Dec 19, 2016

> so the substr wouldn't work with Sys.getlocale()?

OK, we'll need to read more about locales I suppose. Or find good code that gives a two or three letter code from the locales.
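One possible shape for such code, as a hedged base R sketch: it takes a single-category locale string (as returned by `Sys.getlocale("LC_COLLATE")`) and handles both POSIX-style values such as `fr_FR.UTF-8` and Windows-style values such as `Spanish_Spain.1252`. The `locale_to_lang()` function and the `windows_names` table are hypothetical, and the table lists only a handful of languages.

```r
# Hypothetical lookup from Windows-style language names to two-letter codes.
# Only a few entries are listed; a real table would need many more.
windows_names <- c(English = "en", French = "fr", German = "de",
                   Hungarian = "hu", Spanish = "es")

locale_to_lang <- function(locale = Sys.getlocale("LC_COLLATE"), default = "en") {
  # Keep only the part before the territory, encoding or modifier,
  # e.g. "fr" from "fr_FR.UTF-8" or "Spanish" from "Spanish_Spain.1252".
  language <- sub("[_.@].*$", "", locale)
  if (grepl("^[a-z]{2,3}$", language)) {
    return(language)                          # POSIX style
  }
  if (language %in% names(windows_names)) {
    return(unname(windows_names[language]))   # Windows style
  }
  default                                     # "C", "POSIX", or anything unrecognised
}

locale_to_lang("Spanish_Spain.1252")
locale_to_lang("fr_FR.UTF-8")
```

Note that the composite string returned by a bare `Sys.getlocale()` starts with `LC_COLLATE=...`, so this sketch falls through to the default for it; querying one category avoids that.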

> I don't understand what you mean by manual praise translation? How would this work for the praising packages?

Package author writes the sentences in all languages she knows. (She can get help from auto-translation, but I would implement auto-translation later.) Then people that know other languages can submit pull requests that add support for other languages that praise supports. I think this is good, because it encourages collaboration.

@maelle

maelle commented Dec 19, 2016

  • Ok, I'll try to make myself wiser about locales in the next weeks.

  • And could some examples be kept in the praise package itself if they are general sentences?

@gaborcsardi
Collaborator Author

> And could some examples be kept in the praise package itself if they are general sentences?

Sure, that makes a lot of sense. We can have a praise_code() function or praise_package() or some generic function, e.g. praise_this("package").

@maelle

maelle commented Dec 19, 2016

What would the praise_this("package") function do? Create the infrastructure for recognizing language?

@gaborcsardi
Collaborator Author

gaborcsardi commented Dec 19, 2016

Oh, no, sorry, these would be just praising sentences that are kept within praise, and they could be translated to all languages we support.

@maelle

maelle commented Dec 21, 2016

There's an R package for plurals but only in English, what a pity: https://github.com/hrbrmstr/pluralize

@gaborcsardi
Collaborator Author

@masalmon No prob, if we go the "manual" way, we don't really need that.

Btw. I think hunspell can do this for all languages that it supports, but we don't need to worry about it now.

@maelle

maelle commented Jan 15, 2017

Just a summary of the discussion (cc @chucheria ) @gaborcsardi please correct me if I'm wrong which I quite likely am :-)

  • The international branch of this package has e.g. english-adverbs.R; for each new language we have to add all the corresponding .R files. @chucheria & I could create these files and they'd be filled in during a git workshop, even if the rest of the international structure of the package isn't ready, because these collections of words will still be useful at some point.

  • The code for recognizing the locale needs to be improved a bit. Note, we'll have to write the correspondence between a 2/3-letter language code and the full name of the language.

  • The code for recognizing the locale will be used in generic functions inside praise.

  • However, the international possibilities will be useful only if

  1. maintainers of packages using praise, e.g. like testthat does, agree to have their R code modified so that it includes recognition of the locale,
  2. volunteers submit translations of sentences of the package to the package maintainer,
  3. so that if the locale is a language other than English that is offered by praise + the package itself (you need the adverbs in Spanish in praise and the sentences in Spanish in testthat for instance), the package can output messages in this language.

@gaborcsardi
Collaborator Author

I would not put locale stuff in testthat & co, I would just do something like

praise_lang("You are ${adjective}!", lang = "en")
praise_lang("Du bist ${adjective}!", lang = "de")

or something like this.

@gaborcsardi
Collaborator Author

Or even just

praise("You are ${adjective}!", lang = "en")
praise("Du bist ${adjective}!", lang = "de")

or

praise(
  en = "You are ${adjective}!",
  de = "Du bist ${adjective}!"
)
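The named-argument form could be backed by something like the following base R sketch, which only selects the template and leaves the templating itself to the existing code. `pick_template()` and its `fallback` argument are hypothetical names for illustration.

```r
# Hypothetical helper: choose the template matching `lang`, else fall back.
pick_template <- function(..., lang = "en", fallback = "en") {
  templates <- c(...)   # named character vector, e.g. c(en = "...", de = "...")
  if (lang %in% names(templates)) {
    templates[[lang]]
  } else {
    # Errors if the fallback language was not supplied either.
    templates[[fallback]]
  }
}

pick_template(
  en = "You are ${adjective}!",
  de = "Du bist ${adjective}!",
  lang = "de"
)
```

Here `lang` would come from locale detection rather than being passed by hand; it is explicit in the sketch only to keep the example self-contained.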

@gaborcsardi
Collaborator Author

Another way would be to use gettext...

@maelle

maelle commented Feb 8, 2017

What is gettext?

@gaborcsardi
Collaborator Author

The standard way to translate text messages. See ?gettext.

@gaborcsardi
Collaborator Author

So with gettext, people could just write

praise("You are ${adjective}!")

as before, but then praise() would check if the "You are ${adjective}!" string has a translation in the current locale, either

  1. in the calling package, or
  2. in praise itself.

After the translation, we would do the templating, as before, using the detected language.

Then the messages would need to be translated using e.g. msgtools. But the words lists would be the same as before.
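The lookup order described above might be sketched like this in base R. It relies on `gettext()` returning its input unchanged when no translation is found; the `translate_template()` helper is hypothetical, and the domain names assume R's usual `R-<package>` convention for R-level message catalogs.

```r
# Hypothetical helper: try each translation domain in order and return the
# first translation found, or the original template if there is none.
translate_template <- function(template,
                               domains = c("R-testthat", "R-praise")) {
  for (domain in domains) {
    translated <- gettext(template, domain = domain)
    # gettext() gives back the input when the domain has no translation.
    if (!identical(translated, template)) {
      return(translated)
    }
  }
  template
}

# Without any installed catalog for this string, the template comes back as-is.
translate_template("You are ${adjective}!")
```

In a real implementation the first domain would be derived from the calling package rather than hard-coded; that part is left out of the sketch.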

@maelle

maelle commented Feb 8, 2017

This sounds like the easiest solution?

@gaborcsardi
Collaborator Author

For the users, yes. Even for people adding new words.

For people dealing with the translation system (=us), not really. :)

@maelle

maelle commented Feb 8, 2017

But then we can praise ourselves ;-)

@gaborcsardi
Collaborator Author

OK, I implemented a framework: https://github.com/rladies/praise/tree/international

I'll write a short guide on how to add translations, and then we can test it on you if you don't mind. :)

Btw. we'll need to re-organize the package a bit, because non-ASCII characters are not allowed in code. So I'll move the words to data/ or inst/.
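As a rough illustration of keeping non-ASCII words out of the R code, the sketch below reads a word list from a UTF-8 text file, one word per line. The file layout and the `read_words()` helper are assumptions; in a package the path would presumably come from `system.file()` pointing somewhere under `inst/`, but a temporary file is used here so the example runs on its own.

```r
# Hypothetical reader: one word per line, UTF-8 encoded, blank lines ignored.
read_words <- function(path) {
  words <- readLines(path, encoding = "UTF-8", warn = FALSE)
  words[nzchar(words)]
}

# Demo with a temporary file. The \u escape keeps this source file ASCII-only,
# and useBytes = TRUE writes the UTF-8 bytes without re-encoding them.
path <- tempfile(fileext = ".txt")
writeLines(enc2utf8(c("g\u00e9nial", "formidable", "")), path, useBytes = TRUE)

read_words(path)
```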

@maelle

maelle commented Feb 8, 2017

Awesome! Looking forward to testing it.

Génial ! J'ai hâte de le tester !

@gaborcsardi
Collaborator Author

Here is a short how-to: https://github.com/rladies/praise/blob/international/inst/international.md

I have added Hungarian, not too many words, just a PoC.

FYI.

@maelle

maelle commented Feb 23, 2017

I'll have a better look next week but this looks AWESOME! 👏👏👏

@acangros

For other languages it's important to account for gender: the expressions are not the same for men as for women, and this can change the spelling and even the meaning from very good to very bad ^_^
