- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum mon nog ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn xal
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-06.zip
- test set translations: opus-2020-07-06.test.txt
- test set scores: opus-2020-07-06.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-aze.eng.aze | 25.6 | 0.559 |
Tatoeba-test.eng-bak.eng.bak | 9.7 | 0.327 |
Tatoeba-test.eng-chv.eng.chv | 3.4 | 0.281 |
Tatoeba-test.eng-crh.eng.crh | 13.7 | 0.326 |
Tatoeba-test.eng-kaz.eng.kaz | 10.3 | 0.351 |
Tatoeba-test.eng-kir.eng.kir | 18.0 | 0.464 |
Tatoeba-test.eng-kjh.eng.kjh | 1.7 | 0.030 |
Tatoeba-test.eng-kum.eng.kum | 1.7 | 0.024 |
Tatoeba-test.eng-mon.eng.mon | 9.8 | 0.358 |
Tatoeba-test.eng.multi | 17.5 | 0.431 |
Tatoeba-test.eng-nog.eng.nog | 1.1 | 0.057 |
Tatoeba-test.eng-ota.eng.ota | 0.3 | 0.043 |
Tatoeba-test.eng-sah.eng.sah | 0.5 | 0.040 |
Tatoeba-test.eng-tat.eng.tat | 9.4 | 0.295 |
Tatoeba-test.eng-tuk.eng.tuk | 6.1 | 0.315 |
Tatoeba-test.eng-tur.eng.tur | 31.6 | 0.600 |
Tatoeba-test.eng-tyv.eng.tyv | 6.9 | 0.201 |
Tatoeba-test.eng-uig.eng.uig | 0.1 | 0.148 |
Tatoeba-test.eng-uzb.eng.uzb | 2.8 | 0.261 |
Tatoeba-test.eng-xal.eng.xal | 0.1 | 0.040 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum mon nog ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn xal
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-14.zip
- test set translations: opus-2020-07-14.test.txt
- test set scores: opus-2020-07-14.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-aze.eng.aze | 26.6 | 0.570 |
Tatoeba-test.eng-bak.eng.bak | 6.7 | 0.293 |
Tatoeba-test.eng-chv.eng.chv | 3.3 | 0.288 |
Tatoeba-test.eng-crh.eng.crh | 7.9 | 0.364 |
Tatoeba-test.eng-kaz.eng.kaz | 11.9 | 0.361 |
Tatoeba-test.eng-kir.eng.kir | 22.4 | 0.468 |
Tatoeba-test.eng-kjh.eng.kjh | 1.7 | 0.028 |
Tatoeba-test.eng-kum.eng.kum | 2.0 | 0.076 |
Tatoeba-test.eng-mon.eng.mon | 11.6 | 0.369 |
Tatoeba-test.eng.multi | 18.2 | 0.439 |
Tatoeba-test.eng-nog.eng.nog | 1.2 | 0.066 |
Tatoeba-test.eng-ota.eng.ota | 0.2 | 0.039 |
Tatoeba-test.eng-sah.eng.sah | 0.7 | 0.046 |
Tatoeba-test.eng-tat.eng.tat | 10.2 | 0.302 |
Tatoeba-test.eng-tuk.eng.tuk | 5.3 | 0.313 |
Tatoeba-test.eng-tur.eng.tur | 32.9 | 0.611 |
Tatoeba-test.eng-tyv.eng.tyv | 5.2 | 0.170 |
Tatoeba-test.eng-uig.eng.uig | 0.1 | 0.151 |
Tatoeba-test.eng-uzb.eng.uzb | 3.1 | 0.268 |
Tatoeba-test.eng-xal.eng.xal | 0.1 | 0.049 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum mon nog ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn xal
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-20.zip
- test set translations: opus-2020-07-20.test.txt
- test set scores: opus-2020-07-20.eval.txt

testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-aze.eng.aze | 26.5 | 0.569 |
Tatoeba-test.eng-bak.eng.bak | 5.4 | 0.274 |
Tatoeba-test.eng-chv.eng.chv | 3.3 | 0.280 |
Tatoeba-test.eng-crh.eng.crh | 12.5 | 0.384 |
Tatoeba-test.eng-kaz.eng.kaz | 10.9 | 0.359 |
Tatoeba-test.eng-kir.eng.kir | 25.6 | 0.501 |
Tatoeba-test.eng-kjh.eng.kjh | 2.4 | 0.046 |
Tatoeba-test.eng-kum.eng.kum | 7.0 | 0.143 |
Tatoeba-test.eng-mon.eng.mon | 10.1 | 0.359 |
Tatoeba-test.eng.multi | 18.4 | 0.441 |
Tatoeba-test.eng-nog.eng.nog | 1.3 | 0.066 |
Tatoeba-test.eng-ota.eng.ota | 0.3 | 0.034 |
Tatoeba-test.eng-sah.eng.sah | 0.8 | 0.054 |
Tatoeba-test.eng-tat.eng.tat | 9.7 | 0.303 |
Tatoeba-test.eng-tuk.eng.tuk | 5.8 | 0.313 |
Tatoeba-test.eng-tur.eng.tur | 33.2 | 0.616 |
Tatoeba-test.eng-tyv.eng.tyv | 6.9 | 0.189 |
Tatoeba-test.eng-uig.eng.uig | 0.1 | 0.151 |
Tatoeba-test.eng-uzb.eng.uzb | 3.1 | 0.283 |
Tatoeba-test.eng-xal.eng.xal | 0.1 | 0.058 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum mon nog ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn xal
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-27.zip
- test set translations: opus-2020-07-27.test.txt
- test set scores: opus-2020-07-27.eval.txt

testset | BLEU | chr-F |
---|---|---|
newsdev2016-entr-engtur.eng.tur | 9.6 | 0.427 |
newstest2016-entr-engtur.eng.tur | 8.4 | 0.402 |
newstest2017-entr-engtur.eng.tur | 8.6 | 0.402 |
newstest2018-entr-engtur.eng.tur | 8.6 | 0.404 |
Tatoeba-test.eng-aze.eng.aze | 27.5 | 0.575 |
Tatoeba-test.eng-bak.eng.bak | 5.5 | 0.306 |
Tatoeba-test.eng-chv.eng.chv | 3.3 | 0.284 |
Tatoeba-test.eng-crh.eng.crh | 11.9 | 0.358 |
Tatoeba-test.eng-kaz.eng.kaz | 12.0 | 0.366 |
Tatoeba-test.eng-kir.eng.kir | 24.6 | 0.493 |
Tatoeba-test.eng-kjh.eng.kjh | 2.2 | 0.049 |
Tatoeba-test.eng-kum.eng.kum | 8.4 | 0.176 |
Tatoeba-test.eng-mon.eng.mon | 9.8 | 0.359 |
Tatoeba-test.eng.multi | 18.6 | 0.441 |
Tatoeba-test.eng-nog.eng.nog | 1.6 | 0.079 |
Tatoeba-test.eng-ota.eng.ota | 0.3 | 0.035 |
Tatoeba-test.eng-sah.eng.sah | 0.8 | 0.061 |
Tatoeba-test.eng-tat.eng.tat | 10.1 | 0.308 |
Tatoeba-test.eng-tuk.eng.tuk | 5.7 | 0.310 |
Tatoeba-test.eng-tur.eng.tur | 33.2 | 0.616 |
Tatoeba-test.eng-tyv.eng.tyv | 6.6 | 0.184 |
Tatoeba-test.eng-uig.eng.uig | 0.1 | 0.151 |
Tatoeba-test.eng-uzb.eng.uzb | 3.9 | 0.286 |
Tatoeba-test.eng-xal.eng.xal | 0.1 | 0.057 |
- dataset: opus2m
- model: transformer
- source language(s): eng
- target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum mon nog ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn xal
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus2m-2020-08-02.zip
- test set translations: opus2m-2020-08-02.test.txt
- test set scores: opus2m-2020-08-02.eval.txt

testset | BLEU | chr-F |
---|---|---|
newsdev2016-entr-engtur.eng.tur | 10.4 | 0.438 |
newstest2016-entr-engtur.eng.tur | 9.1 | 0.414 |
newstest2017-entr-engtur.eng.tur | 9.5 | 0.414 |
newstest2018-entr-engtur.eng.tur | 9.5 | 0.415 |
Tatoeba-test.eng-aze.eng.aze | 27.2 | 0.580 |
Tatoeba-test.eng-bak.eng.bak | 5.8 | 0.298 |
Tatoeba-test.eng-chv.eng.chv | 4.6 | 0.301 |
Tatoeba-test.eng-crh.eng.crh | 6.5 | 0.342 |
Tatoeba-test.eng-kaz.eng.kaz | 11.8 | 0.360 |
Tatoeba-test.eng-kir.eng.kir | 24.6 | 0.499 |
Tatoeba-test.eng-kjh.eng.kjh | 2.2 | 0.052 |
Tatoeba-test.eng-kum.eng.kum | 8.0 | 0.229 |
Tatoeba-test.eng-mon.eng.mon | 10.3 | 0.362 |
Tatoeba-test.eng.multi | 19.5 | 0.451 |
Tatoeba-test.eng-nog.eng.nog | 1.5 | 0.117 |
Tatoeba-test.eng-ota.eng.ota | 0.2 | 0.035 |
Tatoeba-test.eng-sah.eng.sah | 0.7 | 0.080 |
Tatoeba-test.eng-tat.eng.tat | 10.8 | 0.320 |
Tatoeba-test.eng-tuk.eng.tuk | 5.6 | 0.323 |
Tatoeba-test.eng-tur.eng.tur | 34.2 | 0.623 |
Tatoeba-test.eng-tyv.eng.tyv | 8.1 | 0.192 |
Tatoeba-test.eng-uig.eng.uig | 0.1 | 0.158 |
Tatoeba-test.eng-uzb.eng.uzb | 4.2 | 0.298 |
Tatoeba-test.eng-xal.eng.xal | 0.1 | 0.061 |
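
Every release listed here requires a sentence initial `>>id<<` token naming the target language. The snippet below is a minimal sketch of how an input line might be prepared, not part of the released tooling: the helper name `add_language_token` is hypothetical, the ID set is copied from the target language(s) line, and the single space between the token and the sentence is an assumption. Whether the token is added before or after SentencePiece segmentation depends on the toolchain and is not stated here.

```python
# Target language IDs copied from the "target language(s)" line of this README.
TARGET_IDS = frozenset(
    "aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum mon nog "
    "ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv "
    "uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn xal".split()
)


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix an English source sentence with the >>id<< target token.

    Hypothetical helper: the name and the separating space are assumptions.
    """
    if target_id not in TARGET_IDS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    return f">>{target_id}<< {sentence.strip()}"


if __name__ == "__main__":
    # Request Turkish output for one English sentence.
    print(add_language_token("How are you?", "tur"))
```

Rejecting unknown IDs up front is a design choice for this sketch; how the model itself behaves when given an ID outside the list is not documented in this README.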