- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom sin snd_Arab urd
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-06-28.zip
- test set translations: opus-2020-06-28.test.txt
- test set scores: opus-2020-06-28.eval.txt
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-asm.eng.asm | 3.0 | 0.245 |
Tatoeba-test.eng-awa.eng.awa | 0.4 | 0.098 |
Tatoeba-test.eng-ben.eng.ben | 16.5 | 0.481 |
Tatoeba-test.eng-bho.eng.bho | 0.8 | 0.110 |
Tatoeba-test.eng-guj.eng.guj | 19.9 | 0.393 |
Tatoeba-test.eng-hif.eng.hif | 0.5 | 0.022 |
Tatoeba-test.eng-hin.eng.hin | 17.4 | 0.463 |
Tatoeba-test.eng-kok.eng.kok | 8.1 | 0.006 |
Tatoeba-test.eng-lah.eng.lah | 0.2 | 0.001 |
Tatoeba-test.eng-mai.eng.mai | 7.6 | 0.374 |
Tatoeba-test.eng-mar.eng.mar | 20.4 | 0.464 |
Tatoeba-test.eng.multi | 17.0 | 0.442 |
Tatoeba-test.eng-nep.eng.nep | 1.0 | 0.102 |
Tatoeba-test.eng-ori.eng.ori | 2.2 | 0.198 |
Tatoeba-test.eng-pan.eng.pan | 8.4 | 0.343 |
Tatoeba-test.eng-rom.eng.rom | 0.3 | 0.185 |
Tatoeba-test.eng-sin.eng.sin | 9.5 | 0.368 |
Tatoeba-test.eng-snd.eng.snd | 6.8 | 0.343 |
Tatoeba-test.eng-urd.eng.urd | 12.5 | 0.414 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-06.zip
- test set translations: opus-2020-07-06.test.txt
- test set scores: opus-2020-07-06.eval.txt
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.eng-asm.eng.asm | 3.6 | 0.277 |
Tatoeba-test.eng-awa.eng.awa | 0.4 | 0.144 |
Tatoeba-test.eng-ben.eng.ben | 15.9 | 0.466 |
Tatoeba-test.eng-bho.eng.bho | 0.6 | 0.152 |
Tatoeba-test.eng-guj.eng.guj | 20.9 | 0.380 |
Tatoeba-test.eng-hif.eng.hif | 0.6 | 0.032 |
Tatoeba-test.eng-hin.eng.hin | 17.2 | 0.461 |
Tatoeba-test.eng-kok.eng.kok | 3.3 | 0.022 |
Tatoeba-test.eng-lah.eng.lah | 0.3 | 0.007 |
Tatoeba-test.eng-mai.eng.mai | 8.9 | 0.392 |
Tatoeba-test.eng-mar.eng.mar | 20.1 | 0.463 |
Tatoeba-test.eng.multi | 16.8 | 0.439 |
Tatoeba-test.eng-nep.eng.nep | 0.6 | 0.058 |
Tatoeba-test.eng-ori.eng.ori | 2.2 | 0.187 |
Tatoeba-test.eng-pan.eng.pan | 9.6 | 0.351 |
Tatoeba-test.eng-rom.eng.rom | 0.4 | 0.188 |
Tatoeba-test.eng-san.eng.san | 1.5 | 0.111 |
Tatoeba-test.eng-sin.eng.sin | 9.1 | 0.370 |
Tatoeba-test.eng-snd.eng.snd | 1.9 | 0.235 |
Tatoeba-test.eng-urd.eng.urd | 12.7 | 0.412 |
- dataset: opus
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-26.zip
- test set translations: opus-2020-07-26.test.txt
- test set scores: opus-2020-07-26.eval.txt
testset | BLEU | chr-F |
---|---|---|
newsdev2014-enghin.eng.hin | 7.5 | 0.337 |
newsdev2019-engu-engguj.eng.guj | 6.3 | 0.282 |
newstest2014-hien-enghin.eng.hin | 11.0 | 0.358 |
newstest2019-engu-engguj.eng.guj | 7.1 | 0.291 |
Tatoeba-test.eng-asm.eng.asm | 3.7 | 0.260 |
Tatoeba-test.eng-awa.eng.awa | 0.4 | 0.144 |
Tatoeba-test.eng-ben.eng.ben | 16.0 | 0.466 |
Tatoeba-test.eng-bho.eng.bho | 0.6 | 0.143 |
Tatoeba-test.eng-guj.eng.guj | 20.2 | 0.375 |
Tatoeba-test.eng-hif.eng.hif | 0.5 | 0.040 |
Tatoeba-test.eng-hin.eng.hin | 17.3 | 0.462 |
Tatoeba-test.eng-kok.eng.kok | 3.3 | 0.044 |
Tatoeba-test.eng-lah.eng.lah | 0.2 | 0.005 |
Tatoeba-test.eng-mai.eng.mai | 9.3 | 0.385 |
Tatoeba-test.eng-mar.eng.mar | 19.9 | 0.461 |
Tatoeba-test.eng.multi | 16.6 | 0.436 |
Tatoeba-test.eng-nep.eng.nep | 0.7 | 0.067 |
Tatoeba-test.eng-ori.eng.ori | 2.2 | 0.196 |
Tatoeba-test.eng-pan.eng.pan | 7.0 | 0.342 |
Tatoeba-test.eng-rom.eng.rom | 0.4 | 0.187 |
Tatoeba-test.eng-san.eng.san | 1.7 | 0.109 |
Tatoeba-test.eng-sin.eng.sin | 9.1 | 0.365 |
Tatoeba-test.eng-snd.eng.snd | 5.6 | 0.343 |
Tatoeba-test.eng-urd.eng.urd | 12.9 | 0.411 |
- dataset: opus2m
- model: transformer
- source language(s): eng
- target language(s): asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb rom san_Deva sin snd_Arab urd
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus2m-2020-08-01.zip
- test set translations: opus2m-2020-08-01.test.txt
- test set scores: opus2m-2020-08-01.eval.txt
testset | BLEU | chr-F |
---|---|---|
newsdev2014-enghin.eng.hin | 8.2 | 0.342 |
newsdev2019-engu-engguj.eng.guj | 6.5 | 0.293 |
newstest2014-hien-enghin.eng.hin | 11.4 | 0.364 |
newstest2019-engu-engguj.eng.guj | 7.2 | 0.296 |
Tatoeba-test.eng-asm.eng.asm | 2.7 | 0.277 |
Tatoeba-test.eng-awa.eng.awa | 0.5 | 0.132 |
Tatoeba-test.eng-ben.eng.ben | 16.7 | 0.470 |
Tatoeba-test.eng-bho.eng.bho | 4.3 | 0.227 |
Tatoeba-test.eng-guj.eng.guj | 17.5 | 0.373 |
Tatoeba-test.eng-hif.eng.hif | 0.6 | 0.028 |
Tatoeba-test.eng-hin.eng.hin | 17.7 | 0.469 |
Tatoeba-test.eng-kok.eng.kok | 1.7 | 0.000 |
Tatoeba-test.eng-lah.eng.lah | 0.3 | 0.028 |
Tatoeba-test.eng-mai.eng.mai | 15.6 | 0.429 |
Tatoeba-test.eng-mar.eng.mar | 21.3 | 0.477 |
Tatoeba-test.eng.multi | 17.3 | 0.448 |
Tatoeba-test.eng-nep.eng.nep | 0.8 | 0.081 |
Tatoeba-test.eng-ori.eng.ori | 2.2 | 0.208 |
Tatoeba-test.eng-pan.eng.pan | 8.0 | 0.347 |
Tatoeba-test.eng-rom.eng.rom | 0.4 | 0.197 |
Tatoeba-test.eng-san.eng.san | 0.5 | 0.108 |
Tatoeba-test.eng-sin.eng.sin | 9.1 | 0.364 |
Tatoeba-test.eng-snd.eng.snd | 4.4 | 0.284 |
Tatoeba-test.eng-urd.eng.urd | 13.3 | 0.423 |
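
The sentence-initial language token that every release above requires can be added with a small helper before the text is passed to the model. The following is a minimal sketch, not part of the released tooling: the function name `add_language_token` and the constant `TARGET_LANGUAGE_IDS` are hypothetical, the ID list is copied from the 2020-08-01 release, and the single space between the token and the sentence is an assumption about the expected input format.

```python
# Target language IDs copied from the opus2m-2020-08-01 release listed above.
TARGET_LANGUAGE_IDS = frozenset(
    "asm awa ben bho gom guj hif_Latn hin mai mar npi ori pan_Guru pnb "
    "rom san_Deva sin snd_Arab urd".split()
)


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix an English source sentence with the >>id<< token.

    Hypothetical helper: it only builds the input string and does not
    call the translation model itself.
    """
    if target_id not in TARGET_LANGUAGE_IDS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    # Assumption: one space separates the token from the sentence.
    return f">>{target_id}<< {sentence.strip()}"


if __name__ == "__main__":
    # Request Hindi output for an English source sentence.
    print(add_language_token("How are you?", "hin"))
```

Rejecting unknown IDs up front is a design choice in this sketch: a misspelled token would otherwise reach the model as ordinary text.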