- dataset: opus
- model: transformer
- source language(s): asm hin mar urd
- target language(s): asm hin mar urd
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-07-27.zip
- test set translations: opus-2020-07-27.test.txt
- test set scores: opus-2020-07-27.eval.txt
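
The language-token requirement above can be sketched as a small helper. This is a minimal illustration, assuming plain Python strings as input; the function name `add_target_token` and the `TARGET_LANGS` set (copied from the target language list of this release) are hypothetical and not part of the released tooling.

```python
# Hypothetical helper: prepend the >>id<< target-language token to each sentence.
TARGET_LANGS = {"asm", "hin", "mar", "urd"}  # target language(s) listed above


def add_target_token(sentences, target_id):
    """Return the sentences prefixed with the sentence-initial token >>target_id<<."""
    if target_id not in TARGET_LANGS:
        raise ValueError(f"unknown target language ID: {target_id!r}")
    return [f">>{target_id}<< {s.strip()}" for s in sentences]


if __name__ == "__main__":
    for line in add_target_token(["This is a test."], "hin"):
        print(line)
```

The token is added to the raw source text, before any SentencePiece segmentation is applied.
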
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.asm-hin.asm.hin | 2.6 | 0.231 |
Tatoeba-test.hin-asm.hin.asm | 9.1 | 0.262 |
Tatoeba-test.hin-mar.hin.mar | 28.1 | 0.548 |
Tatoeba-test.hin-urd.hin.urd | 19.9 | 0.508 |
Tatoeba-test.mar-hin.mar.hin | 11.6 | 0.466 |
Tatoeba-test.multi.multi | 17.1 | 0.464 |
Tatoeba-test.urd-hin.urd.hin | 13.5 | 0.377 |
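
Because the score table above is pipe-delimited, it can be loaded with a few lines of Python for comparison across test sets. The following is only a sketch: `parse_scores` is a hypothetical name, and `SAMPLE` holds two rows copied from the table above.

```python
def parse_scores(table_text):
    """Parse 'testset | BLEU | chr-F |' rows into a list of dicts."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header row and the ---|---|--- separator row.
        if cells[0] == "testset" or set(cells[0]) <= {"-", ":"}:
            continue
        rows.append(
            {"testset": cells[0], "bleu": float(cells[1]), "chrf": float(cells[2])}
        )
    return rows


SAMPLE = """
testset | BLEU | chr-F |
---|---|---|
Tatoeba-test.hin-mar.hin.mar | 28.1 | 0.548 |
Tatoeba-test.hin-urd.hin.urd | 19.9 | 0.508 |
"""

if __name__ == "__main__":
    best = max(parse_scores(SAMPLE), key=lambda r: r["bleu"])
    print(best["testset"], best["bleu"])
```
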
- dataset: opus
- model: transformer
- source language(s): asm awa ben ben_Cyrl ben_Deva ben_Gujr bho dty eng gom guj hif_Latn hin mai mar nep npi ori pan pan_Guru pnb pnb_Guru rmn rmy rom san san_Deva sin snd_Arab urd urd_Deva
- target language(s): asm awa ben ben_Cyrl ben_Deva ben_Gujr bho dty eng gom guj hif_Latn hin mai mar nep npi ori pan pan_Guru pnb pnb_Guru rmn rmy rom san san_Deva sin snd_Arab urd urd_Deva
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-09-26.zip
- test set translations: opus-2020-09-26.test.txt
- test set scores: opus-2020-09-26.eval.txt
testset | BLEU | chr-F |
---|---|---|
newsdev2014-enghin.eng.hin | 6.6 | 0.317 |
newsdev2014-hineng.hin.eng | 9.4 | 0.359 |
newsdev2019-engu-engguj.eng.guj | 5.4 | 0.277 |
newsdev2019-engu-gujeng.guj.eng | 10.3 | 0.336 |
newstest2014-hien-enghin.eng.hin | 9.6 | 0.333 |
newstest2014-hien-hineng.hin.eng | 12.8 | 0.406 |
newstest2019-engu-engguj.eng.guj | 6.3 | 0.284 |
newstest2019-guen-gujeng.guj.eng | 8.0 | 0.314 |
Tatoeba-test.asm-eng.asm.eng | 20.5 | 0.394 |
Tatoeba-test.asm-hin.asm.hin | 8.4 | 0.459 |
Tatoeba-test.awa-eng.awa.eng | 15.5 | 0.292 |
Tatoeba-test.ben-eng.ben.eng | 38.5 | 0.547 |
Tatoeba-test.bho-eng.bho.eng | 30.1 | 0.475 |
Tatoeba-test.eng-asm.eng.asm | 1.9 | 0.255 |
Tatoeba-test.eng-awa.eng.awa | 0.3 | 0.019 |
Tatoeba-test.eng-ben.eng.ben | 13.6 | 0.431 |
Tatoeba-test.eng-bho.eng.bho | 1.1 | 0.061 |
Tatoeba-test.eng-guj.eng.guj | 16.2 | 0.363 |
Tatoeba-test.eng-hif.eng.hif | 1.6 | 0.281 |
Tatoeba-test.eng-hin.eng.hin | 16.6 | 0.450 |
Tatoeba-test.eng-kok.eng.kok | 2.1 | 0.004 |
Tatoeba-test.eng-lah.eng.lah | 0.3 | 0.001 |
Tatoeba-test.eng-mai.eng.mai | 9.3 | 0.412 |
Tatoeba-test.eng-mar.eng.mar | 19.5 | 0.460 |
Tatoeba-test.eng-nep.eng.nep | 0.2 | 0.010 |
Tatoeba-test.eng-ori.eng.ori | 2.4 | 0.225 |
Tatoeba-test.eng-pan.eng.pan | 8.1 | 0.325 |
Tatoeba-test.eng-rom.eng.rom | 2.6 | 0.251 |
Tatoeba-test.eng-san.eng.san | 1.3 | 0.124 |
Tatoeba-test.eng-sin.eng.sin | 9.2 | 0.343 |
Tatoeba-test.eng-snd.eng.snd | 9.1 | 0.377 |
Tatoeba-test.eng-urd.eng.urd | 11.4 | 0.395 |
Tatoeba-test.guj-eng.guj.eng | 16.2 | 0.338 |
Tatoeba-test.hif-eng.hif.eng | 4.8 | 0.308 |
Tatoeba-test.hin-asm.hin.asm | 24.5 | 0.454 |
Tatoeba-test.hin-eng.hin.eng | 37.0 | 0.552 |
Tatoeba-test.hin-mar.hin.mar | 30.8 | 0.596 |
Tatoeba-test.hin-urd.hin.urd | 22.2 | 0.541 |
Tatoeba-test.kok-eng.kok.eng | 3.7 | 0.176 |
Tatoeba-test.lah-eng.lah.eng | 18.4 | 0.295 |
Tatoeba-test.mai-eng.mai.eng | 66.2 | 0.727 |
Tatoeba-test.mar-eng.mar.eng | 31.7 | 0.541 |
Tatoeba-test.mar-hin.mar.hin | 16.1 | 0.540 |
Tatoeba-test.multi.multi | 24.1 | 0.470 |
Tatoeba-test.nep-eng.nep.eng | 20.9 | 0.402 |
Tatoeba-test.ori-eng.ori.eng | 7.9 | 0.263 |
Tatoeba-test.pan-eng.pan.eng | 18.3 | 0.372 |
Tatoeba-test.rom-eng.rom.eng | 6.2 | 0.242 |
Tatoeba-test.san-eng.san.eng | 5.2 | 0.184 |
Tatoeba-test.sin-eng.sin.eng | 24.2 | 0.469 |
Tatoeba-test.snd-eng.snd.eng | 31.2 | 0.454 |
Tatoeba-test.urd-eng.urd.eng | 25.0 | 0.454 |
Tatoeba-test.urd-hin.urd.hin | 24.2 | 0.503 |
- dataset: opus
- model: transformer
- source language(s): asm awa ben ben_Cyrl ben_Deva ben_Gujr bho dty eng gom guj hif_Latn hin mai mar nep npi ori pan pan_Guru pnb pnb_Guru rmn rmy rom san san_Deva sin snd_Arab urd urd_Deva
- target language(s): asm awa ben ben_Cyrl ben_Deva ben_Gujr bho dty eng gom guj hif_Latn hin mai mar nep npi ori pan pan_Guru pnb pnb_Guru rmn rmy rom san san_Deva sin snd_Arab urd urd_Deva
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- download: opus-2020-10-04.zip
- test set translations: opus-2020-10-04.test.txt
- test set scores: opus-2020-10-04.eval.txt
testset | BLEU | chr-F |
---|---|---|
newsdev2014-enghin.eng.hin | 6.8 | 0.325 |
newsdev2014-hineng.hin.eng | 9.7 | 0.360 |
newsdev2019-engu-engguj.eng.guj | 5.5 | 0.279 |
newsdev2019-engu-gujeng.guj.eng | 11.0 | 0.346 |
newstest2014-hien-enghin.eng.hin | 9.7 | 0.338 |
newstest2014-hien-hineng.hin.eng | 13.3 | 0.413 |
newstest2019-engu-engguj.eng.guj | 6.1 | 0.286 |
newstest2019-guen-gujeng.guj.eng | 8.4 | 0.316 |
Tatoeba-test.asm-eng.asm.eng | 24.4 | 0.418 |
Tatoeba-test.asm-hin.asm.hin | 26.5 | 0.607 |
Tatoeba-test.awa-eng.awa.eng | 15.9 | 0.300 |
Tatoeba-test.ben-eng.ben.eng | 39.8 | 0.554 |
Tatoeba-test.bho-eng.bho.eng | 30.9 | 0.482 |
Tatoeba-test.eng-asm.eng.asm | 2.4 | 0.268 |
Tatoeba-test.eng-awa.eng.awa | 0.3 | 0.029 |
Tatoeba-test.eng-ben.eng.ben | 13.4 | 0.432 |
Tatoeba-test.eng-bho.eng.bho | 1.1 | 0.074 |
Tatoeba-test.eng-guj.eng.guj | 18.8 | 0.378 |
Tatoeba-test.eng-hif.eng.hif | 1.5 | 0.277 |
Tatoeba-test.eng-hin.eng.hin | 16.8 | 0.454 |
Tatoeba-test.eng-kok.eng.kok | 4.2 | 0.005 |
Tatoeba-test.eng-lah.eng.lah | 0.3 | 0.001 |
Tatoeba-test.eng-mai.eng.mai | 15.9 | 0.561 |
Tatoeba-test.eng-mar.eng.mar | 20.5 | 0.469 |
Tatoeba-test.eng-nep.eng.nep | 0.7 | 0.016 |
Tatoeba-test.eng-ori.eng.ori | 1.6 | 0.239 |
Tatoeba-test.eng-pan.eng.pan | 7.5 | 0.321 |
Tatoeba-test.eng-rom.eng.rom | 2.6 | 0.255 |
Tatoeba-test.eng-san.eng.san | 2.2 | 0.128 |
Tatoeba-test.eng-sin.eng.sin | 9.2 | 0.356 |
Tatoeba-test.eng-snd.eng.snd | 3.8 | 0.301 |
Tatoeba-test.eng-urd.eng.urd | 11.7 | 0.399 |
Tatoeba-test.guj-eng.guj.eng | 19.4 | 0.365 |
Tatoeba-test.hif-eng.hif.eng | 4.1 | 0.310 |
Tatoeba-test.hin-asm.hin.asm | 8.9 | 0.387 |
Tatoeba-test.hin-eng.hin.eng | 37.8 | 0.559 |
Tatoeba-test.hin-mar.hin.mar | 32.9 | 0.599 |
Tatoeba-test.hin-urd.hin.urd | 21.8 | 0.534 |
Tatoeba-test.kok-eng.kok.eng | 4.0 | 0.240 |
Tatoeba-test.lah-eng.lah.eng | 18.2 | 0.306 |
Tatoeba-test.mai-eng.mai.eng | 66.2 | 0.724 |
Tatoeba-test.mar-eng.mar.eng | 33.6 | 0.552 |
Tatoeba-test.mar-hin.mar.hin | 15.6 | 0.520 |
Tatoeba-test.multi.multi | 25.1 | 0.478 |
Tatoeba-test.nep-eng.nep.eng | 24.6 | 0.433 |
Tatoeba-test.ori-eng.ori.eng | 6.4 | 0.246 |
Tatoeba-test.pan-eng.pan.eng | 18.4 | 0.376 |
Tatoeba-test.rom-eng.rom.eng | 6.0 | 0.237 |
Tatoeba-test.san-eng.san.eng | 4.2 | 0.182 |
Tatoeba-test.sin-eng.sin.eng | 22.1 | 0.475 |
Tatoeba-test.snd-eng.snd.eng | 34.3 | 0.449 |
Tatoeba-test.urd-eng.urd.eng | 25.9 | 0.462 |
Tatoeba-test.urd-hin.urd.hin | 25.6 | 0.520 |
- dataset: opus
- model: transformer
- source language(s): asm hin mar urd
- target language(s): asm hin mar urd
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID)
- valid language labels: `>>eng<<` `>>hin<<` `>>urd<<` `>>mar<<` `>>asm<<` `>>guj<<` `>>ori<<` `>>pan_Guru<<` `>>sin<<` `>>mai<<` `>>nep<<` `>>ben<<` `>>snd_Arab<<` `>>rom<<` `>>rmn<<` `>>san_Deva<<` `>>san<<` `>>ben_Cyrl<<` `>>dty<<` `>>rmy<<` `>>ben_Deva<<` `>>ben_Gujr<<`
- download: opus-2021-02-24.zip
- test set translations: opus-2021-02-24.test.txt
- test set scores: opus-2021-02-24.eval.txt
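
The valid language labels listed for this release can be turned into a lookup set, which makes it easy to check a target ID before building the input. This is a sketch only: `LABELS_LINE` is copied from the list above, and `valid_ids` is a hypothetical helper name.

```python
import re

# Labels copied verbatim from the "valid language labels" entry above.
LABELS_LINE = (
    ">>eng<< >>hin<< >>urd<< >>mar<< >>asm<< >>guj<< >>ori<< >>pan_Guru<< "
    ">>sin<< >>mai<< >>nep<< >>ben<< >>snd_Arab<< >>rom<< >>rmn<< >>san_Deva<< "
    ">>san<< >>ben_Cyrl<< >>dty<< >>rmy<< >>ben_Deva<< >>ben_Gujr<<"
)


def valid_ids(labels_line):
    """Extract the bare language IDs from a string of >>id<< labels."""
    return set(re.findall(r">>([A-Za-z_]+)<<", labels_line))


if __name__ == "__main__":
    ids = valid_ids(LABELS_LINE)
    print("pan_Guru" in ids, "pan" in ids)
```

Note that script-tagged IDs such as `pan_Guru` are distinct labels; the bare `pan` does not appear in this release's label list.
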
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newsdev2014.eng-hin | 6.8 | 0.325 | 520 | 9538 | 1.000 |
newsdev2014.hin-eng | 9.7 | 0.360 | 520 | 10406 | 0.884 |
newsdev2019-engu.eng-guj | 5.5 | 0.279 | 1998 | 39137 | 0.766 |
newsdev2019-engu.guj-eng | 11.0 | 0.346 | 1998 | 41862 | 1.000 |
newstest2014-hien.eng-hin | 9.7 | 0.338 | 2507 | 60878 | 0.957 |
newstest2014-hien.hin-eng | 13.3 | 0.413 | 2507 | 55571 | 0.958 |
newstest2019-engu.eng-guj | 6.1 | 0.286 | 998 | 21927 | 0.760 |
newstest2019-guen.guj-eng | 8.4 | 0.316 | 1016 | 17778 | 1.000 |
Tatoeba-test.asm-hin | 26.5 | 0.607 | 4 | 16 | 1.000 |
Tatoeba-test.hin-asm | 8.9 | 0.387 | 4 | 14 | 1.000 |
Tatoeba-test.hin-mar | 32.9 | 0.599 | 158 | 866 | 1.000 |
Tatoeba-test.hin-urd | 21.8 | 0.534 | 239 | 1618 | 1.000 |
Tatoeba-test.mar-hin | 15.6 | 0.518 | 158 | 890 | 1.000 |
Tatoeba-test.multi-multi | 23.3 | 0.538 | 802 | 4985 | 1.000 |
Tatoeba-test.urd-hin | 25.6 | 0.520 | 239 | 1581 | 1.000 |
tico19-test.eng-ben | 4.6 | 0.312 | 2100 | 51751 | 0.757 |
tico19-test.eng-hin | 15.0 | 0.373 | 2100 | 62738 | 0.909 |
tico19-test.eng-mar | 4.3 | 0.277 | 2100 | 50881 | 0.713 |
tico19-test.eng-nep | 5.8 | 0.338 | 2100 | 48706 | 0.777 |
tico19-test.eng-urd | 7.9 | 0.317 | 2100 | 65363 | 0.802 |
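
The BP column in the table above appears to be the BLEU brevity penalty. In the standard BLEU definition it is 1 when the system output is at least as long as the reference, and exp(1 - r/c) otherwise, where r is the reference length and c is the hypothesis length. A minimal sketch of that formula follows; `brevity_penalty` is a hypothetical name, and the lengths in the usage line are made-up illustration values, not figures from the eval files.

```python
import math


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty for hypothesis length c and reference length r."""
    if hyp_len <= 0:
        return 0.0  # empty output: treat the penalty as zero
    if hyp_len >= ref_len:
        return 1.0  # output is not shorter than the reference: no penalty
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # Illustrative lengths only (assumed, not from the test sets above).
    print(round(brevity_penalty(80, 100), 3))
```

A BP well below 1.000, as in several eng-to-X rows, indicates output that is shorter than the reference overall.
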