eng-trk (English → Turkic languages)

opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt
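
Because the model is multilingual on the target side, every source sentence must begin with a >>id<< token that selects the output language. A minimal sketch of that tagging step (the helper name is illustrative; the ID set is the target-language list above):

```python
# Valid target-language IDs for this model, per the model card.
VALID_TARGETS = {
    "aze_Latn", "bak", "chv", "crh", "crh_Latn", "kaz_Cyrl", "kaz_Latn",
    "kir_Cyrl", "kjh", "kum", "ota_Arab", "ota_Latn", "sah", "tat",
    "tat_Arab", "tat_Latn", "tuk", "tuk_Latn", "tur", "tyv", "uig_Arab",
    "uig_Cyrl", "uzb_Cyrl", "uzb_Latn",
}

def tag_source(sentence: str, target_id: str) -> str:
    """Prepend the sentence-initial >>id<< token the model requires."""
    if target_id not in VALID_TARGETS:
        raise ValueError(f"unknown target language ID: {target_id}")
    return f">>{target_id}<< {sentence}"

print(tag_source("How are you?", "tur"))  # → >>tur<< How are you?
```

The tagged strings are what gets fed into the released SentencePiece + Marian pipeline (or a wrapper such as the Hugging Face Marian tokenizer) for translation.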

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-aze.eng.aze 26.4 0.563
Tatoeba-test.eng-bak.eng.bak 4.6 0.254
Tatoeba-test.eng-chv.eng.chv 3.8 0.271
Tatoeba-test.eng-crh.eng.crh 9.5 0.327
Tatoeba-test.eng-kaz.eng.kaz 10.8 0.350
Tatoeba-test.eng-kir.eng.kir 25.8 0.483
Tatoeba-test.eng-kjh.eng.kjh 1.9 0.034
Tatoeba-test.eng-kum.eng.kum 3.2 0.051
Tatoeba-test.eng.multi 18.5 0.443
Tatoeba-test.eng-ota.eng.ota 0.5 0.061
Tatoeba-test.eng-sah.eng.sah 0.8 0.026
Tatoeba-test.eng-tat.eng.tat 9.4 0.292
Tatoeba-test.eng-tuk.eng.tuk 5.2 0.311
Tatoeba-test.eng-tur.eng.tur 32.2 0.605
Tatoeba-test.eng-tyv.eng.tyv 7.6 0.185
Tatoeba-test.eng-uig.eng.uig 0.1 0.147
Tatoeba-test.eng-uzb.eng.uzb 2.2 0.253
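
The chr-F column is the character n-gram F-score (chrF). As a rough single-sentence illustration of the metric (β = 2, n-grams up to 6, whitespace removed; the published scores come from the standard evaluation tooling, not from this sketch):

```python
from collections import Counter

def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified single-sentence chrF: character n-gram precision/recall
    averaged over n = 1..max_n, combined into an F-score with weight beta."""
    hyp = hypothesis.replace(" ", "")
    ref = reference.replace(" ", "")
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(hyp[i:i + n] for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(ref[i:i + n] for i in range(len(ref) - n + 1))
        matches = sum((hyp_ngrams & ref_ngrams).values())  # clipped overlap
        if hyp_ngrams:
            precisions.append(matches / sum(hyp_ngrams.values()))
        if ref_ngrams:
            recalls.append(matches / sum(ref_ngrams.values()))
    p = sum(precisions) / len(precisions) if precisions else 0.0
    r = sum(recalls) / len(recalls) if recalls else 0.0
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

An identical hypothesis and reference score 1.0; disjoint strings score 0.0. Corpus-level chrF additionally aggregates n-gram counts over all sentences.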

opus-2020-07-14.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-14.zip
  • test set translations: opus-2020-07-14.test.txt
  • test set scores: opus-2020-07-14.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-aze.eng.aze 25.7 0.560
Tatoeba-test.eng-bak.eng.bak 5.2 0.267
Tatoeba-test.eng-chv.eng.chv 3.7 0.264
Tatoeba-test.eng-crh.eng.crh 7.4 0.301
Tatoeba-test.eng-kaz.eng.kaz 11.4 0.353
Tatoeba-test.eng-kir.eng.kir 25.4 0.496
Tatoeba-test.eng-kjh.eng.kjh 1.3 0.035
Tatoeba-test.eng-kum.eng.kum 2.2 0.046
Tatoeba-test.eng.multi 18.0 0.436
Tatoeba-test.eng-ota.eng.ota 0.2 0.059
Tatoeba-test.eng-sah.eng.sah 0.5 0.021
Tatoeba-test.eng-tat.eng.tat 9.7 0.304
Tatoeba-test.eng-tuk.eng.tuk 5.6 0.305
Tatoeba-test.eng-tur.eng.tur 32.1 0.602
Tatoeba-test.eng-tyv.eng.tyv 4.8 0.224
Tatoeba-test.eng-uig.eng.uig 0.1 0.150
Tatoeba-test.eng-uzb.eng.uzb 3.3 0.264
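
The BLEU column is based on clipped word n-gram precision with a brevity penalty. A simplified sentence-level sketch (geometric mean over n = 1..4; real scores are computed with the standard corpus-level tools):

```python
import math
from collections import Counter

def bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) times a brevity penalty. Returns 0.0 if any
    n-gram order has no matches."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        matches = sum((hyp_ngrams & ref_ngrams).values())  # clipped counts
        if matches == 0:
            return 0.0
        log_precisions.append(math.log(matches / sum(hyp_ngrams.values())))
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)
```

The BLEU values in these tables are reported on a 0–100 scale, i.e. 100 × this fraction, and are aggregated over the whole test set rather than per sentence.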

opus-2020-07-20.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-20.zip
  • test set translations: opus-2020-07-20.test.txt
  • test set scores: opus-2020-07-20.eval.txt

Benchmarks

testset BLEU chr-F
Tatoeba-test.eng-aze.eng.aze 26.4 0.569
Tatoeba-test.eng-bak.eng.bak 7.1 0.309
Tatoeba-test.eng-chv.eng.chv 2.6 0.267
Tatoeba-test.eng-crh.eng.crh 13.9 0.330
Tatoeba-test.eng-kaz.eng.kaz 12.2 0.362
Tatoeba-test.eng-kir.eng.kir 24.5 0.486
Tatoeba-test.eng-kjh.eng.kjh 2.1 0.042
Tatoeba-test.eng-kum.eng.kum 2.6 0.080
Tatoeba-test.eng.multi 18.6 0.445
Tatoeba-test.eng-ota.eng.ota 0.4 0.059
Tatoeba-test.eng-sah.eng.sah 0.6 0.035
Tatoeba-test.eng-tat.eng.tat 9.6 0.309
Tatoeba-test.eng-tuk.eng.tuk 5.3 0.311
Tatoeba-test.eng-tur.eng.tur 32.9 0.611
Tatoeba-test.eng-tyv.eng.tyv 3.4 0.232
Tatoeba-test.eng-uig.eng.uig 0.1 0.154
Tatoeba-test.eng-uzb.eng.uzb 3.1 0.267

opus-2020-07-27.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus-2020-07-27.zip
  • test set translations: opus-2020-07-27.test.txt
  • test set scores: opus-2020-07-27.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2016-entr-engtur.eng.tur 9.5 0.423
newstest2016-entr-engtur.eng.tur 8.0 0.397
newstest2017-entr-engtur.eng.tur 7.8 0.394
newstest2018-entr-engtur.eng.tur 8.2 0.396
Tatoeba-test.eng-aze.eng.aze 26.0 0.568
Tatoeba-test.eng-bak.eng.bak 9.2 0.320
Tatoeba-test.eng-chv.eng.chv 3.9 0.266
Tatoeba-test.eng-crh.eng.crh 7.6 0.347
Tatoeba-test.eng-kaz.eng.kaz 10.4 0.352
Tatoeba-test.eng-kir.eng.kir 26.9 0.508
Tatoeba-test.eng-kjh.eng.kjh 2.0 0.052
Tatoeba-test.eng-kum.eng.kum 2.7 0.073
Tatoeba-test.eng.multi 18.8 0.447
Tatoeba-test.eng-ota.eng.ota 0.4 0.064
Tatoeba-test.eng-sah.eng.sah 0.7 0.028
Tatoeba-test.eng-tat.eng.tat 9.6 0.309
Tatoeba-test.eng-tuk.eng.tuk 5.5 0.309
Tatoeba-test.eng-tur.eng.tur 33.4 0.617
Tatoeba-test.eng-tyv.eng.tyv 3.6 0.125
Tatoeba-test.eng-uig.eng.uig 0.1 0.152
Tatoeba-test.eng-uzb.eng.uzb 3.3 0.268

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = a valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2016-entr-engtur.eng.tur 10.1 0.437
newstest2016-entr-engtur.eng.tur 9.2 0.410
newstest2017-entr-engtur.eng.tur 9.0 0.410
newstest2018-entr-engtur.eng.tur 9.2 0.413
Tatoeba-test.eng-aze.eng.aze 26.8 0.577
Tatoeba-test.eng-bak.eng.bak 7.6 0.308
Tatoeba-test.eng-chv.eng.chv 4.3 0.270
Tatoeba-test.eng-crh.eng.crh 8.1 0.330
Tatoeba-test.eng-kaz.eng.kaz 11.1 0.359
Tatoeba-test.eng-kir.eng.kir 28.6 0.524
Tatoeba-test.eng-kjh.eng.kjh 1.0 0.041
Tatoeba-test.eng-kum.eng.kum 2.2 0.075
Tatoeba-test.eng.multi 19.9 0.455
Tatoeba-test.eng-ota.eng.ota 0.5 0.065
Tatoeba-test.eng-sah.eng.sah 0.7 0.030
Tatoeba-test.eng-tat.eng.tat 9.7 0.316
Tatoeba-test.eng-tuk.eng.tuk 5.9 0.317
Tatoeba-test.eng-tur.eng.tur 34.6 0.623
Tatoeba-test.eng-tyv.eng.tyv 5.4 0.210
Tatoeba-test.eng-uig.eng.uig 0.1 0.155
Tatoeba-test.eng-uzb.eng.uzb 3.4 0.275