eng-sem

opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb amh apc apc_Latn ara ara_Latn arq arq_Latn ary arz heb mlt phn_Phnx tir tmr_Hebr
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt
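
The language-token requirement above can be sketched as a small helper that prefixes each source sentence before it is passed to the model. This is only an illustration: the function name `add_language_token` and the `TARGET_LANGS` set are hypothetical, the IDs are copied from the target language list of this release, and the single space between the token and the sentence is an assumption about the expected input format.

```python
# Target language IDs as listed for the opus-2020-06-28 release.
# Later releases list a smaller set, so adjust this per model.
TARGET_LANGS = {
    "acm", "afb", "amh", "apc", "apc_Latn", "ara", "ara_Latn", "arq",
    "arq_Latn", "ary", "arz", "heb", "mlt", "phn_Phnx", "tir", "tmr_Hebr",
}


def add_language_token(sentence: str, target_lang: str) -> str:
    """Prefix a source sentence with the >>id<< target-language token.

    Hypothetical helper: rejects IDs that are not in TARGET_LANGS and
    assumes one space separates the token from the sentence.
    """
    if target_lang not in TARGET_LANGS:
        raise ValueError(f"unsupported target language ID: {target_lang!r}")
    return f">>{target_lang}<< {sentence.strip()}"


if __name__ == "__main__":
    # Request Hebrew output for an English source sentence.
    print(add_language_token("How are you?", "heb"))
```

The prefixed string is what would then go through normalization and SentencePiece segmentation; those steps are not shown here.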

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-amh.eng.amh | 10.8 | 0.499 |
| Tatoeba-test.eng-ara.eng.ara | 12.6 | 0.421 |
| Tatoeba-test.eng-heb.eng.heb | 32.6 | 0.557 |
| Tatoeba-test.eng-mlt.eng.mlt | 17.9 | 0.552 |
| Tatoeba-test.eng.multi | 22.8 | 0.487 |
| Tatoeba-test.eng-phn.eng.phn | 0.5 | 0.003 |
| Tatoeba-test.eng-tir.eng.tir | 2.5 | 0.239 |
| Tatoeba-test.eng-tmr.eng.tmr | 0.8 | 0.003 |

opus-2020-07-06.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb amh apc ara arq ary arz heb mlt tir
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus-2020-07-06.zip
  • test set translations: opus-2020-07-06.test.txt
  • test set scores: opus-2020-07-06.eval.txt

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-amh.eng.amh | 10.7 | 0.465 |
| Tatoeba-test.eng-ara.eng.ara | 11.7 | 0.412 |
| Tatoeba-test.eng-heb.eng.heb | 32.3 | 0.552 |
| Tatoeba-test.eng-mlt.eng.mlt | 17.7 | 0.544 |
| Tatoeba-test.eng.multi | 22.2 | 0.481 |
| Tatoeba-test.eng-tir.eng.tir | 2.6 | 0.236 |

opus-2020-07-27.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb amh apc ara arq ary arz heb mlt tir
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus-2020-07-27.zip
  • test set translations: opus-2020-07-27.test.txt
  • test set scores: opus-2020-07-27.eval.txt

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-amh.eng.amh | 11.0 | 0.504 |
| Tatoeba-test.eng-ara.eng.ara | 12.2 | 0.412 |
| Tatoeba-test.eng-heb.eng.heb | 32.7 | 0.556 |
| Tatoeba-test.eng-mlt.eng.mlt | 17.5 | 0.548 |
| Tatoeba-test.eng.multi | 22.7 | 0.480 |
| Tatoeba-test.eng-tir.eng.tir | 2.4 | 0.240 |

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb amh apc ara arq ary arz heb mlt tir
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form of >>id<< (id = valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-amh.eng.amh | 11.2 | 0.480 |
| Tatoeba-test.eng-ara.eng.ara | 12.7 | 0.417 |
| Tatoeba-test.eng-heb.eng.heb | 33.8 | 0.564 |
| Tatoeba-test.eng-mlt.eng.mlt | 18.7 | 0.554 |
| Tatoeba-test.eng.multi | 23.5 | 0.486 |
| Tatoeba-test.eng-tir.eng.tir | 2.7 | 0.248 |