opus-2020-06-28.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-06-28.zip
  • test set translations: opus-2020-06-28.test.txt
  • test set scores: opus-2020-06-28.eval.txt
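
The language-token requirement above can be illustrated with a minimal sketch. The helper name `add_language_token` is hypothetical and not part of the released package. The single-space separator and the choice to add the token to raw text (rather than after SentencePiece segmentation) are assumptions. The set of IDs is copied from the target-language list of this release; later releases list more.

```python
# Target language IDs as listed for the opus-2020-06-28 release.
TARGET_LANGS = {"dan", "fao", "isl", "non_Latn", "swe"}


def add_language_token(sentence: str, target_lang: str) -> str:
    """Prepend the >>id<< token that selects the target language.

    Hypothetical helper: the space separator is an assumption.
    """
    if target_lang not in TARGET_LANGS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    return f">>{target_lang}<< {sentence.strip()}"


if __name__ == "__main__":
    # One source sentence, routed to two different target languages.
    for lang in ("dan", "swe"):
        print(add_language_token("How are you?", lang))
```

Rejecting unknown IDs up front is a design choice for the sketch; how the model itself treats a missing or invalid token is not stated here.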

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-dan.eng.dan | 57.2 | 0.720 |
| Tatoeba-test.eng-fao.eng.fao | 8.2 | 0.314 |
| Tatoeba-test.eng-isl.eng.isl | 23.1 | 0.500 |
| Tatoeba-test.eng.multi | 52.0 | 0.681 |
| Tatoeba-test.eng-non.eng.non | 0.7 | 0.193 |
| Tatoeba-test.eng-swe.eng.swe | 57.4 | 0.713 |

opus-2020-07-06.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl nno nob nob_Hebr non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-07-06.zip
  • test set translations: opus-2020-07-06.test.txt
  • test set scores: opus-2020-07-06.eval.txt

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-dan.eng.dan | 57.0 | 0.719 |
| Tatoeba-test.eng-fao.eng.fao | 6.9 | 0.300 |
| Tatoeba-test.eng-isl.eng.isl | 22.6 | 0.500 |
| Tatoeba-test.eng.multi | 52.6 | 0.684 |
| Tatoeba-test.eng-non.eng.non | 1.9 | 0.189 |
| Tatoeba-test.eng-nor.eng.nor | 11.9 | 0.388 |
| Tatoeba-test.eng-swe.eng.swe | 57.4 | 0.714 |

opus-2020-07-26.zip

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl nno nob nob_Hebr non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus-2020-07-26.zip
  • test set translations: opus-2020-07-26.test.txt
  • test set scores: opus-2020-07-26.eval.txt

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-dan.eng.dan | 57.0 | 0.719 |
| Tatoeba-test.eng-fao.eng.fao | 7.0 | 0.311 |
| Tatoeba-test.eng-isl.eng.isl | 23.3 | 0.500 |
| Tatoeba-test.eng.multi | 52.3 | 0.683 |
| Tatoeba-test.eng-non.eng.non | 0.7 | 0.196 |
| Tatoeba-test.eng-nor.eng.nor | 49.6 | 0.671 |
| Tatoeba-test.eng-swe.eng.swe | 56.9 | 0.711 |

opus2m-2020-08-01.zip

  • dataset: opus2m
  • model: transformer
  • source language(s): eng
  • target language(s): dan fao isl nno nob nob_Hebr non_Latn swe
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required (id = a valid target language ID)
  • download: opus2m-2020-08-01.zip
  • test set translations: opus2m-2020-08-01.test.txt
  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

| testset | BLEU | chr-F |
|---|---|---|
| Tatoeba-test.eng-dan.eng.dan | 57.7 | 0.724 |
| Tatoeba-test.eng-fao.eng.fao | 9.2 | 0.322 |
| Tatoeba-test.eng-isl.eng.isl | 23.8 | 0.506 |
| Tatoeba-test.eng.multi | 52.8 | 0.688 |
| Tatoeba-test.eng-non.eng.non | 0.7 | 0.196 |
| Tatoeba-test.eng-nor.eng.nor | 50.3 | 0.678 |
| Tatoeba-test.eng-swe.eng.swe | 57.8 | 0.717 |