opus-2020-07-27.zip

  • dataset: opus
  • model: transformer
  • source language(s): asm hin mar urd zza
  • target language(s): asm hin mar urd zza
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<<, where id is a valid target language ID
  • download: opus-2020-07-27.zip
  • test set translations: opus-2020-07-27.test.txt
  • test set scores: opus-2020-07-27.eval.txt
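
The language-token requirement above can be illustrated with a minimal sketch. The helper name `add_target_token` is hypothetical, and the single space between the token and the sentence is an assumption rather than something this README specifies. The set of IDs is copied from the target language list above.

```python
# Target language IDs as listed in this README.
VALID_TARGETS = {"asm", "hin", "mar", "urd", "zza"}


def add_target_token(sentence: str, target_id: str) -> str:
    """Prefix a source sentence with the >>id<< token for the target language.

    Hypothetical helper: the separating space is an assumption.
    """
    if target_id not in VALID_TARGETS:
        raise ValueError(f"unsupported target language ID: {target_id!r}")
    return f">>{target_id}<< {sentence.strip()}"


# Placeholder input; a real input would be a sentence in one of the source languages.
print(add_target_token("source sentence here", "mar"))
```

Rejecting unknown IDs up front is a design choice for this sketch: a sentence tagged with an ID outside the list above has no defined target language for this model.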

Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.asm-hin.asm.hin | 3.5 | 0.202 |
| Tatoeba-test.asm-zza.asm.zza | 12.4 | 0.014 |
| Tatoeba-test.hin-asm.hin.asm | 6.2 | 0.238 |
| Tatoeba-test.hin-mar.hin.mar | 27.0 | 0.560 |
| Tatoeba-test.hin-urd.hin.urd | 21.4 | 0.507 |
| Tatoeba-test.mar-hin.mar.hin | 13.4 | 0.463 |
| Tatoeba-test.multi.multi | 17.7 | 0.460 |
| Tatoeba-test.urd-hin.urd.hin | 13.4 | 0.363 |
| Tatoeba-test.zza-asm.zza.asm | 5.3 | 0.000 |