v1.2.0
(see also the mailing list announcement)
This release comes courtesy of Nynorsk pressekontor / NPK, with funding from the Norwegian Ministry of Culture. There has been some press about the project.
NPK have been using apertium-nno-nob in production since fall 2018 – it's integrated into their translation/editing systems – and we've been continually improving it with the help of their post-edits and feedback. The form/spelling/style choices used by nob→nno are now more modern and uniform (there was a major release of Nynorsk back in 2012, while most style decisions in the translator were made in the first release back in 2009).
Other major changes to the pair:
- 35 new transfer rules (one of which required a bugfix to apertium-transfer)
- 248 new lrx rules
- about 42,000 new names and 3,800 new non-names added to bidix
- regression testing by checking that WER does not increase
- lots of work on nob disambiguation
- we now do long-distance adjective congruence
- there's a post-nno.dix to get rid of triple consonants resulting from compounding
- compounding happens on proper nouns too now
- genitives are translated not just by preposition-rewriting, but we now also have:
  - lists of exceptions where we want to keep genitives
  - rewriting some nouns with relatives
  - rewriting nationalities with adjectives
  - rewriting some abstract nouns into compounds
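
As an illustration of the triple-consonant cleanup mentioned in the list above: Norwegian spelling reduces three identical consonants at a compound boundary to two (buss + sjåfør gives bussjåfør). In the pair this is handled by post-nno.dix, whose contents are not shown here; the Python sketch below only mimics the effect on a single word, and the function name `collapse_triple_consonants` is hypothetical.

```python
import re

# Hypothetical helper, not part of apertium-nno-nob: the real cleanup is done
# by the post-generation dictionary post-nno.dix. This only shows the effect.
# Matches three identical lowercase consonants in a row.
_TRIPLE_CONSONANT = re.compile(r"([bcdfghjklmnpqrstvwxz])\1\1")


def collapse_triple_consonants(word: str) -> str:
    """Reduce any run of three identical consonants to two."""
    return _TRIPLE_CONSONANT.sub(r"\1\1", word)


if __name__ == "__main__":
    # A naive concatenation of "buss" + "sjåfør" yields a triple "s".
    print(collapse_triple_consonants("buss" + "sjåfør"))  # bussjåfør
```

This sketch ignores hyphenated compounds and uppercase letters, so it should be read as a description of the goal rather than of the actual dictionary entries.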
Below are the median and mean WER on a test set of 1135 NTB news articles that were post-edited from the output of the January 2019 git checkout, evaluated against various git checkouts of apertium-nno-nob:
| git date | median WER | mean WER | stdev |
|------------+------------+----------+-------|
| 2018-10-01 | 11.79 | 12.96 | 7.49 |
| 2018-10-31 | 9.68 | 10.96 | 7.28 |
| 2018-12-20 | 7.26 | 8.52 | 7.05 |
| 2019-02-28 | 6.77 | 8.04 | 7.04 |
(apertium-eval-translator was run once for each of the 1135 articles, for each of the checkouts of the translator+deps)
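
For readers unfamiliar with the metric, here is a rough Python sketch of how per-article WER figures like those above can be computed and summarised, and how a regression check might compare two checkouts. It is not the code of apertium-eval-translator: the whitespace tokenisation, the normalisation by reference length, and the names `wer`, `summarise` and `no_regression` are all assumptions made for illustration.

```python
from statistics import mean, median, stdev


def wer(hypothesis: str, reference: str) -> float:
    """Word-level edit distance as a percentage of the reference length.

    Assumption: tokens are split on whitespace and compared exactly;
    apertium-eval-translator may tokenise and normalise differently.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                         # reference word missing
                cur[j - 1] + 1,                      # extra hypothesis word
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        prev = cur
    return 100.0 * prev[-1] / len(ref)


def summarise(scores: list[float]) -> tuple[float, float, float]:
    """Return (median, mean, sample standard deviation) of per-article WER."""
    return median(scores), mean(scores), stdev(scores)


def no_regression(old_scores: list[float], new_scores: list[float],
                  tolerance: float = 0.0) -> bool:
    """Hypothetical check: mean WER must not rise by more than `tolerance`."""
    return mean(new_scores) <= mean(old_scores) + tolerance


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(round(wer("a x c", "a b c"), 2))  # 33.33
```

In this setup each article is scored separately against its post-edited version, and the summary statistics are taken over the per-article scores, which matches the one-run-per-article description above.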