Link: arXiv
Main problem
Previous encoder-only pretraining approaches produce low translation quality, induce over-estimation issues, and yield poor model robustness.
Proposed method
This paper proposes a simple strategy to overcome these limitations via two key components: in-domain pretraining and input adaptation.
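To make the first component concrete, here is a minimal sketch of what continued in-domain pretraining could look like with an mBART-style Seq2Seq model. The checkpoint, the span-masking noise function, and the training loop are illustrative assumptions on my part, not the paper's exact setup.

```python
# Hypothetical sketch: continued denoising pretraining on in-domain text.
# The checkpoint, noise function, and hyperparameters are assumptions.
import random

import torch
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="en_XX", tgt_lang="en_XX"
)
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()

def add_span_noise(tokens, mask_ratio=0.35):
    """Replace one random contiguous span with a single <mask> token."""
    span_len = max(1, int(len(tokens) * mask_ratio))
    start = random.randrange(0, max(1, len(tokens) - span_len))
    return tokens[:start] + [tokenizer.mask_token] + tokens[start + span_len:]

def pretrain_step(sentence):
    """One denoising step: noisy in-domain input -> clean original sentence."""
    noisy = " ".join(add_span_noise(sentence.split()))
    batch = tokenizer(noisy, text_target=sentence, return_tensors="pt")
    loss = model(**batch).loss  # updates the encoder AND the decoder jointly
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# In-domain monolingual data (hypothetical news-domain sentence).
print(pretrain_step("The central bank raised interest rates on Tuesday."))
```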
My Summary
The author jointly pretrained the decoder along with the encoder, which produced more diverse translations in the experiments and reduced adequacy-related translation errors compared to the encoder-only pretraining approach. After applying the proposed method, the author observed up to a 19% performance improvement in some cases (WMT19 EN->DE). The end result is improved translation performance and model robustness. Future work includes validating the findings with more Seq2Seq pretraining models and language pairs.
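For intuition, the contrast between the two pretraining regimes can be sketched with Hugging Face transformers; the checkpoints below are illustrative stand-ins, not the ones used in the paper.

```python
# Hedged sketch of the two initialization regimes the summary compares.
from transformers import EncoderDecoderModel, MBartForConditionalGeneration

# Encoder-only pretraining: warm-start a Seq2Seq model from an encoder-only
# checkpoint. The cross-attention weights are newly initialized, and the
# decoder was never pretrained to generate text.
enc_only = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-cased",  # encoder weights
    "bert-base-multilingual-cased",  # decoder warm-started from the same
)                                    # encoder-only checkpoint

# Joint Seq2Seq pretraining: encoder and decoder were pretrained together on
# a denoising objective (mBART-style), which the summary credits with more
# diverse translations and fewer adequacy errors after fine-tuning.
joint = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")
```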
Datasets
(1) WMT19 English-German
(2) WMT16 English-Romanian (low resource)
(3) IWSLT17 English-French
(4) A subset of WMT19 English-German (for ablation studies)