Paper link: https://arxiv.org/abs/1911.01986
Please cite as:
```bibtex
@incollection{nguyen2021cbd,
  title = {Cross-model Back-translated Distillation for Unsupervised Machine Translation},
  author = {Xuan-Phi Nguyen and Shafiq Joty and Thanh-Tung Nguyen and Wu Kui and Ai Ti Aw},
  booktitle = {38th International Conference on Machine Learning},
  year = {2021},
}
```
These guidelines describe the steps to run CBD on the WMT En-De task.
Model | Train Dataset | Finetuned model |
---|---|---|
WMT En-Fr | WMT English-French | model: download |
WMT En-De | WMT English-German | model: download |
```shell
# install dependencies and the pinned fairseq version
./install.sh
pip install fairseq==0.8.0 --progress-bar off
```
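The code is written against fairseq 0.8.0, so it is worth confirming that the pinned version is what actually got installed. A minimal sketch, assuming you feed it the version string reported by `pip show fairseq`; the `check_pin` helper is hypothetical, not part of the repo:

```shell
#!/bin/sh
# Sanity-check the installed fairseq version against the pin in the
# install step. Get the installed version with, e.g.:
#   pip show fairseq | awk '/^Version/ {print $2}'
pinned="0.8.0"
check_pin() {
  if [ "$1" = "$pinned" ]; then
    echo "fairseq $1 matches the pin"
  else
    echo "warning: fairseq $1 differs from pinned $pinned"
  fi
}

check_pin "0.8.0"
```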
Follow the instructions from the MASS paper to create the WMT En-De dataset.
Download the XLM finetuned model (theta_1) here, and save its path to a bash variable: `export xlm_path=...`
Download the MASS finetuned model (theta_2) here, and save its path: `export mass_path=...`
Download the XLM pretrained model (theta) here, and save its path: `export pretrain_path=...`
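Before launching training, it can save a failed run to check that all three exported checkpoint paths actually point at files. A minimal sketch, assuming the variable names above; the `checkpoints/...` paths are placeholders, not the real filenames:

```shell
#!/bin/sh
# Point these at wherever you saved the three checkpoints; the default
# paths below are placeholders for illustration only.
export xlm_path="${xlm_path:-checkpoints/xlm_finetuned.pth}"
export mass_path="${mass_path:-checkpoints/mass_finetuned.pth}"
export pretrain_path="${pretrain_path:-checkpoints/xlm_pretrained.pth}"

# Report any checkpoint that is not where its variable says it is.
for p in "$xlm_path" "$mass_path" "$pretrain_path"; do
  if [ -f "$p" ]; then
    echo "found: $p"
  else
    echo "missing: $p"
  fi
done
```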
```shell
# you may change the inputs in the file according to your context
bash run_ende.sh
```