We release checkpoints for two new metrics that are built on top of the COMET architecture. They were fine-tuned on a several-times-larger collection of human judgements covering 15 domains and 113 languages.
COMET requires Python 3.8 or above. A simple installation from PyPI:
pip install --upgrade pip # ensures that pip is current
pip install unbabel-comet
or, to pin a specific version:
pip install unbabel-comet==1.1.2 --use-feature=2020-resolver
Download and unzip models: https://aka.ms/MS-COMET-checkpoints
For more details, check: https://github.com/Unbabel/COMET
Basic scoring command:
comet-score -s src.de -t hyp1.en -r ref.en --model PATH/TO/CHECKPOINT
where PATH/TO/CHECKPOINT depends on the selected model:
reference-based MS-COMET-22:
checkpoints/MS-COMET-22/model/MS-COMET-22.ckpt
quality estimation MS-COMET-QE-22:
checkpoints/MS-COMET-QE-22/model/MS-COMET-QE-22.ckpt
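The choice between the two checkpoints can be sketched in a few lines of Python. This is an illustrative helper only, not part of COMET: the function name build_comet_command and the CHECKPOINTS table are hypothetical, and the sketch assumes that the quality estimation model is run without the -r argument because it does not use a reference. The paths are the ones listed above, relative to the unzipped archive.

```python
from typing import List, Optional

# Checkpoint paths relative to the unzipped archive, as listed above.
CHECKPOINTS = {
    "MS-COMET-22": "checkpoints/MS-COMET-22/model/MS-COMET-22.ckpt",
    "MS-COMET-QE-22": "checkpoints/MS-COMET-QE-22/model/MS-COMET-QE-22.ckpt",
}


def build_comet_command(src: str, hyp: str, ref: Optional[str] = None) -> List[str]:
    """Hypothetical helper: build the comet-score argument list.

    With a reference file, the reference-based MS-COMET-22 checkpoint is used;
    without one, the quality estimation checkpoint is used and -r is omitted
    (an assumption of this sketch).
    """
    model = "MS-COMET-22" if ref is not None else "MS-COMET-QE-22"
    cmd = ["comet-score", "-s", src, "-t", hyp]
    if ref is not None:
        cmd += ["-r", ref]
    cmd += ["--model", CHECKPOINTS[model]]
    return cmd


if __name__ == "__main__":
    # Print both variants; pass the list to subprocess.run to execute one.
    print(" ".join(build_comet_command("src.de", "hyp1.en", "ref.en")))
    print(" ".join(build_comet_command("src.de", "hyp1.en")))
```

The resulting list can be handed to subprocess.run once comet-score is installed and the checkpoints are unzipped in the working directory.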
For more details and other usage options, see https://github.com/Unbabel/COMET
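One such other usage is scoring from Python instead of the command line. The following is a minimal sketch under stated assumptions, not a definitive recipe: build_samples and score are hypothetical helper names, the sample keys "src", "mt" and "ref" follow the input format described in the COMET repository, and the shape of the value returned by predict differs between COMET versions, so the sketch returns it unchanged.

```python
from typing import Dict, List, Optional


def build_samples(
    src: List[str], mt: List[str], ref: Optional[List[str]] = None
) -> List[Dict[str, str]]:
    """Hypothetical helper: pair up segments into COMET-style input dicts."""
    if len(src) != len(mt):
        raise ValueError("source and hypothesis must have the same number of segments")
    if ref is not None and len(ref) != len(src):
        raise ValueError("reference must have the same number of segments as the source")
    samples = []
    for i, (s, m) in enumerate(zip(src, mt)):
        sample = {"src": s, "mt": m}
        if ref is not None:
            # Only the reference-based model needs the "ref" field.
            sample["ref"] = ref[i]
        samples.append(sample)
    return samples


def score(checkpoint_path: str, samples: List[Dict[str, str]], gpus: int = 0):
    """Load a checkpoint and score the samples (requires unbabel-comet)."""
    # Imported lazily so build_samples can be used without COMET installed.
    from comet import load_from_checkpoint

    model = load_from_checkpoint(checkpoint_path)
    # The return type depends on the installed COMET version; returned as-is.
    return model.predict(samples, batch_size=8, gpus=gpus)
```

For example, score("checkpoints/MS-COMET-22/model/MS-COMET-22.ckpt", build_samples(src_lines, hyp_lines, ref_lines)) would score a reference-based setup, where src_lines, hyp_lines and ref_lines are lists of segments read from the corresponding files.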
@inproceedings{kocmi2022mscomet,
title = {MS-COMET: More and Better Human Judgements Improve Metric Performance},
author = {Tom Kocmi and Hitokazu Matsushita and Christian Federmann},
booktitle = {Proceedings of the Seventh Conference on Machine Translation},
month = {December},
year = {2022},
address = {Online},
publisher = {Association for Computational Linguistics}
}