It is crucial to verify the quality of the samples generated by `Transformation` and `AttackRecipe`. TextFlint provides the following validators to calculate confidence scores for the generated text:
Validator | Description | Reference |
---|---|---|
MaxWordsPerturbed | Word replacement ratio of the generated text compared with the original text, based on the longest common subsequence (LCS). | - |
LevenshteinDistance | Edit distance between the original text and the generated text. | - |
DeCLUTREncoder | Semantic similarity computed with the Universal Sentence Encoder. | Universal sentence encoder (https://arxiv.org/pdf/1803.11175.pdf) |
GPT2Perplexity | Language model perplexity computed with the GPT-2 model. | Language models are unsupervised multitask learners (http://www.persagen.com/files/misc/radford2019language.pdf) |
TranslateScore | BLEU / METEOR / chrF score. | Bleu: a method for automatic evaluation of machine translation (https://www.aclweb.org/anthology/P02-1040.pdf)<br>METEOR: An automatic metric for MT evaluation with improved correlation with human judgments (https://www.aclweb.org/anthology/W05-0909.pdf)<br>chrF: character n-gram F-score for automatic MT evaluation (https://www.aclweb.org/anthology/W15-3049.pdf) |
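To make the first two rows concrete, the sketch below re-implements the underlying calculations in plain Python. It only illustrates what `MaxWordsPerturbed` and `LevenshteinDistance` measure; it is not TextFlint's own code, and the function names are hypothetical.

```python
# Illustrative re-implementation of the two model-free metrics above
# (not TextFlint's implementation; function names are hypothetical).

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def words_perturbed_ratio(original, generated):
    """Share of original words not preserved in the generated text, via LCS."""
    orig_tokens, gen_tokens = original.split(), generated.split()
    kept = lcs_length(orig_tokens, gen_tokens)
    return 1.0 - kept / max(len(orig_tokens), 1)

def levenshtein_distance(a, b):
    """Character-level edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[len(b)]

original = "The quick brown fox jumps over the lazy dog"
generated = "The quick brown cat jumps over the lazy dog"
print(words_perturbed_ratio(original, generated))  # 0.111... (1 of 9 words replaced)
print(levenshtein_distance(original, generated))   # 3 (fox -> cat, three substitutions)
```

A low word-replacement ratio and a small edit distance indicate that the generated sample stays close to the original at the surface level.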
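For the model-based `GPT2Perplexity` metric, one common way to compute perplexity is with the Hugging Face Transformers GPT-2 model, as sketched below. This is a generic illustration under the assumption that a full text fits in one forward pass; TextFlint's validator may aggregate or normalize differently.

```python
# Generic GPT-2 perplexity sketch using Hugging Face Transformers
# (an illustration, not TextFlint's GPT2Perplexity validator).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def gpt2_perplexity(text):
    """Perplexity = exp(mean token-level cross-entropy) of the text under GPT-2."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

print(gpt2_perplexity("The quick brown fox jumps over the lazy dog"))
print(gpt2_perplexity("The quick brown cat jumps over the lazy dog"))
```

Comparing the perplexity of the original and generated sentences gives a rough check on fluency: if the generated sample's perplexity is much higher, the transformation has likely damaged the text's naturalness.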