It's morphin' time! Combating linguistic discrimination with inflectional perturbations, Tan et al., 2020

Paper, Tags: #nlp, #datasets

Training on only perfect Standard English corpora predisposes pre-trained models to discriminate against minorities from non-standard linguistic backgrounds (African American Vernacular English, Colloquial Singapore English, etc.). We perturb the inflectional morphology of words to craft plausible and semantically similar adversarial examples that expose these biases in popular NLP models, and show that adversarially fine-tuning them for a single epoch significantly improves robustness without sacrificing performance on clean data.

We propose MORPHEUS, a method for generating plausible and semantically similar adversaries by perturbing the inflections in clean examples; it exploits morphology to craft the adversaries.
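A minimal sketch of how I understand the greedy search, not the authors' code: `get_inflections` and `task_loss` are hypothetical helpers (candidates could come from something like lemminflect's `getAllInflections`; the loss wraps the target model).

```python
# Sketch of a MORPHEUS-style greedy inflectional search (my reading of
# the paper, not the authors' implementation). `get_inflections` and
# `task_loss` are hypothetical stand-ins.

def morpheus_attack(tokens, label, task_loss, get_inflections):
    """Greedily replace each token with the inflection of the same lemma
    that most increases the target model's loss."""
    best = list(tokens)
    best_loss = task_loss(best, label)
    for i in range(len(tokens)):
        for candidate in get_inflections(tokens[i]):  # same lemma, different inflection
            trial = best[:i] + [candidate] + best[i + 1:]
            loss = task_loss(trial, label)
            if loss > best_loss:                      # keep the most adversarial form
                best, best_loss = trial, loss
    return best
```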

It seems like the changes are limited to swapping singular/plural, past/present tense, and first/second/third person, in effect deliberately introducing inflectional mistakes into the sentence (e.g., "We made our goal clear" → "We make our goals clear"). And the authors argue this is how dialect and L2 English speakers speak?

Models built on contextual embeddings are more robust than those using fixed word embeddings, since pre-training gives them exposure to a wider variety of contexts in which different inflections occur.

Instead of retraining the model, fine-tuning it on a representative adversarial training set for a single epoch is sufficient to achieve significant robustness to inflectional adversaries while preserving performance on the clean dataset.
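A sketch of what that one-epoch fine-tuning loop could look like, assuming a PyTorch/Hugging Face-style model; `random_inflect` is a hypothetical helper that reinflects random tokens, and `num_copies` is an illustrative hyperparameter, not the paper's exact recipe.

```python
import random

def adversarial_finetune(model, optimizer, clean_data, random_inflect,
                         num_copies=4):
    """One epoch of fine-tuning on clean examples plus randomly
    inflection-perturbed copies (a sketch under my assumptions above)."""
    augmented = []
    for tokens, label in clean_data:
        augmented.append((tokens, label))             # keep the clean example
        for _ in range(num_copies):                   # add perturbed copies
            augmented.append((random_inflect(tokens), label))
    random.shuffle(augmented)

    model.train()
    for tokens, label in augmented:                   # a single epoch only
        loss = model(tokens, labels=label).loss       # HF-style forward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```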

MORPHEUS finds the distribution of examples that are adversarial for the target model, rather than the distribution of real L2 speaker errors, which produces some unrealistic adversarial examples. Our method of adversarial fine-tuning is analogous to treating the symptom rather than addressing the root cause, since it would have to be performed for each domain-specific dataset the model is trained on.