Rare Words: A Major Problem for Contextualized Embeddings and How to Fix it by Attentive Mimicking, Schick et al., 2019
Paper, Tags: #embeddings
We demonstrate that deep LMs still struggle to understand rare words. We adapt Attentive Mimicking (AM) to deep LMs by using one-token approximation (OTA), a procedure that lets us apply AM even when the underlying LM uses subword-based tokenization.
OTA finds an embedding for a multi-token word or phrase w that is similar to the embedding w would have received if it had been a single token. This allows us to train AM in the usual way by simply mimicking the OTA-based embeddings of multi-token words.
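A minimal sketch of the OTA idea, not the paper's exact procedure: treat the missing one-token embedding as a free parameter and optimize it so that, in sampled contexts, a frozen LM's output with this single placeholder embedding approximates its output when the word is fed as its actual multi-token sequence. The toy encoder, the names (`encoder`, `embed`, `ota_embedding`), and the loss over context positions are all illustrative assumptions standing in for a real subword-based deep LM.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy frozen "deep LM": an embedding table plus a small transformer encoder.
# These stand in for a real subword-based LM; all names are illustrative.
vocab_size, dim = 1000, 64
embed = nn.Embedding(vocab_size, dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
embed.eval(), encoder.eval()
for p in list(embed.parameters()) + list(encoder.parameters()):
    p.requires_grad_(False)

# A word that the tokenizer splits into several subword ids (hypothetical ids).
word_subtokens = torch.tensor([[17, 256, 731]])  # shape (1, 3)

def sample_context(batch=8, left=4, right=4):
    # Random token ids acting as contexts with a slot for the word.
    return (torch.randint(0, vocab_size, (batch, left)),
            torch.randint(0, vocab_size, (batch, right)))

# OTA: a single trainable vector that should behave like a one-token embedding of the word.
ota_embedding = nn.Parameter(torch.randn(dim) * 0.02)
optimizer = torch.optim.Adam([ota_embedding], lr=1e-2)

for step in range(200):
    left_ids, right_ids = sample_context()
    batch, n_left, n_right = left_ids.size(0), left_ids.size(1), right_ids.size(1)

    # Target: run the frozen LM with the word's real multi-token embeddings.
    target_embeds = torch.cat(
        [embed(left_ids), embed(word_subtokens).expand(batch, -1, -1), embed(right_ids)], dim=1
    )
    with torch.no_grad():
        target_out = encoder(target_embeds)
    # Compare only the context positions so sequence lengths line up.
    target_ctx = torch.cat([target_out[:, :n_left], target_out[:, -n_right:]], dim=1)

    # Candidate: replace the multi-token span with the single OTA embedding.
    cand_embeds = torch.cat(
        [embed(left_ids), ota_embedding.expand(batch, 1, -1), embed(right_ids)], dim=1
    )
    cand_out = encoder(cand_embeds)
    cand_ctx = torch.cat([cand_out[:, :n_left], cand_out[:, -n_right:]], dim=1)

    loss = ((cand_ctx - target_ctx) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The optimized ota_embedding can then serve as the mimicking target: an AM model
# producing f(word) from surface form and contexts is trained to minimize
# ||f(word) - ota_embedding||^2, as described in the note above.
```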