Rare Words: A Major Problem for Contextualized Embeddings and How to Fix it by Attentive Mimicking, Schick et al., 2019

Paper, Tags: #embeddings

We show that deep LMs still struggle to understand rare words. We adapt Attentive Mimicking (AM) to deep LMs via one-token approximation (OTA), a procedure that lets us use AM even when the underlying LM uses subword-based tokenization.

OTA finds an embedding for a multi-token word or phrase w that's similar to the embedding w would've received if it had been a single token. This allows us to train AM in the usual way, by simply mimicking the OTA-based embeddings of multi-token words.
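A minimal sketch of the OTA idea, not the authors' released code: optimize a single input embedding so that feeding it to the LM reproduces the representation the LM produces for the word's subword sequence. The model name, the mean-pooled target, and all hyperparameters here are illustrative assumptions; the paper's exact objective differs.

```python
import torch
from transformers import AutoModel, AutoTokenizer


def one_token_approximation(word, model, tokenizer, steps=200, lr=1e-2):
    """Learn one input embedding that mimics the LM's output for `word`,
    even though the tokenizer splits `word` into multiple subword pieces."""
    model.eval()

    # Target: the model's (mean-pooled) hidden states for the subword sequence.
    enc = tokenizer(word, return_tensors="pt")
    with torch.no_grad():
        target = model(**enc).last_hidden_state.mean(dim=1)  # (1, hidden)

    emb_layer = model.get_input_embeddings()
    cls_emb = emb_layer(torch.tensor([[tokenizer.cls_token_id]])).detach()
    sep_emb = emb_layer(torch.tensor([[tokenizer.sep_token_id]])).detach()

    # Initialize the candidate single-token embedding at the vocabulary mean.
    e = emb_layer.weight.mean(dim=0, keepdim=True).clone().requires_grad_(True)
    opt = torch.optim.Adam([e], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        # Feed "[CLS] e [SEP]" directly as embeddings, bypassing tokenization.
        inputs = torch.cat([cls_emb, e.unsqueeze(0), sep_emb], dim=1)
        out = model(inputs_embeds=inputs).last_hidden_state.mean(dim=1)
        loss = torch.nn.functional.mse_loss(out, target)
        loss.backward()
        opt.step()

    return e.detach().squeeze(0)  # the OTA embedding for `word`


tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mdl = AutoModel.from_pretrained("bert-base-uncased")
ota_emb = one_token_approximation("kumquat", mdl, tok)  # multi-piece rare word
```

The resulting OTA embeddings then serve as the mimicking targets when training AM on a subword-based LM, in place of the single-token embeddings AM would normally imitate.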