
Pre-trained models for natural language processing: a survey, Qiu et al., 2020

Paper, Tags: #nlp, #architectures

We provide a comprehensive review of PTMs (pre-trained models) in NLP and propose a taxonomy that categorizes existing PTMs from four perspectives:

  1. Representation type
    • Contextual vs. non-contextual embeddings
  2. Model architecture
    • LSTM, Transformer.
  3. Types of pre-training tasks
    • Language modeling (LM), masked language modeling (MLM), Seq2Seq MLM, etc. (see the MLM sketch after this list)
  4. Extensions for specific types of scenarios
    • Multilingual PTMs, multi-modal PTMs, domain-specific PTMs, etc.
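
To make perspective 3 concrete, here is a minimal, self-contained sketch of the MLM objective used by BERT-style PTMs: a fraction of the input tokens is replaced by a [MASK] symbol and the model is trained to recover the originals. The 15% masking rate follows BERT; the 80/10/10 replacement mix and all model details are omitted, and the helper below is purely illustrative, not the survey's code.

```python
import random

MASK_TOKEN = "[MASK]"
MASK_PROB = 0.15  # fraction of tokens to mask, following BERT

def mask_tokens(tokens, mask_prob=MASK_PROB, seed=None):
    """Return (masked_tokens, labels); labels are None where no loss applies."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)  # hide the token from the model
            labels.append(tok)         # ...and ask the model to predict it
        else:
            masked.append(tok)
            labels.append(None)        # unmasked positions contribute no loss
    return masked, labels

masked, labels = mask_tokens("the cat sat on the mat".split())
print(masked)  # masking is random; e.g. ['the', '[MASK]', 'sat', 'on', 'the', 'mat']
print(labels)  # e.g. [None, 'cat', None, None, None, None]
```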

Adapting PTMs to downstream tasks

  • Transfer learning
  • How to transfer?
  • Fine-tuning strategies (see the sketch after this list)
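
As a concrete example of the fine-tuning route, here is a minimal sketch assuming the Hugging Face transformers and PyTorch libraries (the survey itself is library-agnostic): a pre-trained encoder is loaded with a fresh classification head and all weights are updated on task data, i.e. plain full fine-tuning. The model name, toy texts, and hyperparameters are illustrative assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a pre-trained encoder plus a randomly initialized classification head.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Toy sentiment batch (hypothetical data, for illustration only).
texts = ["a delightful read", "a complete waste of time"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss computed internally
outputs.loss.backward()                  # backprop through head *and* encoder
optimizer.step()                         # one full fine-tuning step
```

Freezing the encoder and training only the head, or inserting small adapter modules, are common lighter-weight alternatives to updating every weight.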

Applications

  • General evaluation benchmark
  • Question answering
  • Sentiment analysis
  • Named entity recognition
  • Machine translation
  • Summarization
  • Adversarial attacks and defenses

Future directions

  • Upper bound of PTMs
  • Architecture of PTMs
  • Task-oriented pre-training and model compression
  • Knowledge transfer beyond fine-tuning
  • Interpretability and reliability of PTMs