Paper, Tags: #nlp, #architectures
We provide a comprehensive review of pre-trained models (PTMs) in NLP and propose a taxonomy that categorizes existing PTMs from four perspectives:
- Representation type
    - Contextual vs. non-contextual (see the sketch below)
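A minimal PyTorch sketch of the distinction, assuming a toy vocabulary and a small LSTM as the contextual encoder (the "bank" example and all dimensions are illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab = {"river": 0, "money": 1, "bank": 2}

# Non-contextual (static) embedding: "bank" maps to the same vector
# no matter which sentence it appears in.
static_emb = nn.Embedding(len(vocab), 8)
ids_a = torch.tensor([[vocab["river"], vocab["bank"]]])  # "river bank"
ids_b = torch.tensor([[vocab["money"], vocab["bank"]]])  # "money bank"
print(torch.equal(static_emb(ids_a)[0, 1], static_emb(ids_b)[0, 1]))  # True

# Contextual embedding: an encoder (a toy LSTM here) makes the
# representation of "bank" depend on its neighbours.
encoder = nn.LSTM(input_size=8, hidden_size=8, batch_first=True)
ctx_a, _ = encoder(static_emb(ids_a))
ctx_b, _ = encoder(static_emb(ids_b))
print(torch.equal(ctx_a[0, 1], ctx_b[0, 1]))  # False: context changed it
```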
- Model architecture
    - LSTM, Transformer (both sketched below)
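A rough sketch of the two encoder families using PyTorch's built-in modules; shapes and hyperparameters are arbitrary, and `batch_first=True` on the Transformer layer assumes PyTorch >= 1.9:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 5, 32)  # (batch, seq_len, hidden): 5 token embeddings

# Recurrent encoder: tokens are processed sequentially, so long-range
# dependencies must pass through many intermediate steps.
lstm = nn.LSTM(input_size=32, hidden_size=32, batch_first=True)
h_lstm, _ = lstm(x)

# Transformer encoder: self-attention connects every pair of positions
# directly and processes the whole sequence in parallel.
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
transformer = nn.TransformerEncoder(layer, num_layers=2)
h_transformer = transformer(x)

print(h_lstm.shape, h_transformer.shape)  # both: torch.Size([1, 5, 32])
```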
- Types of pre-training tasks
    - Language modelling (LM), masked language modelling (MLM), Seq2Seq MLM, etc. (MLM corruption sketched below)
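A simplified sketch of MLM-style input corruption, assuming a BERT-like `[MASK]` id of 103 and PyTorch's `-100` cross-entropy ignore index; the real BERT recipe also replaces 10% of the chosen tokens with random ids and leaves 10% unchanged, which is omitted here:

```python
import torch

MASK_ID = 103  # assumed [MASK] token id (BERT-style); illustrative only

def mask_tokens(input_ids: torch.Tensor, mask_prob: float = 0.15):
    """BERT-style MLM corruption (simplified: no 80/10/10 split).

    Returns corrupted inputs plus labels that are -100 everywhere except
    at masked positions, where the original token must be predicted.
    """
    labels = input_ids.clone()
    masked = torch.bernoulli(torch.full(input_ids.shape, mask_prob)).bool()
    labels[~masked] = -100       # -100 = default ignore_index of cross-entropy
    corrupted = input_ids.clone()
    corrupted[masked] = MASK_ID  # replace chosen tokens with [MASK]
    return corrupted, labels

ids = torch.randint(1000, 2000, (2, 10))  # fake token ids
corrupted, labels = mask_tokens(ids)
```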
- Extensions for specific types of scenarios
    - Multilingual PTMs, multi-modal PTMs, domain-specific PTMs, etc.
- Transfer learning
    - How to transfer?
    - Fine-tuning strategies (e.g., freezing vs. unfreezing encoder layers; see the sketch below)
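A sketch of two common fine-tuning strategies (freezing the whole encoder vs. unfreezing only its top layer), using a randomly initialized stand-in for a pre-trained encoder; in practice the weights would be loaded from a checkpoint:

```python
import torch.nn as nn

# Stand-in for a pre-trained encoder; weights here are random.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=4,
)
head = nn.Linear(32, 2)  # new task-specific head, trained from scratch

# Feature extraction: freeze the whole encoder, train only the head.
for p in encoder.parameters():
    p.requires_grad = False

# Partial fine-tuning: additionally unfreeze just the top encoder layer.
for p in encoder.layers[-1].parameters():
    p.requires_grad = True

trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in encoder.parameters() if not p.requires_grad)
print(f"trainable encoder params: {trainable}, frozen: {frozen}")
```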
- General evaluation benchmarks
    - Question answering
    - Sentiment analysis
    - Named entity recognition
    - Machine translation
    - Summarization
- Adversarial attacks and defenses
- Future directions
    - Upper bound of PTMs
    - Architecture of PTMs
    - Task-oriented pre-training and model compression
    - Knowledge transfer beyond fine-tuning
    - Interpretability and reliability of PTMs