Adapter-Papers

Paper list and reading notes of Adapters in Natural Language Processing (NLP) and Machine Translation (MT) tasks.

Contributing

We Need You!

Please help contribute to this list by contacting [email protected] or opening a pull request.

Table of Contents

Natural Language Processing

  • Parameter-Efficient Transfer Learning for NLP (Houlsby et al., 2019)
    • First work to apply adapters to NLP tasks (a minimal bottleneck-adapter sketch follows this list).
    • Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones.
    • The parameters of the original network remain fixed, yielding a high degree of parameter sharing.
  • MAD-X: An Adapter-based Framework for Multi-task Cross-lingual Transfer (Pfeiffer et al., 2020)
    • Pretrained multilingual models perform poorly on low-resource languages and on languages unseen during pretraining, so the authors propose MAD-X (a sketch of stacking language and task adapters follows this list).
    • MAD-X is an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages by learning modular language and task representations.
    • They also introduce a novel invertible adapter architecture and a strong baseline method for adapting a pretrained multilingual model to a new language.
  • AdapterFusion: Non-Destructive Task Composition for Transfer Learning (Pfeiffer et al., 2021)
    • The motivation is to address two problems of sequential fine-tuning and multi-task learning, catastrophic forgetting and difficulties in dataset balancing, so the authors propose AdapterFusion.
    • A two-stage learning algorithm that leverages knowledge from multiple tasks (see the sketch after this list):
      • Knowledge extraction stage: learn task-specific parameters, called adapters, that encapsulate the task-specific information.
      • Knowledge composition stage: combine the trained adapters in a separate composition step.
  • Efficient Test Time Adapter Ensembling for Low-resource Language Varieties (Wang et al., 2021)
    • Motivation: MAD-X requires training a separate language adapter for every language one wishes to support, which can be impractical for languages with limited data. One workaround is to use a related language's adapter for a new language variety, but the authors found that this can lead to sub-optimal performance, so they aim to improve the robustness of language adapters to uncovered languages without training new adapters.
    • They propose Entropy Minimized Ensemble of Adapters (EMEA), a method that optimizes the ensemble weights of the pretrained language adapters for each test sentence by minimizing the entropy of its predictions (see the sketch after this list).
  • On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation (He et al., 2021)
    • Motivation: existing adapter work focuses only on the parameter-efficiency aspect of adapter-based tuning, with little further investigation of its effectiveness.
    • They showed that:
      • Adapter-based tuning better mitigates forgetting than fine-tuning, since it yields representations that deviate less from those of the initial pretrained language model.
      • Adapter-based tuning outperforms fine-tuning on low-resource and cross-lingual tasks.
      • Adapter-based tuning is more robust to overfitting and less sensitive to changes in learning rates.
  • K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters (Wang et al., 2021)
    • Motivation: they study the problem of injecting knowledge into large pretrained models. Traditional methods face a problem: when multiple kinds of knowledge are injected, previously injected knowledge is flushed away.
    • They propose K-Adapter, a framework that keeps the original parameters of the pretrained model fixed and supports the development of versatile knowledge-infused models (see the sketch after this list).
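
The sketch below illustrates the bottleneck adapter described under Houlsby et al. (2019): a down-projection, non-linearity, and up-projection with a residual connection, trained while the pretrained network stays frozen. The hidden and bottleneck sizes, activation, and class name are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn


class BottleneckAdapter(nn.Module):
    """Down-project -> non-linearity -> up-project, with a residual connection."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual keeps the adapter close to an identity function at the start
        # of training, so the frozen pretrained network is perturbed only slightly.
        return x + self.up(self.act(self.down(x)))


# Usage: freeze the pretrained model and train only the adapter parameters.
hidden = torch.randn(2, 16, 768)      # (batch, sequence length, hidden size)
adapter = BottleneckAdapter()
print(adapter(hidden).shape)          # torch.Size([2, 16, 768])
```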
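
A rough sketch of the MAD-X-style stacking described under Pfeiffer et al. (2020): a language adapter and a task adapter applied in sequence, with the language adapter swappable at inference time for cross-lingual transfer. The `Adapter` and `MadXStyleLayer` classes, sizes, and language codes are illustrative assumptions; the paper's invertible adapters are omitted.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Simple bottleneck adapter with a residual connection (illustrative sizes)."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 48):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


class MadXStyleLayer(nn.Module):
    """Wraps one (frozen) transformer layer output with stacked language/task adapters."""

    def __init__(self, hidden_size: int = 768, languages=("en", "sw")):
        super().__init__()
        self.language_adapters = nn.ModuleDict({lang: Adapter(hidden_size) for lang in languages})
        self.task_adapter = Adapter(hidden_size)

    def forward(self, hidden: torch.Tensor, language: str) -> torch.Tensor:
        hidden = self.language_adapters[language](hidden)    # language-specific transformation
        return self.task_adapter(hidden)                     # task-specific transformation


# Train the task adapter together with the source-language adapter (e.g. "en"),
# then swap in the target-language adapter (e.g. "sw") at test time.
layer = MadXStyleLayer()
h = torch.randn(2, 16, 768)
print(layer(h, language="sw").shape)   # torch.Size([2, 16, 768])
```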
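
A minimal sketch of the AdapterFusion composition step described under Pfeiffer et al. (2021): an attention mechanism whose query is the layer output and whose keys and values come from the outputs of several frozen, task-specific adapters. The plain dot-product attention and dimensions here are simplifying assumptions.

```python
import torch
import torch.nn as nn


class AdapterFusionBlock(nn.Module):
    """Attention over the outputs of several frozen, pretrained task adapters."""

    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)

    def forward(self, layer_output: torch.Tensor, adapter_outputs: torch.Tensor) -> torch.Tensor:
        # layer_output: (batch, seq, hidden); adapter_outputs: (batch, seq, n_adapters, hidden)
        q = self.query(layer_output).unsqueeze(2)               # (batch, seq, 1, hidden)
        k = self.key(adapter_outputs)                           # (batch, seq, n_adapters, hidden)
        v = self.value(adapter_outputs)
        scores = (q * k).sum(-1) / k.size(-1) ** 0.5            # (batch, seq, n_adapters)
        weights = scores.softmax(dim=-1).unsqueeze(-1)          # (batch, seq, n_adapters, 1)
        return (weights * v).sum(dim=2)                         # (batch, seq, hidden)


# Stage 1 trains each adapter on its own task; stage 2 freezes them and
# trains only the fusion parameters on the target task.
fusion = AdapterFusionBlock()
layer_out = torch.randn(2, 16, 768)
adapters_out = torch.randn(2, 16, 3, 768)      # outputs of 3 pretrained task adapters
print(fusion(layer_out, adapters_out).shape)   # torch.Size([2, 16, 768])
```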
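
A rough sketch of the test-time procedure described under Wang et al. (2021, EMEA): for each test input, the ensemble weights over the pretrained language adapters are tuned by a few gradient steps that minimize the entropy of the prediction. `ensemble_logits` and `PER_ADAPTER_PROJECTIONS` are hypothetical stand-ins for running the multilingual model with each language adapter; the step count and learning rate are arbitrary choices.

```python
import torch

# Hypothetical stand-in for running the multilingual model with each language
# adapter: here, one random projection per "adapter" producing class logits.
torch.manual_seed(0)
PER_ADAPTER_PROJECTIONS = [torch.randn(32, 5) for _ in range(3)]


def ensemble_logits(x: torch.Tensor, adapter_weights: torch.Tensor) -> torch.Tensor:
    """Combine per-adapter predictions with learnable ensemble weights."""
    per_adapter = torch.stack([x @ proj for proj in PER_ADAPTER_PROJECTIONS])   # (n_adapters, n_classes)
    return (adapter_weights.softmax(dim=0).unsqueeze(-1) * per_adapter).sum(dim=0)


def emea_predict(x: torch.Tensor, n_adapters: int, steps: int = 3, lr: float = 0.1) -> torch.Tensor:
    """Tune the ensemble weights for one test input by minimizing prediction entropy."""
    weights = torch.zeros(n_adapters, requires_grad=True)   # start from a uniform ensemble
    optimizer = torch.optim.SGD([weights], lr=lr)
    for _ in range(steps):
        probs = ensemble_logits(x, weights).softmax(dim=-1)
        entropy = -(probs * probs.clamp_min(1e-9).log()).sum()
        optimizer.zero_grad()
        entropy.backward()
        optimizer.step()
    return ensemble_logits(x, weights.detach())


print(emea_predict(torch.randn(32), n_adapters=3).softmax(dim=-1))
```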
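
A heavily simplified sketch of the K-Adapter idea described under Wang et al. (2021): the pretrained model stays frozen, each kind of knowledge gets its own independently trained adapter, and their outputs are combined with the pretrained representation for downstream use. The `KnowledgeAdapter` module, the knowledge kinds, and the concatenation are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class KnowledgeAdapter(nn.Module):
    """One adapter per kind of knowledge, trained independently of the others."""

    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Linear(hidden_size, hidden_size),
        )

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.proj(hidden)


class KAdapterStyleModel(nn.Module):
    def __init__(self, hidden_size: int = 768, knowledge_kinds=("factual", "linguistic")):
        super().__init__()
        self.adapters = nn.ModuleDict({kind: KnowledgeAdapter(hidden_size) for kind in knowledge_kinds})

    def forward(self, plm_hidden: torch.Tensor) -> torch.Tensor:
        # Concatenate the frozen pretrained representation with each adapter's output;
        # adding a new kind of knowledge never overwrites previously trained adapters.
        outputs = [plm_hidden] + [adapter(plm_hidden) for adapter in self.adapters.values()]
        return torch.cat(outputs, dim=-1)


model = KAdapterStyleModel()
h = torch.randn(2, 16, 768)   # hidden states from a frozen pretrained model
print(model(h).shape)         # torch.Size([2, 16, 2304])
```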

Machine Translation
