Paper List for In-context Learning

Contents

Introduction

This is a paper list (work in progress) on in-context learning (ICL).

Keywords Convention

abbreviation

section in our survey

main feature

conference

Papers

Survey

  1. A Survey for In-context Learning.

    Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, Zhifang Sui. [pdf], 2022.12,

Model Warmup for ICL

This section contains pilot works that might contribute to warmup strategies for ICL.

  1. MetaICL: Learning to Learn In Context. NAACL 2022.

    Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi. [pdf], [project], 2021.10,

    • A pretrained language model is tuned to do in-context learning on a large set of training tasks

  2. Improving In-Context Few-Shot Learning via Self-Supervised Training.

    Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva. [pdf], [project], 2022.5,

  3. Calibrate Before Use: Improving Few-shot Performance of Language Models.

    Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh. [pdf], [project], 2021.2,

    • Using N/A string to calibrate LMs away from common token bias
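The calibration idea in the last note can be illustrated with a minimal sketch. It assumes the label probabilities for a content-free input such as "N/A" and for a test input have already been obtained from the model; the function name `calibrate` and all numbers are hypothetical, and the paper's full procedure may differ in details.

```python
def calibrate(label_probs, content_free_probs):
    """Rescale label probabilities by those the model assigns to a content-free input."""
    # Divide each label probability by its content-free counterpart,
    # so labels the model favors regardless of input are down-weighted.
    scaled = [p / p_cf for p, p_cf in zip(label_probs, content_free_probs)]
    # Renormalize so the calibrated scores sum to 1.
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical numbers: the model prefers label 0 even for the input "N/A".
p_content_free = [0.7, 0.3]
p_test = [0.6, 0.4]

calibrated = calibrate(p_test, p_content_free)
predicted_label = max(range(len(calibrated)), key=calibrated.__getitem__)
```

With these illustrative numbers the uncalibrated prediction is label 0, while the calibrated scores favor label 1, because most of the mass on label 0 is explained by the bias visible on the content-free input.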

Prompt Tuning for ICL

This section contains pilot works that might contribute to prompt selection and prompt formulation strategies for ICL.

  1. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model.

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woomyoung Park, Jung-Woo Ha, Nako Sung. [pdf], 2022.04,

    • Investigates how in-context learning performance changes with the source and size of the pretraining corpus
  2. Chain of Thought Prompting Elicits Reasoning in Large Language Models.

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou. [pdf], 2022.01,

  3. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.

    Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed Chi. [pdf], 2022.05,

  4. Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator.

    Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee. [pdf], 2022.06,

  5. Iteratively Prompt Pre-trained Language Models for Chain of Thought.

    Boshi Wang, Xiang Deng, Huan Sun. [pdf], [project], 2022.03,

  6. Automatic Chain of Thought Prompting in Large Language Models.

    Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola. [pdf], [project], 2022.10,

  7. Learning To Retrieve Prompts for In-Context Learning. NAACL 2022.

    Ohad Rubin, Jonathan Herzig, Jonathan Berant. [pdf], [project], 2022.12,

    • Learn an example retriever via contrastive learning

  8. Finetuned Language Models Are Zero-Shot Learners. Main feature: instruction tuning.

    Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le. [pdf], [project], 2021.09,

    • finetuning language models on a collection of tasks described via instructions
    • substantially improves zero-shot performance on unseen tasks
  9. Active Example Selection for In-Context Learning.

    Yiming Zhang, Shi Feng, Chenhao Tan. [pdf], [project], 2022.11,

  10. Prompting GPT-3 To Be Reliable

    Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. [pdf], [project], 2022.10,

  11. An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

    Taylor Sorensen, Joshua Robinson, Christopher Rytting, Alexander Shaw, Kyle Rogers, Alexia Delorey, Mahmoud Khalil, Nancy Fulda, David Wingate. [pdf], 2022.5,

  12. Self-adaptive In-context Learning

    Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong. [pdf], [project], 2022.12,

  13. Demystifying Prompts in Language Models via Perplexity Estimation

    Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer. [pdf], [project], 2022.12,

  14. Structured Prompting: Scaling In-Context Learning to 1,000 Examples.

    Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei. [pdf], [project], 2022.12.

  15. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity.

    Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp. [pdf], 2021.04,

  16. On the Relation between Sensitivity and Accuracy in In-context Learning.

    Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He. [pdf], 2022.09,

  17. Can language models learn from explanations in context?

    Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill. [pdf], 2022.04,

  18. Prototypical Calibration for Few-shot Learning of Language Models.

    Zhixiong Han, Yaru Hao, Li Dong, Furu Wei. [pdf], [project], 2022.05,

  19. Cross-Task Generalization via Natural Language Crowdsourcing Instructions.

    Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi. [pdf], [project], 2022.03,

Analysis of ICL

This section contains pilot works that might contribute to the analysis of the influence factors and working mechanism of ICL.

Influence Factors for ICL

  1. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

    Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. [pdf], [project], 2022.03,

  2. What Makes Good In-Context Examples for GPT-3?

    Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen. [pdf], 2022.08,

  3. Emergent Abilities of Large Language Models

    Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus. [pdf], 2022.07,

  4. Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations

    Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim. [pdf], 2022.05,

  5. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woo-Myoung Park, Jung-Woo Ha, Nako Sung. [pdf], 2022.08,

  6. Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale

    Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth. [pdf], [project], 2022.12,

  7. Data Distributional Properties Drive Emergent In-Context Learning in Transformers

    Stephanie C.Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,

  8. Diverse Demonstrations Improve In-context Compositional Generalization

    Itay Levy, Ben Bogin, Jonathan Berant. [pdf], [project], 2022.12,

  9. Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

    Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun. [pdf], [project], 2022.12,

Working Mechanism of ICL

  1. An Explanation of In-context Learning as Implicit Bayesian Inference

    Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma. [pdf], [project], 2022.08,

  2. In-context Learning and Induction Heads

    Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah. [pdf], 2022.10,

  3. What Can Transformers Learn In-Context? A Case Study of Simple Function Classes

    Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant. [pdf], 2022.08,

  4. Data Distributional Properties Drive Emergent In-Context Learning in Transformers

    Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,

  5. What learning algorithm is in-context learning? Investigations with linear models

    Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou. [pdf], 2022.11,

  6. Transformers learn in-context by gradient descent

    Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov. [pdf], 2022.12,

  7. Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

    Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei. [pdf], [project], 2022.12

  8. Transformers as Algorithms: Generalization and Implicit Model Selection in In-context Learning

    Yingcong Li, M. Emrullah Ildiz, Dimitris S. Papailiopoulos, Samet Oymak. [pdf], 2023.1

Evaluation and Resources

This section contains pilot works that might contribute to the evaluation of ICL or provide resources for it.

  1. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.

    Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt et al. [pdf], [project], 2022.06,

  2. SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks.

    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit et al. [pdf], [project], 2022.04,

  3. Language Models are Multilingual Chain-of-Thought Reasoners.

    Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei. [pdf], 2022.10,

    • Evaluates the reasoning abilities of large language models in multilingual settings and introduces the Multilingual Grade School Math (MGSM) benchmark, built by manually translating 250 grade-school math problems from the GSM8K dataset into ten typologically diverse languages
  4. Instruction Induction: From Few Examples to Natural Language Task Descriptions.

    Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy. [pdf], [project], 2022.05,

    • How to learn task instructions from input-output demonstrations
  5. Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought. 2022.10.3

  6. What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations. [pdf] (arXiv:2212.01692)

  7. Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor.

    Or Honovich, Thomas Scialom, Omer Levy, Timo Schick. [pdf], [project], 2022.12,

  8. Self-Instruct: Aligning Language Model with Self Generated Instructions.

    Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi. [pdf], [project], 2022.12,

  9. The Flan Collection: Designing Data and Methods for Effective Instruction Tuning.

    Shayne Longpre, Le Hou, Tu Vu, Albert Webson, Hyung Won Chung, Yi Tay, Denny Zhou, Quoc V. Le, Barret Zoph, Jason Wei, Adam Roberts. [pdf], [project], 2023.1,

Application

This section contains pilot works that expand the applications of ICL.

  1. Meta-learning via Language Model In-context Tuning.

    Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He. [pdf], [project], 2021.10,

  2. Does GPT-3 Generate Empathetic Dialogues? A Novel In-Context Example Selection Method and Automatic Evaluation Metric for Empathetic Dialogue Generation.

    Young-Jun Lee, Chae-Gyun Lim, Ho-Jin Choi. [pdf], 2022.10,

  3. In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models.

    Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown. [pdf], 2022.12,

  4. Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

Problems

This section contains pilot works that point out problems with ICL.

  1. The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design.

    Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, Amnon Shashua. [pdf], 2021.10,

Challenges and Future Directions

This section contains pilot works that might contribute to the challenges and future directions of ICL.

Blogs

SEO is Dead, Long Live LLMO

How does in-context learning work? A framework for understanding the differences from traditional supervised learning

Extrapolating to Unnatural Language Processing with GPT-3's In-context Learning: The Good, the Bad, and the Mysterious

More Efficient In-Context Learning with GLaM

Contribution

Please feel free to contribute and promote your own work or other related works here! If you recommend related works on ICL or contribute to this repo, please provide your information (name, homepage) and we will add you to the contributor list 😊.

Contributor list

We thank Damai Dai, Qingxiu Dong, Lei Li, Shihao Liang, and Li Dong for their contributions to this repo and their paper recommendations.

Reference

Some of the papers listed here are discussed in the following survey:

@misc{dong2022survey,
      title={A Survey for In-context Learning}, 
      author={Qingxiu Dong and Lei Li and Damai Dai and Ce Zheng and Zhiyong Wu and Baobao Chang and Xu Sun and Jingjing Xu and Lei Li and Zhifang Sui},
      year={2022},
      eprint={2301.00234},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}