A paper list about large language models and multimodal models.
Note: this list only records papers for my personal needs. Feel free to open an issue if you think I missed some important or exciting work!
- HELM: Holistic Evaluation of Language Models. TMLR'2023. paper
- HEIM: Holistic Evaluation of Text-to-Image Models. NeurIPS'2023. paper
- Eval Survey: A Survey on Evaluation of Large Language Models. Arxiv'2023. paper
- Healthcare LM Survey: A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics. Arxiv'2023. paper, github
- Multimodal LLM Survey: A Survey on Multimodal Large Language Models. Arxiv'2023. paper, github
- VLM for Vision Tasks Survey: Vision Language Models for Vision Tasks: A Survey. Arxiv'2023. paper, github
- Efficient LLM Survey: Efficient Large Language Models: A Survey. Arxiv'2023. paper, github
- Prompt Engineering Survey: Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. Arxiv'2021. paper
- Multimodal Safety Survey: Safety of Multimodal Large Language Models on Images and Text. Arxiv'2024. paper
- Multimodal LLM Recent Survey: MM-LLMs: Recent Advances in MultiModal Large Language Models. Arxiv'2024. paper
- Prompt Engineering in LLM Survey: A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications. Arxiv'2024. paper
- LLM Security and Privacy Survey: A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly. Arxiv'2024. paper
- LLM Privacy Survey: Privacy in Large Language Models: Attacks, Defenses and Future Directions. Arxiv'2023. paper
- Transformer: Attention Is All You Need. NIPS'2017. paper
- GPT-1: Improving Language Understanding by Generative Pre-Training. 2018. paper
- BERT: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL'2019. paper
- GPT-2: Language Models are Unsupervised Multitask Learners. 2019. paper
- RoBERTa: RoBERTa: A Robustly Optimized BERT Pretraining Approach. Arxiv'2019. paper
- DistilBERT: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. Arxiv'2019. paper
- T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. JMLR'2020. paper
- GPT-3: Language Models are Few-Shot Learners. NeurIPS'2020. paper
- GLaM: GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. ICML'2022. paper
- PaLM: PaLM: Scaling Language Modeling with Pathways. Arxiv'2022. paper
- BLOOM: BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. Arxiv'2022. paper
- BLOOMZ: Crosslingual Generalization through Multitask Finetuning. Arxiv'2023. paper
- LLaMA: LLaMA: Open and Efficient Foundation Language Models. Arxiv'2023. paper
- GPT-4: GPT-4 Technical Report. Arxiv'2023. paper
- PaLM 2: PaLM 2 Technical Report. 2023. paper
- LLaMA 2: Llama 2: Open Foundation and Fine-Tuned Chat Models. Arxiv'2023. paper
- Mistral: Mistral 7B. Arxiv'2023. paper
- Phi1: Project Link
- Phi1.5: Project Link
- Phi2: Project Link
- Falcon: Project Link
- PPO: Proximal Policy Optimization Algorithms. Arxiv'2017. paper
- DPO: Direct Preference Optimization: Your Language Model is Secretly a Reward Model. NeurIPS'2023. paper
- LoRA: LoRA: Low-Rank Adaptation of Large Language Models. Arxiv'2021. paper (a minimal sketch of the low-rank update follows this group)
- Q-LoRA: QLoRA: Efficient Finetuning of Quantized LLMs. NeurIPS'2023. paper
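The LoRA and QLoRA entries above build on the same low-rank update idea: the pretrained weight is frozen and only a pair of low-rank factors is trained, so the effective weight becomes W + (alpha/r)·BA. Below is a minimal PyTorch sketch of that idea; the names `LoRALinear`, `rank`, and `alpha` are illustrative and not taken from any particular library.

```python
# Minimal LoRA-style linear layer: y = x @ W^T + (alpha/r) * x @ A^T @ B^T.
# The frozen base weight W stays untouched; only the low-rank factors A and B train.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)                    # freeze the pretrained weight
        self.lora_A = nn.Linear(in_features, rank, bias=False)    # down-projection A
        self.lora_B = nn.Linear(rank, out_features, bias=False)   # up-projection B
        nn.init.zeros_(self.lora_B.weight)                        # start as a zero (identity) update
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

# Usage: only the LoRA factors receive gradients.
layer = LoRALinear(768, 768, rank=8)
x = torch.randn(2, 768)
print(layer(x).shape)  # torch.Size([2, 768])
```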
- Med-PaLM: Large Language Models Encode Clinical Knowledge. Arxiv'2022. paper
- MedAlpaca: MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data. Arxiv'2023. paper
- Med-PaLM 2: Towards Expert-Level Medical Question Answering with Large Language Models. Arxiv'2023. paper
- HuatuoGPT: HuatuoGPT, towards Taming Language Model to Be a Doctor. EMNLP'2023 (findings). paper
- GPT-4-Med: Capabilities of GPT-4 on Medical Challenge Problems. Arxiv'2023. paper
- PET: Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference. EACL'2021. paper
- LM-BFF: Making Pre-trained Language Models Better Few-shot Learners. ACL'2021. paper
- Prompt-Tuning: The Power of Scale for Parameter-Efficient Prompt Tuning. EMNLP'2021. paper (a minimal soft-prompt sketch follows this group)
- Prefix-Tuning: Prefix-Tuning: Optimizing Continuous Prompts for Generation. ACL'2021. paper
- P-tuning: P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks. ACL'2022. paper
- P-tuning v2: P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks. Arxiv'2022. paper
- Auto-Prompt: AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. EMNLP'2020. paper
- FluentPrompt: Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too?. EMNLP'2023 (findings). paper
- PEZ: Hard prompts made easy: Gradient-based discrete optimization for prompt tuning and discovery. Arxiv'2023. paper
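Several of the prompt-tuning papers above (notably Prompt-Tuning and Prefix-Tuning) learn continuous "virtual token" embeddings while keeping the backbone frozen. Below is a minimal sketch of that soft-prompt idea, assuming access to the frozen model's input embeddings; the names `SoftPrompt` and `num_virtual_tokens` are illustrative.

```python
# Minimal soft prompt-tuning sketch: k trainable virtual-token embeddings are
# prepended to the frozen LM's input embeddings; only these embeddings are optimized.
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    def __init__(self, num_virtual_tokens: int, embed_dim: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(num_virtual_tokens, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) -> (batch, k + seq_len, embed_dim)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Usage: embeddings below stand in for the output of a frozen LM's embedding layer.
soft_prompt = SoftPrompt(num_virtual_tokens=20, embed_dim=768)
embeds = torch.randn(4, 16, 768)
print(soft_prompt(embeds).shape)  # torch.Size([4, 36, 768])
```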
- CLIP: Learning Transferable Visual Models From Natural Language Supervision. ICML'2021. paper (a minimal contrastive-loss sketch follows this group)
- DeCLIP: Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm. ICLR'2022. paper
- FILIP: FILIP: Fine-grained Interactive Language-Image Pre-Training. ICLR'2022. paper
- Stable Diffusion: High-Resolution Image Synthesis with Latent Diffusion Models. CVPR'2022. paper
- BLIP: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. ICML'2022. paper
- BLIP2: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models. ICML'2023. paper
- LLaMA-Adapter: LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention. Arxiv'2023. paper
- LLaVA: Visual Instruction Tuning. NeurIPS'2023. paper
- InstructBLIP: InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning. NeurIPS'2023. paper
- SLD: Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models. CVPR'2023. paper
- ESD: Erasing Concepts from Diffusion Models. ICCV'2023. paper
- POPE: Evaluating Object Hallucination in Large Vision-Language Models. EMNLP'2023. paper
- HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models. CVPR'2024. paper
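CLIP-style pre-training (CLIP, DeCLIP, FILIP above) rests on a symmetric contrastive loss over paired image and text embeddings, where matching pairs sit on the diagonal of the batch similarity matrix. Below is a minimal sketch of that loss with random tensors standing in for encoder outputs; the shapes and temperature value are illustrative.

```python
# Minimal CLIP-style symmetric contrastive loss over a batch of image/text pairs.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature           # (batch, batch) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i = F.cross_entropy(logits, targets)                 # image -> text direction
    loss_t = F.cross_entropy(logits.t(), targets)             # text -> image direction
    return (loss_i + loss_t) / 2

# Usage with random embeddings standing in for image/text encoder outputs.
print(clip_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512)))
```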
- Stanford Town: Generative Agents: Interactive Simulacra of Human Behavior. UIST'2023. paper
- OSWorld: OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments. Arxiv'2024. paper
- Hugging Face course. https://huggingface.co/learn (a minimal usage example follows this list)
- LLaMA Factory. https://github.com/hiyouga/LLaMA-Factory
- DeepSpeed. https://github.com/microsoft/DeepSpeed
- trlx. https://github.com/CarperAI/trlx
- Prompt Engineering Update. https://github.com/thunlp/PromptPapers
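For getting started with the tools above, here is a minimal text-generation example using the `transformers` pipeline covered in the Hugging Face course; the model name is only an example, and any causal LM on the Hub works the same way.

```python
# Minimal text-generation example with the Hugging Face `transformers` pipeline.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # example model; swap in any causal LM
output = generator("Large language models are", max_new_tokens=30, num_return_sequences=1)
print(output[0]["generated_text"])
```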