OpenAI's initial attempt at applying RLHF in the NLP domain
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
The Anthropic team's attempt to use RLHF to tackle the harmless-assistant problem
OpenAI's further attempt at applying RLHF in the NLP domain, and the predecessor of ChatGPT
- Constitutional AI: Harmlessness from AI Feedback
Proposes AI feedback as a way to address the low efficiency of collecting human feedback
- Scaling Laws for Reward Model Overoptimization
A paper that analyzes the reward model in detail