huggingface · souzatharsis · Dec 15, 2024 · Dec 15, 2024
diff --git a/2_preference_alignment/notebooks/smolk12/README.md b/2_preference_alignment/notebooks/smolk12/README.md
@@ -0,0 +1,5 @@
+In this case study, we demonstrate how to use Direct Preference Optimization (DPO) to align a language model with a user-provided policy, and how to further automate the process with synthetic data generation and LLM-as-judge evaluation.
+
+The case study follows Acme Inc., a company dedicated to democratizing access to computer science education for K-12 students. Acme Inc. is building a chatbot named smolK-12, a small open-source LLM designed specifically for K-12 students.
+
+We’ll explore how to align a language model with Acme Inc.’s policy to ensure its LLM-powered applications are safe and appropriate for K-12 students.