Paper collection on building and evaluating language model agents via executable language grounding
Updated Apr 29, 2024
Code & data for ICLR 2024 spotlight paper: 🍯MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data
RECKONING is a bi-level learning algorithm that improves language models' reasoning ability by folding contextual knowledge into parametric knowledge through back-propagation.
RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models