Spotlight "instruction following" Papers
2 papers found
Conference
Checklists Are Better Than Reward Models For Aligning Language Models
Vijay Viswanathan, Yanchao Sun, Xiang Kong et al.
NeurIPS 2025 | spotlight | arXiv:2507.18624
32 citations
Fixing It in Post: A Comparative Study of LLM Post-Training Data Quality and Model Performance
Aladin Djuhera, Swanand Kadhe, Syed Zawad et al.
NeurIPS 2025 | spotlight | arXiv:2506.06522