Papers by W. Bradley Knox
2 papers found
Conference
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Michael Zhang, W. Bradley Knox, Eunsol Choi
ICLR 2025, arXiv:2410.13788
37 citations
Contrastive Preference Learning: Learning from Human Feedback without Reinforcement Learning
Joey Hejna, Rafael Rafailov, Harshit Sikchi et al.
ICLR 2024