Papers by Ahmed H Awadallah
3 papers found
Conference
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
Tengyang Xie, Dylan Foster, Akshay Krishnamurthy et al.
ICLR 2025, arXiv:2405.21046
79 citations
Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing
Dujian Ding, Ankur Mallick, Chi Wang et al.
ICLR 2024, arXiv:2404.14618
208 citations
Teaching Language Models to Hallucinate Less with Synthetic Tasks
Erik Jones, Hamid Palangi, Clarisse Ribeiro et al.
ICLR 2024, arXiv:2310.06827
41 citations