Poster "pairwise comparison" Papers
2 papers found
Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
Felipe Maia Polo, Xinhe Wang, Mikhail Yurochkin et al.
NeurIPS 2025, arXiv:2508.12792
1 citation
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang, Lianmin Zheng, Ying Sheng et al.
ICML 2024, arXiv:2403.04132
1026 citations