"model comparison" Papers
6 papers found
A Curious Case of the Missing Measure: Better Scores and Worse Generation
Joseph Turian, Jordie Shier
ICLR 2025
Inverse decision-making using neural amortized Bayesian actors
Dominik Straub, Tobias Fabian Niehues, Jan Peters et al.
ICLR 2025 (oral), arXiv:2409.03710
1 citation
Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data
Florian Eddie Dorner, Vivian Nastl, Moritz Hardt
ICLR 2025
24 citations
Re-evaluating Open-ended Evaluation of Large Language Models
Si-Qi Liu, Ian Gemp, Luke Marris et al.
ICLR 2025, arXiv:2502.20170
7 citations
Representational Difference Explanations
Neehar Kondapaneni, Oisin Mac Aodha, Pietro Perona
NeurIPS 2025, arXiv:2505.23917
Representational Similarity via Interpretable Visual Concepts
Neehar Kondapaneni, Oisin Mac Aodha, Pietro Perona
ICLR 2025, arXiv:2503.15699
3 citations