Poster "model evaluation" Papers
9 papers found
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation
Yuhui Zhang, Yuchang Su, Yiming Liu et al.
CVPR 2025, arXiv:2501.03225
23 citations
Automated Model Discovery via Multi-modal & Multi-step Pipeline
Lee Jung-Mok, Nam Hyeon-Woo, Moon Ye-Bin et al.
NeurIPS 2025, arXiv:2509.25946
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Hengxiang Zhang, Songxin Zhang, Bingyi Jing et al.
ICLR 2025, arXiv:2410.10880
4 citations
Law of the Weakest Link: Cross Capabilities of Large Language Models
Ming Zhong, Aston Zhang, Xuewei Wang et al.
ICLR 2025, arXiv:2409.19951
12 citations
SelectFormer in Data Markets: Privacy-Preserving and Efficient Data Selection for Transformers with Multi-Party Computation
Xu Ouyang, Felix Xiaozhu Lin, Yangfeng Ji
ICLR 2025
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan
ICML 2024, arXiv:2406.01382
25 citations
Feedback Loops With Language Models Drive In-Context Reward Hacking
Alexander Pan, Erik Jones, Meena Jagadeesan et al.
ICML 2024, arXiv:2402.06627
60 citations
Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification
Martin Mihelich, François Castagnos, Charles Dognin
ICML 2024
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Fangyun Wei, Xi Chen, Lin Luo
ICML 2024, arXiv:2403.07872
13 citations