"language model evaluation" Papers
3 papers found
Conference
Pretraining on the Test Set Is No Longer All You Need: A Debate-Driven Approach to QA Benchmarks
Linbo Cao, Jinman Zhao
COLM 2025 paper arXiv:2507.17747
3 citations
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction
Yucheng Li, Frank Guerin, Chenghua Lin
AAAI 2024 paper arXiv:2312.12343
54 citations
Task Contamination: Language Models May Not Be Few-Shot Anymore
Changmao Li, Jeffrey Flanigan
AAAI 2024 paper arXiv:2312.16337
132 citations