"code generation evaluation" Papers
3 papers found
Conference
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain, King Han, Alex Gu et al.
ICLR 2025, arXiv:2403.07974
1108 citations
McEval: Massively Multilingual Code Evaluation
Linzheng Chai, Shukai Liu, Jian Yang et al.
ICLR 2025, arXiv:2406.07436
31 citations
Rethinking Verification for LLM Code Generation: From Generation to Testing
Zihan Ma, Taolin Zhang, Maosong Cao et al.
NeurIPS 2025, arXiv:2507.06920
7 citations