Papers by Fanjia Yan
2 papers found
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain, King Han, Alex Gu et al.
ICLR 2025, arXiv:2403.07974
1108 citations
The Berkeley Function Calling Leaderboard (BFCL): From Tool Use to Agentic Evaluation of Large Language Models
Shishir G. Patil, Huanzhi Mao, Fanjia Yan et al.
ICML 2025 (spotlight)