rleak.com - Spot the Future of AI Research

#1

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

Chaoyou Fu, Peixian Chen, Yunhang Shen et al.

NEURIPS 2025

1,277 citations

#2

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Qiying Yu, Zheng Zhang, Ruofei Zhu et al.

NEURIPS 2025

1,211 citations

#3

YOLOv12: Attention-Centric Real-Time Object Detectors

Yunjie Tian, Qixiang Ye, David Doermann

NEURIPS 2025

938 citations

#4

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Yang Yue, Zhiqi Chen, Rui Lu et al.

NEURIPS 2025

540 citations

#5

Gymnasium: A Standard Interface for Reinforcement Learning Environments

Mark Towers, Ariel Kwiatkowski, John Balis et al.

NEURIPS 2025

534 citations

#6

Large Language Diffusion Models

Shen Nie, Fengqi Zhu, Zebin You et al.

NEURIPS 2025

403 citations

#7

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Jingcheng Hu, Yinmin Zhang, Qi Han et al.

NEURIPS 2025

347 citations

#8

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Shenzhi Wang, Le Yu, Chang Gao et al.

NEURIPS 2025

305 citations

#9

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh Vahid et al.

NEURIPS 2025

277 citations

#10

Video-R1: Reinforcing Video Reasoning in MLLMs

Kaituo Feng, Kaixiong Gong, Bohao Li et al.

NEURIPS 2025

257 citations

#11

A-Mem: Agentic Memory for LLM Agents

Wujiang Xu, Zujie Liang, Kai Mei et al.

NEURIPS 2025

250 citations

#12

Flow-GRPO: Training Flow Matching Models via Online RL

Jie Liu, Gongye Liu, Jiajun Liang et al.

NEURIPS 2025

221 citations

#13

Why Do Multi-Agent LLM Systems Fail?

Mert Cemri, Melissa Z Pan, Shuyi Yang et al.

NEURIPS 2025

204 citations

#14

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Xiaoxi Li, Jiajie Jin, Guanting Dong et al.

NEURIPS 2025

198 citations

#15

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Yiping Wang, Qing Yang, Zhiyuan Zeng et al.

NEURIPS 2025

190 citations

#16

Mean Flows for One-step Generative Modeling

Zhengyang Geng, Mingyang Deng, Xingjian Bai et al.

NEURIPS 2025

185 citations

#17

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Haozhe Wang, Chao Qu, Zuming Huang et al.

NEURIPS 2025

183 citations

#18

Training Language Models to Reason Efficiently

Daman Arora, Andrea Zanette

NEURIPS 2025

178 citations

#19

ToolRL: Reward is All Tool Learning Needs

Cheng Qian, Emre Can Acikgoz, Qi He et al.

NEURIPS 2025

178 citations

#20

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Jonas Geiping, Sean McLeish, Neel Jain et al.

NEURIPS 2025

158 citations
