rleak.com - Spot the Future of AI Research

#1

WorldSimBench: Towards Video Generation Models as World Simulators

Yiran Qin, Zhelun Shi, Jiwen Yu et al.

ICML 2025

842 citations

#2

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Tianzhe Chu, Yuexiang Zhai, Jihan Yang et al.

ICML 2025

442 citations

#3

From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

Tianle Li, Wei-Lin Chiang, Evan Frick et al.

ICML 2025

357 citations

#4

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Xinyu Guan, Li Lyna Zhang, Yifei Liu et al.

ICML 2025

268 citations

#5

SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference

Yuan Zhang, Chun-Kai Fan, Junpeng Ma et al.

ICML 2025

214 citations

#6

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Yu Sun, Xinhao Li, Karan Dalal et al.

ICML 2025

199 citations

#7

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Xiaoqian Shen, Yunyang Xiong, Changsheng Zhao et al.

ICML 2025

184 citations

#8

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Yiheng Xu, Zekun Wang, Junli Wang et al.

ICML 2025

182 citations

#9

MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

Yuang Zhang, Jiaxi Gu, Li-Wen Wang et al.

ICML 2025

161 citations

#10

Training Software Engineering Agents and Verifiers with SWE-Gym

Jiayi Pan, Xingyao Wang, Graham Neubig et al.

ICML 2025

156 citations

#11

Layer by Layer: Uncovering Hidden Representations in Language Models

Oscar Skean, Md Rifat Arefin, Dan Zhao et al.

ICML 2025

145 citations

#12

AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs

Anselm Paulus, Arman Zharmagambetov, Chuan Guo et al.

ICML 2025

132 citations

#13

Imagine While Reasoning in Space: Multimodal Visualization-of-Thought

Chengzu Li, Wenshan Wu, Huanyu Zhang et al.

ICML 2025

131 citations

#14

DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning

Gaoyue Zhou, Hengkai Pan, Yann LeCun et al.

ICML 2025

126 citations

#15

How Far Is Video Generation from World Model: A Physical Law Perspective

Bingyi Kang, Yang Yue, Rui Lu et al.

ICML 2025

126 citations

#16

Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models

Lucy Xiaoyang Shi, brian ichter, Michael Equi et al.

ICML 2025

120 citations

#17

A General Framework for Inference-time Scaling and Steering of Diffusion Models

Raghav Singhal, Zachary Horvitz, Ryan Teehan et al.

ICML 2025

119 citations

#18

Taming Rectified Flow for Inversion and Editing

Jiangshan Wang, Junfu Pu, Zhongang Qi et al.

ICML 2025

119 citations

#19

AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders

Zhengxuan Wu, Aryaman Arora, Atticus Geiger et al.

ICML 2025

118 citations

#20

Free Process Rewards without Process Labels

Lifan Yuan, Wendi Li, Huayu Chen et al.

ICML 2025

117 citations

Top Papers in ICML 2025
