"reasoning benchmarks" Papers

14 papers found

Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking

Heli Ben-Hamu, Itai Gat, Daniel Severo et al.

NEURIPS 2025, arXiv:2505.24857, 54 citations

Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models

Zekai Zhao, Qi Liu, Kun Zhou et al.

NEURIPS 2025 (spotlight), arXiv:2505.17697, 7 citations

As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss

Xin Mao, Huimin Xu, Feng-Lin Li et al.

ICLR 2025, arXiv:2410.04834, 3 citations

HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts

Neil He, Rishabh Anand, Hiren Madhu et al.

NEURIPS 2025, arXiv:2505.24722, 9 citations

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Seo Hyun Kim, Sunwoo Hong, Hojung Jung et al.

NEURIPS 2025 (spotlight), arXiv:2511.05664, 6 citations

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Jonas Geiping, Sean McLeish, Neel Jain et al.

NEURIPS 2025 (spotlight), arXiv:2502.05171, 158 citations

SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought

Guanghao Li, Wenhao Jiang, Mingfeng Chen et al.

NEURIPS 2025, arXiv:2505.24181, 3 citations

SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data

Wenkai Fang, Shunyu Liu, Yang Zhou et al.

NEURIPS 2025, arXiv:2505.20347, 25 citations

SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning

Rui Pan, Yinwei Dai, Zhihao Zhang et al.

NEURIPS 2025, arXiv:2504.07891, 37 citations

SPMDM: Enhancing Masked Diffusion Models through Simplifying Sampling Path

Yichen Zhu, Weiyu Chen, James Kwok et al.

NEURIPS 2025

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.

NEURIPS 2025, arXiv:2506.08989, 16 citations

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Haipeng Luo, Qingfeng Sun, Can Xu et al.

ICLR 2025, arXiv:2308.09583, 655 citations

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

Kaituo Feng, Changsheng Li, Xiaolu Zhang et al.

ICML 2024, arXiv:2405.16064, 16 citations

Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution

Chrisantha Fernando, Dylan Banarse, Henryk Michalewski et al.

ICML 2024, arXiv:2309.16797, 364 citations