"reasoning benchmarks" Papers
14 papers found
Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking
Heli Ben-Hamu, Itai Gat, Daniel Severo et al.
NEURIPS 2025, arXiv:2505.24857
54 citations
Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models
Zekai Zhao, Qi Liu, Kun Zhou et al.
NEURIPS 2025 (spotlight), arXiv:2505.17697
7 citations
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss
Xin Mao, Huimin Xu, Feng-Lin Li et al.
ICLR 2025, arXiv:2410.04834
3 citations
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Neil He, Rishabh Anand, Hiren Madhu et al.
NEURIPS 2025, arXiv:2505.24722
9 citations
KLASS: KL-Guided Fast Inference in Masked Diffusion Models
Seo Hyun Kim, Sunwoo Hong, Hojung Jung et al.
NEURIPS 2025 (spotlight), arXiv:2511.05664
6 citations
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Jonas Geiping, Sean McLeish, Neel Jain et al.
NEURIPS 2025 (spotlight), arXiv:2502.05171
158 citations
SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought
Guanghao Li, Wenhao Jiang, Mingfeng Chen et al.
NEURIPS 2025, arXiv:2505.24181
3 citations
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
Wenkai Fang, Shunyu Liu, Yang Zhou et al.
NEURIPS 2025, arXiv:2505.20347
25 citations
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
Rui Pan, Yinwei Dai, Zhihao Zhang et al.
NEURIPS 2025, arXiv:2504.07891
37 citations
SPMDM: Enhancing Masked Diffusion Models through Simplifing Sampling Path
Yichen Zhu, Weiyu Chen, James Kwok et al.
NEURIPS 2025
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning
Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.
NEURIPS 2025, arXiv:2506.08989
16 citations
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo, Qingfeng Sun, Can Xu et al.
ICLR 2025, arXiv:2308.09583
655 citations
Keypoint-based Progressive Chain-of-Thought Distillation for LLMs
Kaituo Feng, Changsheng Li, Xiaolu Zhang et al.
ICML 2024, arXiv:2405.16064
16 citations
Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution
Chrisantha Fernando, Dylan Banarse, Henryk Michalewski et al.
ICML 2024, arXiv:2309.16797
364 citations