Poster "reasoning benchmarks" Papers

11 papers found

Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking

Heli Ben-Hamu, Itai Gat, Daniel Severo et al.

NEURIPS 2025 · arXiv:2505.24857 · 47 citations

As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss

Xin Mao, Huimin Xu, Feng-Lin Li et al.

ICLR 2025 · arXiv:2410.04834 · 3 citations

HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts

Neil He, Rishabh Anand, Hiren Madhu et al.

NEURIPS 2025 · arXiv:2505.24722 · 8 citations

SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought

Guanghao Li, Wenhao Jiang, Mingfeng Chen et al.

NEURIPS 2025 · arXiv:2505.24181 · 3 citations

SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data

Wenkai Fang, Shunyu Liu, Yang Zhou et al.

NEURIPS 2025 · arXiv:2505.20347 · 19 citations

SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning

Rui Pan, Yinwei Dai, Zhihao Zhang et al.

NEURIPS 2025 · arXiv:2504.07891 · 37 citations

SPMDM: Enhancing Masked Diffusion Models through Simplifing Sampling Path

Yichen Zhu, Weiyu Chen, James Kwok et al.

NEURIPS 2025

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.

NEURIPS 2025 · arXiv:2506.08989 · 14 citations

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Haipeng Luo, Qingfeng Sun, Can Xu et al.

ICLR 2025 · arXiv:2308.09583 · 644 citations

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

Kaituo Feng, Changsheng Li, Xiaolu Zhang et al.

ICML 2024 · arXiv:2405.16064

Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution

Chrisantha Fernando, Dylan Banarse, Henryk Michalewski et al.

ICML 2024 · arXiv:2309.16797