"multi-step reasoning" Papers

29 papers found

Accurate and Regret-Aware Numerical Problem Solver for Tabular Question Answering

Yuxiang Wang, Jianzhong Qi, Junhao Gan

AAAI 2025 (paper), arXiv:2410.12846
11 citations

Automated Model Discovery via Multi-modal & Multi-step Pipeline

Lee Jung-Mok, Nam Hyeon-Woo, Moon Ye-Bin et al.

NEURIPS 2025, arXiv:2509.25946

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Clemencia Siro, Guy Gur-Ari, Gaurav Mishra et al.

ICLR 2025 (oral), arXiv:2206.04615
2226 citations

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting

Chengyou Jia, Changliang Xia, Zhuohang Dang et al.

CVPR 2025, arXiv:2411.17176
7 citations

CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark

Jian Wu, Linyi Yang, Zhen Wang et al.

ICLR 2025, arXiv:2402.11924
14 citations

CoRe: Benchmarking LLMs’ Code Reasoning Capabilities through Static Analysis Tasks

Danning Xie, Mingwei Zheng, Xuwei Liu et al.

NEURIPS 2025 (spotlight), arXiv:2507.05269
13 citations

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal et al.

ICLR 2025, arXiv:2407.01725
40 citations

Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation

Yudi Shi, Shangzhe Di, Qirui Chen et al.

CVPR 2025, arXiv:2412.01694
23 citations

From Style to Facts: Mapping the Boundaries of Knowledge Injection with Finetuning

Eric Zhao, Pranjal Awasthi, Nika Haghtalab

NEURIPS 2025, arXiv:2503.05919
4 citations

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.

CVPR 2025, arXiv:2503.06514
8 citations

Language Models can Self-Improve at State-Value Estimation for Better Search

Ethan Mendes, Alan Ritter

NEURIPS 2025 (spotlight), arXiv:2503.02878
4 citations

Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory

Nikola Zubic, Federico Soldà, Aurelio Sulser et al.

ICLR 2025, arXiv:2405.16674
18 citations

PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models

Shi Qiu, Shaoyang Guo, Zhuo-Yang Song et al.

NEURIPS 2025, arXiv:2504.16074
30 citations

PlanU: Large Language Model Reasoning through Planning under Uncertainty

Ziwei Deng, Mian Deng, Chenjing Liang et al.

NEURIPS 2025, arXiv:2510.18442

ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents

Zhenyu Zhang, Tianyi Chen, Weiran Xu et al.

NEURIPS 2025, arXiv:2510.23822
3 citations

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

Enshen Zhou, Jingkun An, Cheng Chi et al.

NEURIPS 2025, arXiv:2506.04308
58 citations

SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

Yifu Guo, Jiaye Lin, Huacan Wang et al.

NEURIPS 2025, arXiv:2508.02085
22 citations

SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents

Wanxin Tian, Shijie Zhang, Kevin Zhang et al.

NEURIPS 2025, arXiv:2506.21669
6 citations

The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?

Zhenheng Tang, Xiang Liu, Qian Wang et al.

ICLR 2025, arXiv:2502.17535
11 citations

Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens

Xixian Yong, Xiao Zhou, Yingying Zhang et al.

NEURIPS 2025 (spotlight), arXiv:2505.18237
15 citations

ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools

Shaofeng Yin, Ting Lei, Yang Liu

ICCV 2025, arXiv:2508.03284
4 citations

Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective

Bo Ni, Yu Wang, Lu Cheng et al.

AAAI 2025 (paper), arXiv:2410.08985
12 citations

Training Large Language Models for Retrieval-Augmented Question Answering through Backtracking Correction

Huawen Feng, Zekun Yao, Junhao Zheng et al.

ICLR 2025
1 citation

Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment

Yuze Zhao, Tianyun Ji, Wenjun Feng et al.

ICLR 2025, arXiv:2502.13170
6 citations

VITED: Video Temporal Evidence Distillation

Yujie Lu, Yale Song, Lorenzo Torresani et al.

CVPR 2025, arXiv:2503.12855
2 citations

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Jiashuo Yu, Yue Wu, Meng Chu et al.

ICCV 2025, arXiv:2506.10857
9 citations

WebDancer: Towards Autonomous Information Seeking Agency

Jialong Wu, Baixuan Li, Runnan Fang et al.

NEURIPS 2025, arXiv:2505.22648
98 citations

AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training

Ziyu Wan, Xidong Feng, Muning Wen et al.

ICML 2024, arXiv:2309.17179
304 citations

Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation

Xinyi Wang, Alfonso Amayuelas, Kexun Zhang et al.

ICML 2024, arXiv:2402.03268
25 citations