Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought

32 citations

arXiv:2505.12514

#183 in NeurIPS 2025 of 5858 papers

Top Authors

Hanlin Zhu, Shibo Hao, Zhiting Hu, Jiantao Jiao, Stuart J Russell, Yuandong Tian

Abstract

Large Language Models (LLMs) have demonstrated remarkable performance in many applications, including challenging reasoning problems via chain-of-thought (CoT) techniques that generate "thinking tokens" before answering the questions. While existing theoretical works demonstrate that CoT with discrete tokens boosts the capability of LLMs, recent work on continuous CoT lacks a theoretical understanding of why it outperforms discrete counterparts in various reasoning tasks, such as directed graph reachability, a fundamental graph reasoning problem that includes many practical domain applications as special cases. In this paper, we prove that a two-layer transformer with $D$ steps of continuous CoT can solve the directed graph reachability problem, where $D$ is the diameter of the graph, while the best known result of constant-depth transformers with discrete CoT requires $O(n^2)$ decoding steps where $n$ is the number of vertices ($D…
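The directed graph reachability problem named in the abstract can be illustrated with an ordinary set-based frontier expansion. The following is a minimal sketch, not the paper's transformer construction: the function name `reachable_within`, its parameters, and the example graph are all hypothetical. It only shows the classical fact that expanding every frontier vertex in parallel reaches all vertices at distance at most $k$ after $k$ rounds, so a number of rounds equal to the longest shortest-path distance is enough to decide reachability.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def reachable_within(
    edges: Iterable[Tuple[Hashable, Hashable]],
    source: Hashable,
    target: Hashable,
    max_steps: int,
) -> Tuple[bool, int]:
    """Decide whether `target` is reachable from `source` in a directed graph.

    Hypothetical helper for illustration only. Each round expands the whole
    current frontier at once. Returns (reached, rounds_used).
    """
    # Build an adjacency map: vertex -> set of direct successors.
    adjacency: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)

    visited: Set[Hashable] = {source}
    frontier: Set[Hashable] = {source}
    steps = 0

    # Stop when the target is found, the frontier is empty, or the budget is spent.
    while frontier and target not in visited and steps < max_steps:
        # One parallel round: all successors of the frontier, minus known vertices.
        frontier = {v for u in frontier for v in adjacency.get(u, ())} - visited
        visited |= frontier
        steps += 1

    return target in visited, steps


# Example graph (an assumption for illustration): a path 0 -> 1 -> 2 -> 3, plus 4 -> 0.
edges = [(0, 1), (1, 2), (2, 3), (4, 0)]
print(reachable_within(edges, 0, 3, max_steps=3))  # (True, 3)
print(reachable_within(edges, 0, 4, max_steps=3))  # (False, 3)
```

In this example vertex 3 lies three edges from vertex 0, so three rounds are needed and sufficient, while vertex 4 has no incoming path from vertex 0 and is never reached.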

Citation History

[Chart: citation count over time, Jan 25, 2026 to Feb 13, 2026]