"next-token prediction" Papers

18 papers found

Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis

Mohammad Saleh Refahi, Mahdi Abavisani, Bahrad Sokhansanj et al.

NEURIPS 2025, arXiv:2507.09378

Correlation and Navigation in the Vocabulary Key Representation Space of Language Models

Letian Peng, Chenyang An, Jingbo Shang

ICLR 2025, arXiv:2410.02284

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding

Chenxin Tao, Shiqian Su, Xizhou Zhu et al.

CVPR 2025, arXiv:2412.16158

Implicit Search via Discrete Diffusion: A Study on Chess

Jiacheng Ye, Zhenyu Wu, Jiahui Gao et al.

ICLR 2025, arXiv:2502.19805

Lines of Thought in Large Language Models

Raphaël Sarfati, Toni Liu, Nicolas Boulle et al.

ICLR 2025, arXiv:2410.01545

Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation

Xiuyu Yang, Shuhan Tan, Philipp Kraehenbuehl

ICCV 2025, arXiv:2506.17213

OmniGen-AR: AutoRegressive Any-to-Image Generation

Junke Wang, Xun Wang, Qiushan Guo et al.

NEURIPS 2025

On the Bias of Next-Token Predictors Toward Systematically Inefficient Reasoning: A Shortest-Path Case Study

Riccardo Alberghi, Elizaveta Demyanenko, Luca Biggio et al.

NEURIPS 2025, arXiv:2507.05362

PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs

Teng Zhou, Xiaoyu Zhang, Yongchuan Tang

ICCV 2025 (highlight), arXiv:2411.15867

Re-Thinking Inverse Graphics With Large Language Models

Haiwen Feng, Michael J Black, Weiyang Liu et al.

ICLR 2025, arXiv:2404.15228

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Vaishnavh Nagarajan, Chen Wu, Charles Ding et al.

ICML 2025 (oral), arXiv:2504.15266

Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law

Frederik Kunstner, Francis Bach

NEURIPS 2025, arXiv:2505.19227

VladVA: Discriminative Fine-tuning of LVLMs

Yassine Ouali, Adrian Bulat, Alexandros Xenos et al.

CVPR 2025, arXiv:2412.04378

Auto-Regressive Next-Token Predictors are Universal Learners

Eran Malach

ICML 2024, arXiv:2309.06979

PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs

Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek et al.

CVPR 2024, arXiv:2402.08657

Tandem Transformers for Inference Efficient LLMs

Aishwarya P S, Pranav Nair, Yashas Samaga et al.

ICML 2024, arXiv:2402.08644

The Pitfalls of Next-Token Prediction

Gregor Bachmann, Vaishnavh Nagarajan

ICML 2024, arXiv:2403.06963

139 citations

Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation

Xinyi Wang, Alfonso Amayuelas, Kexun Zhang et al.

ICML 2024, arXiv:2402.03268
