"next-token prediction" Papers
18 papers found
Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis
Mohammad Saleh Refahi, Mahdi Abavisani, Bahrad Sokhansanj et al.
Correlation and Navigation in the Vocabulary Key Representation Space of Language Models
Letian Peng, Chenyang An, Jingbo Shang
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu et al.
Implicit Search via Discrete Diffusion: A Study on Chess
Jiacheng Ye, Zhenyu Wu, Jiahui Gao et al.
Lines of Thought in Large Language Models
Raphaël Sarfati, Toni Liu, Nicolas Boulle et al.
Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation
Xiuyu Yang, Shuhan Tan, Philipp Kraehenbuehl
OmniGen-AR: AutoRegressive Any-to-Image Generation
Junke Wang, Xun Wang, Qiushan Guo et al.
On the Bias of Next-Token Predictors Toward Systematically Inefficient Reasoning: A Shortest-Path Case Study
Riccardo Alberghi, Elizaveta Demyanenko, Luca Biggio et al.
PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs
Teng Zhou, Xiaoyu Zhang, Yongchuan Tang
Re-Thinking Inverse Graphics With Large Language Models
Haiwen Feng, Michael J Black, Weiyang Liu et al.
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
Vaishnavh Nagarajan, Chen Wu, Charles Ding et al.
Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law
Frederik Kunstner, Francis Bach
VladVA: Discriminative Fine-tuning of LVLMs
Yassine Ouali, Adrian Bulat, Alexandros Xenos et al.
Auto-Regressive Next-Token Predictors are Universal Learners
Eran Malach
PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs
Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek et al.
Tandem Transformers for Inference Efficient LLMs
Aishwarya P S, Pranav Nair, Yashas Samaga et al.
The Pitfalls of Next-Token Prediction
Gregor Bachmann, Vaishnavh Nagarajan
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation
Xinyi Wang, Alfonso Amayuelas, Kexun Zhang et al.