"sequence modeling" Papers
43 papers found
Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data
Tianyi Chen, Pengxiao Lin, Zhiwei Wang et al.
Adaptive Computation Pruning for the Forgetting Transformer
Zhixuan Lin, Johan Obando-Ceron, Xu Owen He et al.
BlockScan: Detecting Anomalies in Blockchain Transactions
Jiahao Yu, Xian Wu, Hao Liu et al.
Competition Dynamics Shape Algorithmic Phases of In-Context Learning
Core Francisco Park, Ekdeep Singh Lubana, Hidenori Tanaka
Controllable Generation via Locally Constrained Resampling
Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck
DeltaProduct: Improving State-Tracking in Linear RNNs via Householder Products
Julien Siems, Timur Carstensen, Arber Zela et al.
Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient
Wenlong Wang, Ivana Dusparic, Yucheng Shi et al.
EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling
Jia-Hua Lee, Bor-Jiun Lin, Wei-Fang Sun et al.
Enhancing the Maximum Effective Window for Long-Term Time Series Forecasting
Jiahui Zhang, Zhengyang Zhou, Wenjie Du et al.
Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models
Yasha Ektefaie, Andrew Shen, Lavik Jain et al.
Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer
Improving Bilinear RNN with Closed-loop Control
Jiaxi Hu, Yongqi Pan, Jusen Du et al.
Learning Video-Conditioned Policy on Unlabelled Data with Joint Embedding Predictive Transformer
Hao Luo, Zongqing Lu
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Nikola Zubic, Federico Soldà, Aurelio Sulser et al.
Longhorn: State Space Models are Amortized Online Learners
Bo Liu, Rui Wang, Lemeng Wu et al.
MeshAnything V2: Artist-Created Mesh Generation with Adjacent Mesh Tokenization
Yiwen Chen, Yikai Wang, Yihao Luo et al.
Neural Attention Search
Difan Deng, Marius Lindauer
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.
Plug, Play, and Generalize: Length Extrapolation with Pointer-Augmented Neural Memory
Svetha Venkatesh, Kien Do, Hung Le et al.
PointRWKV: Efficient RWKV-Like Model for Hierarchical Point Cloud Learning
Qingdong He, Jiangning Zhang, Jinlong Peng et al.
Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling
Mónika Farsang, Radu Grosu
SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling
Yizhao Gao, Zhichen Zeng, DaYou Du et al.
Selective Induction Heads: How Transformers Select Causal Structures in Context
Francesco D'Angelo, Francesco Croce, Nicolas Flammarion
State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
Naoki Nishikawa, Taiji Suzuki
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.
Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling
Jiawei Xu, Rui Yang, Shuang Qiu et al.
Tensor Product Attention Is All You Need
Yifan Zhang, Yifeng Liu, Huizhuo Yuan et al.
Unsupervised Meta-Learning via In-Context Learning
Anna Vettoruzzo, Lorenzo Braccaioli, Joaquin Vanschoren et al.
What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
Chanakya Ekbote, Ashok Vardhan Makkuva, Marco Bondaschi et al.
ZeroS: Zero-Sum Linear Attention for Efficient Transformers
Jiecheng Lu, Xu Han, Yan Sun et al.
An Information-Theoretic Analysis of In-Context Learning
Hong Jun Jeon, Jason Lee, Qi Lei et al.
Efficient World Models with Context-Aware Tokenization
Vincent Micheli, Eloi Alonso, François Fleuret
From Generalization Analysis to Optimization Designs for State Space Models
Fusheng Liu, Qianxiao Li
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason Lee
Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning
Zeyang Liu, Lipeng Wan, Xinrui Yang et al.
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data
Shufan Li, Aditya Grover, Harkanwar Singh
Reinformer: Max-Return Sequence Modeling for Offline RL
Zifeng Zhuang, Dengyun Peng, Jinxin Liu et al.
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi, David Brandfonbrener, Sham Kakade et al.
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
Yi Ma, Jianye Hao, Hebin Liang et al.
Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach
Ziyin Zhang, Ning Lu, Minghui Liao et al.
Timer: Generative Pre-trained Transformers Are Large Time Series Models
Yong Liu, Haoran Zhang, Chenyu Li et al.
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
Antonio Orvieto, Soham De, Caglar Gulcehre et al.
Vocabulary for Universal Approximation: A Linguistic Perspective of Mapping Compositions
Yongqiang Cai