"transformer architecture" Papers
335 papers found • Page 2 of 7
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang, Haoran Chen, Haoyu Zhao et al.
Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution
Karam Park, Jae Woong Soh, Nam Ik Cho
Efficient Concertormer for Image Deblurring and Beyond
Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.
Efficient Time Series Processing for Transformers and State-Space Models through Token Merging
Leon Götz, Marcel Kollovieh, Stephan Günnemann et al.
Emergence of Linear Truth Encodings in Language Models
Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.
End-to-End HOI Reconstruction Transformer with Graph-based Encoding
Zhenrong Wang, Qi Zheng, Sihan Ma et al.
End-to-End Implicit Neural Representations for Classification
Alexander Gielisse, Jan van Gemert
Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels
Pierre Vuillecard, Jean-Marc Odobez
Enhancing Image Restoration Transformer via Adaptive Translation Equivariance
JiaKui Hu, Zhengjian Yao, Lujia Jin et al.
Enhancing Transformers Through Conditioned Embedded Tokens
Hemanth Saratchandran, Simon Lucey
ESCAPE: Equivariant Shape Completion via Anchor Point Encoding
Burak Bekci, Nassir Navab, Federico Tombari et al.
Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models
Yasha Ektefaie, Andrew Shen, Lavik Jain et al.
Exact Expressive Power of Transformers with Padding
Will Merrill, Ashish Sabharwal
FaceXFormer: A Unified Transformer for Facial Analysis
Kartik Narayan, Vibashan VS, Rama Chellappa et al.
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.
First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training
Gyudong Kim, Hyukju Na, Jin Kim et al.
Flatten Graphs as Sequences: Transformers are Scalable Graph Generators
Dexiong Chen, Markus Krimmel, Karsten Borgwardt
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent, Kyle Hsu, Justin Johnson et al.
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
Jiale Xu, Shenghua Gao, Ying Shan
From Attention to Activation: Unraveling the Enigmas of Large Language Models
Prannay Kaul, Chengcheng Ma, Ismail Elezi et al.
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish, John Kirchenbauer, David Miller et al.
Generation as Search Operator for Test-Time Scaling of Diffusion-based Combinatorial Optimization
Yang Li, Lvda Chen, Haonan Wang et al.
GeoAggregator: An Efficient Transformer Model for Geo-Spatial Tabular Data
Rui Deng, Ziqi Li, Mingshu Wang
GEOPARD: Geometric Pretraining for Articulation Prediction in 3D Shapes
Pradyumn Goyal, Dmitrii Petrov, Sheldon Andrews et al.
Grammar Reinforcement Learning: path and cycle counting in graphs with a Context-Free Grammar and Transformer approach
Jason Piquenot, Maxime Berar, Romain Raveaux et al.
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
Yuto Matsubara, Ko Nishino
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Neil He, Rishabh Anand, Hiren Madhu et al.
Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems
Saeed Amizadeh, Sara Abdali, Yinheng Li et al.
How Data Mixing Shapes In-Context Learning: Asymptotic Equivalence for Transformers with MLPs
Samet Demir, Zafer Dogan
Hyperbolic Genome Embeddings
Raiyan Khan, Philippe Chlenski, Itsik Pe'er
Impact of Layer Norm on Memorization and Generalization in Transformers
Rishi Singhal, Jung-Eun Kim
Improving Formal Reasoning of Transformer with State Stack
Kechi Zhang, Ge Li, Jia Li et al.
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu, Yuan Zhang, Yiming Dong et al.
Improving Transformer Based Line Segment Detection with Matched Predicting and Re-ranking
Xin Tong, Shi Peng, Baojie Tian et al.
InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling
Muhammad Gohar Javed, Chuan Guo, Li Cheng et al.
JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts
Taein Son, Soo Won Seo, Jisong Kim et al.
Kolmogorov-Arnold Transformer
Xingyi Yang, Xinchao Wang
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph, Jerome Sieber, Melanie Zeilinger et al.
Language Models Are Implicitly Continuous
Samuele Marro, Davide Evangelista, X. Huang et al.
Learning Crossmodal Interaction Patterns via Attributed Bipartite Graphs for Single-Cell Omics
Xiaotang Wang, Xuanwei Lin, Yun Zhu et al.
LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Shen Zhang, Siyuan Liang, Yaning Tan et al.
Length Generalization via Auxiliary Tasks
Pranjal Awasthi, Anupam Gupta, Ravi Kumar
LightPROF: A Lightweight Reasoning Framework for Large Language Model on Knowledge Graph
Tu Ao, Yanhua Yu, Yuling Wang et al.
Limitations of Normalization in Attention
Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova et al.
Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials
Yifan Pu, Jixuan Ying, Qixiu Li et al.
LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields
Zhengqin Li, Dilin Wang, Ka Chen et al.
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
Weijia Shi, Xiaochuang Han, Chunting Zhou et al.
Longhorn: State Space Models are Amortized Online Learners
Bo Liu, Rui Wang, Lemeng Wu et al.
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
Haian Jin, Hanwen Jiang, Hao Tan et al.
Memory Efficient Matting with Adaptive Token Routing
Yiheng Lin, Yihan Hu, Chenyi Zhang et al.