"transformer architecture" Papers
335 papers found • Page 2 of 7
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang, Haoran Chen, Haoyu Zhao et al.
Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution
Karam Park, Jae Woong Soh, Nam Ik Cho
Efficient Concertormer for Image Deblurring and Beyond
Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.
Efficient Time Series Processing for Transformers and State-Space Models through Token Merging
Leon Götz, Marcel Kollovieh, Stephan Günnemann et al.
Emergence of Linear Truth Encodings in Language Models
Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.
End-to-End HOI Reconstruction Transformer with Graph-based Encoding
Zhenrong Wang, Qi Zheng, Sihan Ma et al.
End-to-End Implicit Neural Representations for Classification
Alexander Gielisse, Jan van Gemert
Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels
Pierre Vuillecard, Jean-Marc Odobez
Enhancing Image Restoration Transformer via Adaptive Translation Equivariance
JiaKui Hu, Zhengjian Yao, Lujia Jin et al.
Enhancing Transformers Through Conditioned Embedded Tokens
Hemanth Saratchandran, Simon Lucey
ESCAPE: Equivariant Shape Completion via Anchor Point Encoding
Burak Bekci, Nassir Navab, Federico Tombari et al.
Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models
Yasha Ektefaie, Andrew Shen, Lavik Jain et al.
Exact Expressive Power of Transformers with Padding
Will Merrill, Ashish Sabharwal
FaceXFormer: A Unified Transformer for Facial Analysis
Kartik Narayan, Vibashan VS, Rama Chellappa et al.
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammed Dabbah, Omri Puny et al.
First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training
Gyudong Kim, Hyukju Na, Jin Kim et al.
Flatten Graphs as Sequences: Transformers are Scalable Graph Generators
Dexiong Chen, Markus Krimmel, Karsten Borgwardt
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent, Kyle Hsu, Justin Johnson et al.
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
Jiale Xu, Shenghua Gao, Ying Shan
From Attention to Activation: Unraveling the Enigmas of Large Language Models
Prannay Kaul, Chengcheng Ma, Ismail Elezi et al.
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish, John Kirchenbauer, David Miller et al.
Generation as Search Operator for Test-Time Scaling of Diffusion-based Combinatorial Optimization
Yang Li, Lvda Chen, Haonan Wang et al.
GeoAggregator: An Efficient Transformer Model for Geo-Spatial Tabular Data
Rui Deng, Ziqi Li, Mingshu Wang
GEOPARD: Geometric Pretraining for Articulation Prediction in 3D Shapes
Pradyumn Goyal, Dmitrii Petrov, Sheldon Andrews et al.
Grammar Reinforcement Learning: path and cycle counting in graphs with a Context-Free Grammar and Transformer approach
Jason Piquenot, Maxime Berar, Romain Raveaux et al.
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
Yuto Matsubara, Ko Nishino
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Neil He, Rishabh Anand, Hiren Madhu et al.
Hierarchical Self-Attention: Generalizing Neural Attention Mechanics to Multi-Scale Problems
Saeed Amizadeh, Sara Abdali, Yinheng Li et al.
How Data Mixing Shapes In-Context Learning: Asymptotic Equivalence for Transformers with MLPs
Samet Demir, Zafer Dogan
Hyperbolic Genome Embeddings
Raiyan Khan, Philippe Chlenski, Itsik Pe'er
Impact of Layer Norm on Memorization and Generalization in Transformers
Rishi Singhal, Jung-Eun Kim
Improving Formal Reasoning of Transformer with State Stack
Kechi Zhang, Ge Li, Jia Li et al.
Improving Model Representation and Reducing KV Cache via Skip Connections with First Value Heads
Zhoutong Wu, Yuan Zhang, Yiming Dong et al.
Improving Transformer Based Line Segment Detection with Matched Predicting and Re-ranking
Xin Tong, Shi Peng, Baojie Tian et al.
InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling
Muhammad Gohar Javed, Chuan Guo, Li Cheng et al.
JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts
Taein Son, Soo Won Seo, Jisong Kim et al.
Kolmogorov-Arnold Transformer
Xingyi Yang, Xinchao Wang
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph, Jerome Sieber, Melanie Zeilinger et al.
Language Models Are Implicitly Continuous
Samuele Marro, Davide Evangelista, X. Huang et al.
Learning Crossmodal Interaction Patterns via Attributed Bipartite Graphs for Single-Cell Omics
Xiaotang Wang, Xuanwei Lin, Yun Zhu et al.
LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
Shen Zhang, Siyuan Liang, Yaning Tan et al.
Length Generalization via Auxiliary Tasks
Pranjal Awasthi, Anupam Gupta, Ravi Kumar
LightPROF: A Lightweight Reasoning Framework for Large Language Model on Knowledge Graph
Tu Ao, Yanhua Yu, Yuling Wang et al.
Limitations of Normalization in Attention
Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova et al.
Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials
Yifan Pu, Jixuan Ying, Qixiu Li et al.
LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields
Zhengqin Li, Dilin Wang, Ka Chen et al.
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
Weijia Shi, Xiaochuang Han, Chunting Zhou et al.
Longhorn: State Space Models are Amortized Online Learners
Bo Liu, Rui Wang, Lemeng Wu et al.
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
Haian Jin, Hanwen Jiang, Hao Tan et al.
Memory Efficient Matting with Adaptive Token Routing
Yiheng Lin, Yihan Hu, Chenyi Zhang et al.