Poster Papers for "transformer architecture"

257 papers found • Page 1 of 6

Accelerating Training with Neuron Interaction and Nowcasting Networks

Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie et al.

ICLR 2025 · arXiv:2409.04434
5 citations

Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers

Quoc-Vinh Lai-Dang, Taemin Kang, Seungah Son

ICLR 2025

Aligning Moments in Time using Video Queries

Yogesh Kumar, Uday Agarwal, Manish Gupta et al.

ICCV 2025 · arXiv:2508.15439
1 citation

AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Jin Lyu, Tianyi Zhu, Yi Gu et al.

CVPR 2025 · arXiv:2412.00837
10 citations

An OpenMind for 3D Medical Vision Self-supervised Learning

Tassilo Wald, Constantin Ulrich, Jonathan Suprijadi et al.

ICCV 2025 · arXiv:2412.17041
12 citations

Ask, and it shall be given: On the Turing completeness of prompting

Ruizhong Qiu, Zhe Xu, Wenxuan Bao et al.

ICLR 2025 · arXiv:2411.01992
6 citations

A Solvable Attention for Neural Scaling Laws

Bochen Lyu, Di Wang, Zhanxing Zhu

ICLR 2025
5 citations

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025 · arXiv:2406.05816
10 citations

Attention-based clustering

Rodrigo Maulen Soto, Pierre Marion, Claire Boyer

NeurIPS 2025 · arXiv:2505.13112
1 citation

BlockScan: Detecting Anomalies in Blockchain Transactions

Jiahao Yu, Xian Wu, Hao Liu et al.

NeurIPS 2025 · arXiv:2410.04039
3 citations

Boltzmann Attention Sampling for Image Analysis with Small Objects

Theodore Zhao, Sid Kiblawi, Mu Wei et al.

CVPR 2025 · arXiv:2503.02841
2 citations

Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities

Mayank Jobanputra, Yana Veitsman, Yash Sarrof et al.

NeurIPS 2025 · arXiv:2505.21785
3 citations

Can Transformers Do Enumerative Geometry?

Baran Hashemi, Roderic Corominas, Alessandro Giacchetto

ICLR 2025 · arXiv:2408.14915
9 citations

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction

Zhefei Gong, Pengxiang Ding, Shangke Lyu et al.

ICCV 2025 · arXiv:2412.06782
24 citations

CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

Xin Liu, Jie Liu, Jie Tang et al.

CVPR 2025 · arXiv:2503.06896
26 citations

Chaos Meets Attention: Transformers for Large-Scale Dynamical Prediction

Yi He, Yiming Yang, Xiaoyuan Cheng et al.

ICML 2025 · arXiv:2504.20858
9 citations

CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets

Feng Yan, Weixin Luo, Yujie Zhong et al.

ICLR 2025
7 citations

Compositional Reasoning with Transformers, RNNs, and Chain of Thought

Gilad Yehudai, Noah Amsel, Joan Bruna

NeurIPS 2025 · arXiv:2503.01544
2 citations

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

Hao Yu, Tangyu Jiang, Shuning Jia et al.

CVPR 2025 · arXiv:2506.03737
4 citations

Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis

Mohammad Saleh Refahi, Mahdi Abavisani, Bahrad Sokhansanj et al.

NeurIPS 2025 · arXiv:2507.09378

Context-Enhanced Memory-Refined Transformer for Online Action Detection

Zhanzhong Pang, Fadime Sener, Angela Yao

CVPR 2025 · arXiv:2503.18359
5 citations

Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis

Byung Hyun Lee, Wongi Jeong, Woojae Han et al.

ICCV 2025 · arXiv:2507.02395
2 citations

Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models

Hector Pasten, Felipe Urrutia, Hector Orellana et al.

NeurIPS 2025 · arXiv:2505.10606
1 citation

CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference

Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi

ICLR 2025 · arXiv:2310.10845
15 citations

CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning

Yifei Zhang, Hao Zhu, Junhao Dong et al.

NeurIPS 2025

DeltaFormer: Unlock the state space of Transformer

Mingyu Xu, Tenglong Ao, Jiaao He et al.

NeurIPS 2025

Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration

Shihao Zhou, Dayu Li, Jinshan Pan et al.

ICCV 2025 · arXiv:2503.20174
1 citation

DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy

Rui Zhao, Yuze Fan, Ziguo Chen et al.

NeurIPS 2025

Differential Transformer

Tianzhu Ye, Li Dong, Yuqing Xia et al.

ICLR 2025 · arXiv:2410.05258

DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.

CVPR 2025 · arXiv:2505.06166
7 citations

DIFFSSR: Stereo Image Super-resolution Using Differential Transformer

Dafeng Zhang

NeurIPS 2025

Diffusion-Based Planning for Autonomous Driving with Flexible Guidance

Yinan Zheng, Ruiming Liang, Kexin Zheng et al.

ICLR 2025 · arXiv:2501.15564
82 citations

Dimension Agnostic Neural Processes

Hyungi Lee, Chaeyun Jang, Dong Bok Lee et al.

ICLR 2025 · arXiv:2502.20661
3 citations

Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection

Jia Guo, Shuai Lu, Weihang Zhang et al.

CVPR 2025 · arXiv:2405.14325
56 citations

Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers

Lei Chen, Joan Bruna, Alberto Bietti

ICLR 2025 · arXiv:2406.03068
8 citations

Dynamic Diffusion Transformer

Wangbo Zhao, Yizeng Han, Jiasheng Tang et al.

ICLR 2025 · arXiv:2410.03456
38 citations

Dynamic Semantic-Aware Correlation Modeling for UAV Tracking

Xinyu Zhou, Tongxin Pan, Lingyi Hong et al.

NeurIPS 2025 · arXiv:2510.21351

Easi3R: Estimating Disentangled Motion from DUSt3R Without Training

Xingyu Chen, Yue Chen, Yuliang Xiu et al.

ICCV 2025 · arXiv:2503.24391
48 citations

EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation

Zihao Zhang, Haoran Chen, Haoyu Zhao et al.

CVPR 2025 · arXiv:2503.15831
10 citations

Efficient Concertormer for Image Deblurring and Beyond

Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.

ICCV 2025 · arXiv:2404.06135

Efficient Time Series Processing for Transformers and State-Space Models through Token Merging

Leon Götz, Marcel Kollovieh, Stephan Günnemann et al.

ICML 2025 · arXiv:2405.17951
5 citations

Emergence of Linear Truth Encodings in Language Models

Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.

NeurIPS 2025 · arXiv:2510.15804
3 citations

End-to-End Implicit Neural Representations for Classification

Alexander Gielisse, Jan van Gemert

CVPR 2025 · arXiv:2503.18123
4 citations

Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels

Pierre Vuillecard, Jean-Marc Odobez

CVPR 2025 · arXiv:2502.20249
7 citations

Enhancing Image Restoration Transformer via Adaptive Translation Equivariance

JiaKui Hu, Zhengjian Yao, Lujia Jin et al.

ICCV 2025 · arXiv:2506.18520
3 citations

Enhancing Transformers Through Conditioned Embedded Tokens

Hemanth Saratchandran, Simon Lucey

ICCV 2025 · arXiv:2505.12789
2 citations

ESCAPE: Equivariant Shape Completion via Anchor Point Encoding

Burak Bekci, Nassir Navab, Federico Tombari et al.

CVPR 2025 · arXiv:2412.00952
4 citations

Evolutionary Reasoning Does Not Arise in Standard Usage of Protein Language Models

Yasha Ektefaie, Andrew Shen, Lavik Jain et al.

NeurIPS 2025

Exact Expressive Power of Transformers with Padding

Will Merrill, Ashish Sabharwal

NeurIPS 2025 · arXiv:2505.18948
7 citations

FaceXFormer: A Unified Transformer for Facial Analysis

Kartik Narayan, Vibashan VS, Rama Chellappa et al.

ICCV 2025 · arXiv:2403.12960
37 citations