"transformer architecture" Papers

335 papers found • Page 1 of 7

4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos

Zhen Xu, Zhengqin Li, Zhao Dong et al.

NEURIPS 2025spotlightarXiv:2506.08015
15
citations

A²RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion

Jiawei Li, Hongwei Yu, Jiansheng Chen et al.

AAAI 2025paperarXiv:2412.09954
3
citations

Absence Bench: Language Models Can’t See What’s Missing

Harvey Yiyun Fu, Aryan Shrivastava, Jared Moore et al.

NEURIPS 2025spotlight

Accelerating Training with Neuron Interaction and Nowcasting Networks

Boris Knyazev, Abhinav Moudgil, Guillaume Lajoie et al.

ICLR 2025arXiv:2409.04434
5
citations

Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data

Tianyi Chen, Pengxiao Lin, Zhiwei Wang et al.

NEURIPS 2025spotlightarXiv:2509.17514
1
citations

Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers

Quoc-Vinh Lai-Dang, Taemin Kang, Seungah Son

ICLR 2025

Aligning Moments in Time using Video Queries

Yogesh Kumar, Uday Agarwal, Manish Gupta et al.

ICCV 2025arXiv:2508.15439
1
citations

ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition

Daolang Huang, Xinyi Wen, Ayush Bharti et al.

NEURIPS 2025spotlightarXiv:2506.07259
2
citations

AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Jin Lyu, Tianyi Zhu, Yi Gu et al.

CVPR 2025arXiv:2412.00837
10
citations

An OpenMind for 3D Medical Vision Self-supervised Learning

Tassilo Wald, Constantin Ulrich, Jonathan Suprijadi et al.

ICCV 2025arXiv:2412.17041
12
citations

Ask, and it shall be given: On the Turing completeness of prompting

Ruizhong Qiu, Zhe Xu, Wenxuan Bao et al.

ICLR 2025arXiv:2411.01992
6
citations

A Solvable Attention for Neural Scaling Laws

Bochen Lyu, Di Wang, Zhanxing Zhu

ICLR 2025
5
citations

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025arXiv:2406.05816
10
citations

Attention-based clustering

Rodrigo Maulen Soto, Pierre Marion, Claire Boyer

NEURIPS 2025arXiv:2505.13112
1
citations

BlockScan: Detecting Anomalies in Blockchain Transactions

Jiahao Yu, Xian Wu, Hao Liu et al.

NEURIPS 2025arXiv:2410.04039
3
citations

Boltzmann Attention Sampling for Image Analysis with Small Objects

Theodore Zhao, Sid Kiblawi, Mu Wei et al.

CVPR 2025arXiv:2503.02841
2
citations

Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities

Mayank Jobanputra, Yana Veitsman, Yash Sarrof et al.

NEURIPS 2025arXiv:2505.21785
3
citations

Can Transformers Do Enumerative Geometry?

Baran Hashemi, Roderic Corominas, Alessandro Giacchetto

ICLR 2025arXiv:2408.14915
9
citations

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction

Zhefei Gong, Pengxiang Ding, Shangke Lyu et al.

ICCV 2025arXiv:2412.06782
24
citations

CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

Xin Liu, Jie Liu, Jie Tang et al.

CVPR 2025arXiv:2503.06896
26
citations

Chaos Meets Attention: Transformers for Large-Scale Dynamical Prediction

Yi He, Yiming Yang, Xiaoyuan Cheng et al.

ICML 2025arXiv:2504.20858
9
citations

CO-MOT: Boosting End-to-end Transformer-based Multi-Object Tracking via Coopetition Label Assignment and Shadow Sets

feng yan, Weixin Luo, Yujie Zhong et al.

ICLR 2025
7
citations

Compositional Reasoning with Transformers, RNNs, and Chain of Thought

Gilad Yehudai, Noah Amsel, Joan Bruna

NEURIPS 2025arXiv:2503.01544
2
citations

ComRoPE: Scalable and Robust Rotary Position Embedding Parameterized by Trainable Commuting Angle Matrices

Hao Yu, Tangyu Jiang, Shuning Jia et al.

CVPR 2025arXiv:2506.03737
4
citations

ConSense: Continually Sensing Human Activity with WiFi via Growing and Picking

Rong Li, Tao Deng, Siwei Feng et al.

AAAI 2025paperarXiv:2502.17483
4
citations

Context-Aware Regularization with Markovian Integration for Attention-Based Nucleotide Analysis

Mohammad Saleh Refahi, Mahdi Abavisani, Bahrad Sokhansanj et al.

NEURIPS 2025arXiv:2507.09378

Context-Enhanced Memory-Refined Transformer for Online Action Detection

Zhanzhong Pang, Fadime Sener, Angela Yao

CVPR 2025arXiv:2503.18359
5
citations

Continual Multiple Instance Learning with Enhanced Localization for Histopathological Whole Slide Image Analysis

Byung Hyun Lee, Wongi Jeong, Woojae Han et al.

ICCV 2025arXiv:2507.02395
2
citations

Continuity and Isolation Lead to Doubts or Dilemmas in Large Language Models

Hector Pasten, Felipe Urrutia, Hector Orellana et al.

NEURIPS 2025arXiv:2505.10606
1
citations

CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective

Zongheng Tang, Yi Liu, Yifan Sun et al.

ICCV 2025highlightarXiv:2508.00359
2
citations

CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference

Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi

ICLR 2025arXiv:2310.10845
15
citations

CrossSpectra: Exploiting Cross-Layer Smoothness for Parameter-Efficient Fine-Tuning

Yifei Zhang, Hao Zhu, Junhao Dong et al.

NEURIPS 2025

Cubify Anything: Scaling Indoor 3D Object Detection

Justin Lazarow, David Griffiths, Gefen Kohavi et al.

CVPR 2025highlightarXiv:2412.04458
18
citations

DAPoinTr: Domain Adaptive Point Transformer for Point Cloud Completion

Yinghui Li, Qianyu Zhou, Jingyu Gong et al.

AAAI 2025paperarXiv:2412.19062
1
citations

DAWP: A framework for global observation forecasting via Data Assimilation and Weather Prediction in satellite observation space

Junchao Gong, Jingyi Xu, Ben Fei et al.

NEURIPS 2025oralarXiv:2510.15978

DeltaFormer: Unlock the state space of Transformer

Mingyu Xu, Tenglong Ao, Jiaao He et al.

NEURIPS 2025

Depth-Width Tradeoffs for Transformers on Graph Tasks

Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher et al.

NEURIPS 2025spotlight

Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration

Shihao Zhou, Dayu Li, Jinshan Pan et al.

ICCV 2025arXiv:2503.20174
1
citations

DiffE2E: Rethinking End-to-End Driving with a Hybrid Diffusion-Regression-Classification Policy

Rui Zhao, Yuze Fan, Ziguo Chen et al.

NEURIPS 2025

Differential Transformer

Tianzhu Ye, Li Dong, Yuqing Xia et al.

ICLR 2025arXiv:2410.05258

DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.

CVPR 2025arXiv:2505.06166
7
citations

DIFFSSR: Stereo Image Super-resolution Using Differential Transformer

Dafeng Zhang

NEURIPS 2025

Diffusion-Based Planning for Autonomous Driving with Flexible Guidance

Yinan Zheng, Ruiming Liang, Kexin ZHENG et al.

ICLR 2025arXiv:2501.15564
82
citations

Dimension Agnostic Neural Processes

Hyungi Lee, Chaeyun Jang, Dong Bok Lee et al.

ICLR 2025arXiv:2502.20661
3
citations

Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection

Jia Guo, Shuai Lu, Weihang Zhang et al.

CVPR 2025arXiv:2405.14325
56
citations

Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers

Lei Chen, Joan Bruna, Alberto Bietti

ICLR 2025arXiv:2406.03068
8
citations

Dual Conditioned Motion Diffusion for Pose-Based Video Anomaly Detection

Hongsong Wang, Andi Xu, Pinle Ding et al.

AAAI 2025paperarXiv:2412.17210
6
citations

Dynamic Diffusion Transformer

Wangbo Zhao, Yizeng Han, Jiasheng Tang et al.

ICLR 2025arXiv:2410.03456
38
citations

Dynamic Semantic-Aware Correlation Modeling for UAV Tracking

Xinyu Zhou, Tongxin Pan, Lingyi Hong et al.

NEURIPS 2025arXiv:2510.21351

Easi3R: Estimating Disentangled Motion from DUSt3R Without Training

Xingyu Chen, Yue Chen, Yuliang Xiu et al.

ICCV 2025arXiv:2503.24391
48
citations
Previous
123...7
Next