"transformer architecture" Papers

335 papers found • Page 7 of 7

SeTformer Is What You Need for Vision and Language

Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger et al.

AAAI 2024 • arXiv:2401.03540 • 7 citations

Slot Abstractors: Toward Scalable Abstract Visual Reasoning

Shanka Subhra Mondal, Jonathan Cohen, Taylor Webb

ICML 2024 • arXiv:2403.03458 • 10 citations

SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution

Mingjun Zheng, Long Sun, Jiangxin Dong et al.

ECCV 2024 • 72 citations

Spherical World-Locking for Audio-Visual Localization in Egocentric Videos

Heeseung Yun, Ruohan Gao, Ishwarya Ananthabhotla et al.

ECCV 2024 • arXiv:2408.05364 • 7 citations

SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN

Kang You, Zekai Xu, Chen Nie et al.

ICML 2024 • arXiv:2406.03470 • 20 citations

Surface-VQMAE: Vector-quantized Masked Auto-encoders on Molecular Surfaces

Fang Wu, Stan Z. Li

ICML 2024

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Byeongjun Park, Hyojun Go, Jin-Young Kim et al.

ECCV 2024 • arXiv:2403.09176 • 23 citations

Text2HOI: Text-guided 3D Motion Generation for Hand-Object Interaction

Junuk Cha, Jihyeon Kim, Jae Shin Yoon et al.

CVPR 2024 • arXiv:2404.00562 • 62 citations

Text-Conditioned Resampler For Long Form Video Understanding

Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.

ECCV 2024 • arXiv:2312.11897 • 24 citations

The Illusion of State in State-Space Models

William Merrill, Jackson Petty, Ashish Sabharwal

ICML 2024 • arXiv:2404.08819 • 128 citations

The Pitfalls of Next-Token Prediction

Gregor Bachmann, Vaishnavh Nagarajan

ICML 2024 • arXiv:2403.06963 • 139 citations

Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention

Jiaqi Zhang, Joel Jennings, Agrin Hilmkil et al.

ICML 2024

Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration

Zhengyang Zhuge, Peisong Wang, Xingting Yao et al.

ICML 2024

Towards General Algorithm Discovery for Combinatorial Optimization: Learning Symbolic Branching Policy from Bipartite Graph

Yufei Kuang, Jie Wang, Yuyan Zhou et al.

ICML 2024

Towards Scalable 3D Anomaly Detection and Localization: A Benchmark via 3D Anomaly Synthesis and A Self-Supervised Learning Network

Wenqiao Li, Xiaohao Xu, Yao Gu et al.

CVPR 2024 • arXiv:2311.14897 • 53 citations

Towards Understanding Inductive Bias in Transformers: A View From Infinity

Itay Lavie, Guy Gur-Ari, Zohar Ringel

ICML 2024 • arXiv:2402.05173 • 10 citations

Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features

Simone Bombari, Marco Mondelli

ICML 2024 • arXiv:2402.02969 • 6 citations

Trainable Transformer in Transformer

Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia et al.

ICML 2024 • arXiv:2307.01189 • 14 citations

Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary

Leheng Zhang, Yawei Li, Xingyu Zhou et al.

CVPR 2024 • arXiv:2401.08209 • 76 citations

Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning

Jinsong Shi, Pan Gao, Jie Qin

AAAI 2024 • arXiv:2312.06995 • 38 citations

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

Tri Dao, Albert Gu

ICML 2024 • arXiv:2405.21060 • 1146 citations

Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape

Juno Kim, Taiji Suzuki

ICML 2024 • arXiv:2402.01258 • 38 citations

Translation Equivariant Transformer Neural Processes

Matthew Ashman, Cristiana Diaconu, Junhyuck Kim et al.

ICML 2024 (oral) • arXiv:2406.12409 • 10 citations

Transolver: A Fast Transformer Solver for PDEs on General Geometries

Haixu Wu, Huakun Luo, Haowen Wang et al.

ICML 2024 (spotlight) • arXiv:2402.02366 • 181 citations

Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers

Zi-Xin Zou, Zhipeng Yu, Yuan-Chen Guo et al.

CVPR 2024 • arXiv:2312.09147 • 278 citations

Unveiling Advanced Frequency Disentanglement Paradigm for Low-Light Image Enhancement

Kun Zhou, Xinyu Lin, Wenbo Li et al.

ECCV 2024 • arXiv:2409.01641 • 3 citations

Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention

Zhen Qin, Weigao Sun, Dong Li et al.

ICML 2024 • arXiv:2405.17381 • 24 citations

View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network

Quan Zhang, Lei Wang, Vishal M. Patel et al.

CVPR 2024 • arXiv:2403.14513 • 36 citations

Viewing Transformers Through the Lens of Long Convolutions Layers

Itamar Zimerman, Lior Wolf

ICML 2024

VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning

Tangfei Liao, Xiaoqin Zhang, Li Zhao et al.

AAAI 2024 • arXiv:2312.08774 • 15 citations

Wavelength-Embedding-guided Filter-Array Transformer for Spectral Demosaicing

Haijin Zeng, Hiep Luong, Wilfried Philips

ECCV 2024 • 1 citation

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

Xingwu Chen, Difan Zou

ICML 2024 • arXiv:2404.01601 • 20 citations

When Fast Fourier Transform Meets Transformer for Image Restoration

Xingyu Jiang, Xiuhui Zhang, Ning Gao et al.

ECCV 2024 • 48 citations

When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models

Haoran You, Yichao Fu, Zheng Wang et al.

ICML 2024 • arXiv:2406.07368 • 9 citations

X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer

Linglin Jing, Ying Xue, Xu Yan et al.

AAAI 2024 • arXiv:2312.07378 • 11 citations