"transformer architecture" Papers

335 papers found • Page 6 of 7

Improving Transformers with Dynamically Composable Multi-Head Attention

Da Xiao, Qingye Meng, Shengping Li et al.

ICML 2024 • arXiv:2405.08553
6 citations

In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization

Herilalaina Rakotoarison, Steven Adriaensen, Neeratyoy Mallik et al.

ICML 2024 • arXiv:2404.16795
27 citations

In-Context Language Learning: Architectures and Algorithms

Ekin Akyürek, Bailin Wang, Yoon Kim et al.

ICML 2024 • arXiv:2401.12973
83 citations

In-context Learning on Function Classes Unveiled for Transformers

Zhijie Wang, Bo Jiang, Shuai Li

ICML 2024

InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping

Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao

ECCV 2024 • arXiv:2308.08543
18 citations

I/O Complexity of Attention, or How Optimal is FlashAttention?

Barna Saha, Christopher Ye

ICML 2024

Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification

Jimmy Lin, Junkai Li, Jiasi Gao et al.

AAAI 2024 • arXiv:2404.15279
2 citations

KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning

Junnan Liu, Qianren Mao, Weifeng Jiang et al.

ICML 2024 • arXiv:2409.12865
5 citations

LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction

Penghui Du, Yu Wang, Yifan Sun et al.

ECCV 2024 • arXiv:2407.11335
16 citations

Language-conditioned Detection Transformer

Jang Hyun Cho, Philipp Krähenbühl

CVPR 2024 • arXiv:2311.17902
6 citations

Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis

Atefeh Khoshkhahtinat, Ali Zafari, Piyush Mehta et al.

CVPR 2024 • arXiv:2403.16258
6 citations

Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes

Zhiyuan Yu, Zheng Qin, Lintao Zheng et al.

CVPR 2024 • arXiv:2404.04557
16 citations

Learning Natural Consistency Representation for Face Forgery Video Detection

Daichi Zhang, Zihao Xiao, Shikun Li et al.

ECCV 2024 • arXiv:2407.10550
29 citations

Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem

Zhentao Tan, Yadong Mu

ICML 2024 • arXiv:2406.09899
4 citations

Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

Toru Shirakawa, Yi Li, Yulun Wu et al.

ICML 2024 (oral) • arXiv:2404.04399
7 citations

LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models

Guangyan Li, Yongqiang Tang, Wensheng Zhang

ICML 2024 • arXiv:2404.09695
8 citations

Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks

Yixuan Weng, Minjun Zhu, Fei Xia et al.

ICLR 2024 • arXiv:2304.01665
12 citations

MASTER: Market-Guided Stock Transformer for Stock Price Forecasting

Tong Li, Zhaoyang Liu, Yanyan Shen et al.

AAAI 2024 • arXiv:2312.15235
59 citations

Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

Anke Tang, Li Shen, Yong Luo et al.

ICML 2024 • arXiv:2402.00433
84 citations

Meta Evidential Transformer for Few-Shot Open-Set Recognition

Hitesh Sapkota, Krishna Neupane, Qi Yu

ICML 2024

MFTN: A Multi-scale Feature Transfer Network Based on IMatchFormer for Hyperspectral Image Super-Resolution

Shuying Huang, Mingyang Ren, Yong Yang et al.

ICML 2024

Modeling Language Tokens as Functionals of Semantic Fields

Zhengqi Pei, Anran Zhang, Shuhui Wang et al.

ICML 2024

MS-TIP: Imputation Aware Pedestrian Trajectory Prediction

Pranav Singh Chib, Achintya Nath, Paritosh Kabra et al.

ICML 2024

Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing

Amutheezan Sivagnanam, Ava Pettet, Hunter Lee et al.

ICML 2024 • arXiv:2405.13205
7 citations

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong et al.

CVPR 2024 • arXiv:2401.14405
12 citations

Multiscale Vision Transformers Meet Bipartite Matching for Efficient Single-stage Action Localization

Ioanna Ntinou, Enrique Sanchez, Georgios Tzimiropoulos

CVPR 2024 • arXiv:2312.17686
7 citations

Multi-Sentence Grounding for Long-term Instructional Video

Zeqian Li, Qirui Chen, Tengda Han et al.

ECCV 2024 • arXiv:2312.14055
12 citations

Neural Reasoning about Agents’ Goals, Preferences, and Actions

Matteo Bortoletto, Lei Shi, Andreas Bulling

AAAI 2024 • arXiv:2312.07122
8 citations

No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

Ishan Rajendrakumar Dave, Simon Jenni, Mubarak Shah

AAAI 2024 • arXiv:2312.13008
12 citations

OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction

Yini Fang, Jingling Yu, Haozheng Zhang et al.

ECCV 2024 • arXiv:2407.13335
2 citations

ODIN: A Single Model for 2D and 3D Segmentation

Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios et al.

CVPR 2024 (highlight) • arXiv:2401.02416
20 citations

Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields

Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.

ECCV 2024 • arXiv:2403.11131

Photorealistic Video Generation with Diffusion Models

Agrim Gupta, Lijun Yu, Kihyuk Sohn et al.

ECCV 2024 • arXiv:2312.06662
278 citations

PIDformer: Transformer Meets Control Theory

Tam Nguyen, Cesar Uribe, Tan Nguyen et al.

ICML 2024 • arXiv:2402.15989
12 citations

Polynomial-based Self-Attention for Table Representation Learning

Jayoung Kim, Yehjin Shin, Jeongwhan Choi et al.

ICML 2024 • arXiv:2312.07753
3 citations

Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning

Junfeng Chen, Kailiang Wu

ICML 2024 • arXiv:2405.09285
14 citations

Position: Do pretrained Transformers Learn In-Context by Gradient Descent?

Lingfeng Shen, Aayush Mishra, Daniel Khashabi

ICML 2024

Position: Stop Making Unscientific AGI Performance Claims

Patrick Altmeyer, Andrew Demetriou, Antony Bartlett et al.

ICML 2024 • arXiv:2402.03962
9 citations

Progressive Pretext Task Learning for Human Trajectory Prediction

Xiaotong Lin, Tianming Liang, Jian-Huang Lai et al.

ECCV 2024 • arXiv:2407.11588
26 citations

Prompting a Pretrained Transformer Can Be a Universal Approximator

Aleksandar Petrov, Phil Torr, Adel Bibi

ICML 2024 • arXiv:2402.14753
19 citations

Prototypical Transformer As Unified Motion Learners

Cheng Han, Yawen Lu, Guohao Sun et al.

ICML 2024 • arXiv:2406.01559
9 citations

Recurrent Early Exits for Federated Learning with Heterogeneous Clients

Royson Lee, Javier Fernandez-Marques, Xu Hu et al.

ICML 2024 • arXiv:2405.14791
13 citations

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

Xiuquan Hou, Meiqin Liu, Senlin Zhang et al.

ECCV 2024 • arXiv:2407.11699
64 citations

Repeat After Me: Transformers are Better than State Space Models at Copying

Samy Jelassi, David Brandfonbrener, Sham Kakade et al.

ICML 2024 • arXiv:2402.01032
162 citations

Rethinking Decision Transformer via Hierarchical Reinforcement Learning

Yi Ma, Jianye Hao, Hebin Liang et al.

ICML 2024 • arXiv:2311.00267
14 citations

Rethinking Transformers in Solving POMDPs

Chenhao Lu, Ruizhe Shi, Yuyao Liu et al.

ICML 2024 • arXiv:2405.17358
9 citations

SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention

Romain Ilbert, Ambroise Odonnat, Vasilii Feofanov et al.

ICML 2024 • arXiv:2402.10198
56 citations

Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Katherine Crowson, Stefan Baumann, Alex Birch et al.

ICML 2024 • arXiv:2401.11605
84 citations

Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes

Yingyi Chen, Qinghua Tao, Francesco Tonin et al.

ICML 2024 • arXiv:2402.01476
3 citations

SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency

Feiyu Zhu, Reid Simmons

AAAI 2024 • arXiv:2303.07033
57 citations