"self-attention mechanism" Papers

37 papers found

Bipolar Self-attention for Spiking Transformers

Shuai Wang, Malu Zhang, Jingya Wang et al.

NEURIPS 2025 (spotlight)

Breaking the Gradient Barrier: Unveiling Large Language Models for Strategic Classification

Xinpeng Lv, Yunxin Mao, Haoxuan Li et al.

NEURIPS 2025 · arXiv:2511.06979 · 1 citation

Colors See Colors Ignore: Clothes Changing ReID with Color Disentanglement

Priyank Pathak, Yogesh Rawat

ICCV 2025 · arXiv:2507.07230 · 3 citations

Decomposing and Fusing Intra- and Inter-Sensor Spatio-Temporal Signal for Multi-Sensor Wearable Human Activity Recognition

Haoyu Xie, Haoxuan Li, Chunyuan Zheng et al.

AAAI 2025 · arXiv:2501.10917 · 1 citation

Efficient Concertormer for Image Deblurring and Beyond

Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.

ICCV 2025 · arXiv:2404.06135

Emulating Self-attention with Convolution for Efficient Image Super-Resolution

Dongheon Lee, Seokju Yun, Youngmin Ro

ICCV 2025 (highlight) · arXiv:2503.06671 · 6 citations

Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

Minghan Chen, Guikun Chen, Wenguan Wang et al.

ICLR 2025 · arXiv:2409.10262 · 10 citations

iFormer: Integrating ConvNet and Transformer for Mobile Application

Chuanyang Zheng

ICLR 2025 · arXiv:2501.15369 · 5 citations

Move and Act: Enhanced Object Manipulation and Background Integrity for Image Editing

Pengfei Jiang, Mingbao Lin, Fei Chao

AAAI 2025 · arXiv:2407.17847 · 3 citations

Multifaceted User Modeling in Recommendation: A Federated Foundation Models Approach

Chunxu Zhang, Guodong Long, Hongkuan Guo et al.

AAAI 2025 · arXiv:2412.16969 · 6 citations

On the Role of Hidden States of Modern Hopfield Network in Transformer

NEURIPS 2025 · arXiv:2511.20698

Quantized Spike-driven Transformer

Xuerui Qiu, Malu Zhang, Jieyuan Zhang et al.

ICLR 2025 · arXiv:2501.13492 · 14 citations

Self-Attention-Based Contextual Modulation Improves Neural System Identification

Isaac Lin, Tianye Wang, Shang Gao et al.

ICLR 2025 · arXiv:2406.07843 · 2 citations

Self-attention-based Diffusion Model for Time-series Imputation in Partial Blackout Scenarios

Mohammad Rafid Ul Islam, Prasad Tadepalli, Alan Fern

AAAI 2025 · arXiv:2503.01737 · 9 citations

Self-Attentive Spatio-Temporal Calibration for Precise Intermediate Layer Matching in ANN-to-SNN Distillation

Di Hong, Yueming Wang

AAAI 2025 · arXiv:2501.08049 · 1 citation

Sim-DETR: Unlock DETR for Temporal Sentence Grounding

Jiajin Tang, Zhengxuan Wei, Yuchen Zhu et al.

ICCV 2025 · arXiv:2509.23867 · 2 citations

Spiking Transformer with Spatial-Temporal Attention

Donghyun Lee, Yuhang Li, Youngeun Kim et al.

CVPR 2025 · arXiv:2409.19764 · 18 citations

Structured Temporal Causality for Interpretable Multivariate Time Series Anomaly Detection

Dongchan Cho, Jiho Han, Keumyeong Kang et al.

NEURIPS 2025 (oral) · arXiv:2510.16511 · 1 citation

SynCL: A Synergistic Training Strategy with Instance-Aware Contrastive Learning for End-to-End Multi-Camera 3D Tracking

Shubo Lin, Yutong Kou, Zirui Wu et al.

NEURIPS 2025 · arXiv:2411.06780

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025 · arXiv:2502.06415 · 19 citations

VarDrop: Enhancing Training Efficiency by Reducing Variate Redundancy in Periodic Time Series Forecasting

Junhyeok Kang, Yooju Shin, Jae-Gil Lee

AAAI 2025 · arXiv:2501.14183 · 3 citations

An Attentive Inductive Bias for Sequential Recommendation beyond the Self-Attention

Yehjin Shin, Jeongwhan Choi, Hyowon Wi et al.

AAAI 2024 · arXiv:2312.10325 · 104 citations

EFormer: Enhanced Transformer towards Semantic-Contour Features of Foreground for Portraits Matting

Zitao Wang, Qiguang Miao, Yue Xi et al.

CVPR 2024 · arXiv:2308.12831 · 5 citations

Fine-grained Local Sensitivity Analysis of Standard Dot-Product Self-Attention

Aaron Havens, Alexandre Araujo, Huan Zhang et al.

ICML 2024

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

Muhammed Emrullah Ildiz, Yixiao Huang, Yingcong Li et al.

ICML 2024 · arXiv:2402.13512 · 36 citations

LLM Maybe LongLM: SelfExtend LLM Context Window Without Tuning

Hongye Jin, Xiaotian Han, Jingfeng Yang et al.

ICML 2024 (spotlight)

MSGNet: Learning Multi-Scale Inter-series Correlations for Multivariate Time Series Forecasting

Wanlin Cai, Yuxuan Liang, Xianggen Liu et al.

AAAI 2024 · arXiv:2401.00423 · 185 citations

One Meta-tuned Transformer is What You Need for Few-shot Learning

Xu Yang, Huaxiu Yao, Ying Wei

ICML 2024 (spotlight)

Polynomial-based Self-Attention for Table Representation Learning

Jayoung Kim, Yehjin Shin, Jeongwhan Choi et al.

ICML 2024 · arXiv:2312.07753 · 3 citations

PolyRoom: Room-aware Transformer for Floorplan Reconstruction

Yuzhou Liu, Lingjie Zhu, Xiaodong Ma et al.

ECCV 2024 · arXiv:2407.10439 · 8 citations

Self-attention Networks Localize When QK-eigenspectrum Concentrates

Han Bao, Ryuichiro Hataya, Ryo Karakida

ICML 2024 · arXiv:2402.02098 · 11 citations

SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution

Mingjun Zheng, Long Sun, Jiangxin Dong et al.

ECCV 2024 · 72 citations

Spatial Transform Decoupling for Oriented Object Detection

Hongtian Yu, Yunjie Tian, Qixiang Ye et al.

AAAI 2024 · arXiv:2308.10561 · 52 citations

Towards Automated Movie Trailer Generation

Dawit Argaw Argaw, Mattia Soldan, Alejandro Pardo et al.

CVPR 2024 · arXiv:2404.03477 · 10 citations

Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention

Jiaqi Zhang, Joel Jennings, Agrin Hilmkil et al.

ICML 2024

Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting

Zhicheng Wang, Liwen Xiao, Zhiguo Cao et al.

AAAI 2024 · arXiv:2305.04440 · 29 citations

What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding

Hongkang Li, Meng Wang, Tengfei Ma et al.

ICML 2024 · arXiv:2406.01977 · 19 citations