Poster "vision transformers" Papers

96 papers found • Page 1 of 2

A Circular Argument: Does RoPE need to be Equivariant for Vision?

Chase van de Geijn, Timo Lüddecke, Polina Turishcheva et al.

NEURIPS 2025 • arXiv:2511.08368 • 2 citations

Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement

Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.

CVPR 2025 • arXiv:2507.06928 • 7 citations

Alias-Free ViT: Fractional Shift Invariance via Linear Attention

Hagay Michaeli, Daniel Soudry

NEURIPS 2025 • arXiv:2510.22673

A Theoretical Analysis of Self-Supervised Learning for Vision Transformers

Yu Huang, Zixin Wen, Yuejie Chi et al.

ICLR 2025 • arXiv:2403.02233 • 3 citations

BHViT: Binarized Hybrid Vision Transformer

Tian Gao, Yu Zhang, Zhiyuan Zhang et al.

CVPR 2025 • arXiv:2503.02394 • 8 citations

BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models

Dingqiang Ye, Chao Fan, Zhanbo Huang et al.

NEURIPS 2025 • arXiv:2505.18132 • 6 citations

Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers

Andrew Luo, Jacob Yeung, Rushikesh Zawar et al.

ICLR 2025 • arXiv:2410.05266 • 12 citations

ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning

Chau Pham, Juan C. Caicedo, Bryan Plummer

NEURIPS 2025 • arXiv:2503.19331 • 5 citations

Charm: The Missing Piece in ViT Fine-Tuning for Image Aesthetic Assessment

Fatemeh Behrad, Tinne Tuytelaars, Johan Wagemans

CVPR 2025 • arXiv:2504.02522 • 3 citations

DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers

Li Ren, Chen Chen, Liqiang Wang et al.

CVPR 2025 • arXiv:2505.23694 • 8 citations

Discovering Influential Neuron Path in Vision Transformers

Yifan Wang, Yifei Liu, Yingdong Shi et al.

ICLR 2025 • arXiv:2503.09046 • 4 citations

DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations

Krishna Sri Ipsit Mantri, Carola-Bibiane Schönlieb, Bruno Ribeiro et al.

CVPR 2025 • arXiv:2502.06029 • 5 citations

Effective Interplay between Sparsity and Quantization: From Theory to Practice

Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al.

ICLR 2025 • arXiv:2405.20935 • 19 citations

Elastic ViTs from Pretrained Models without Retraining

Walter Simoncini, Michael Dorkenwald, Tijmen Blankevoort et al.

NEURIPS 2025 • arXiv:2510.17700

Energy Landscape-Aware Vision Transformers: Layerwise Dynamics and Adaptive Task-Specific Training via Hopfield States

Runze Xia, Richard Jiang

NEURIPS 2025

FLOPS: Forward Learning with OPtimal Sampling

Tao Ren, Zishi Zhang, Jinyang Jiang et al.

ICLR 2025 • arXiv:2410.05966 • 2 citations

GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers

Guang Liang, Xinyao Liu, Jianxin Wu

NEURIPS 2025 • arXiv:2506.11784 • 4 citations

Hypergraph Vision Transformers: Images are More than Nodes, More than Edges

Joshua Fixelle

CVPR 2025 • arXiv:2504.08710 • 9 citations

Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

CVPR 2025 • arXiv:2503.15404 • 5 citations

Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking

You Wu, Xucheng Wang, Xiangyang Yang et al.

CVPR 2025 • arXiv:2504.09228 • 20 citations

LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

Walid Bousselham, Angie Boggust, Sofian Chaybouti et al.

ICCV 2025 • arXiv:2404.03214 • 25 citations

LevAttention: Time, Space and Streaming Efficient Algorithm for Heavy Attentions

Ravindran Kannan, Chiranjib Bhattacharyya, Praneeth Kacham et al.

ICLR 2025 • arXiv:2410.05462 • 2 citations

Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials

Yifan Pu, Jixuan Ying, Qixiu Li et al.

NEURIPS 2025 • arXiv:2511.00833 • 1 citation

Locality Alignment Improves Vision-Language Models

Ian Covert, Tony Sun, James Y Zou et al.

ICLR 2025 • arXiv:2410.11087 • 11 citations

LookWhere? Efficient Visual Recognition by Learning Where to Look and What to See from Self-Supervision

Anthony Fuller, Yousef Yassin, Junfeng Wen et al.

NEURIPS 2025 • arXiv:2505.18051 • 2 citations

L-SWAG: Layer-Sample Wise Activation with Gradients Information for Zero-Shot NAS on Vision Transformers

Sofia Casarin, Sergio Escalera, Oswald Lanz

CVPR 2025 • arXiv:2505.07300 • 2 citations

MambaIRv2: Attentive State Space Restoration

Hang Guo, Yong Guo, Yaohua Zha et al.

CVPR 2025 • arXiv:2411.15269 • 84 citations

MambaOut: Do We Really Need Mamba for Vision?

Weihao Yu, Xinchao Wang

CVPR 2025 • arXiv:2405.07992 • 193 citations

Metric-Driven Attributions for Vision Transformers

Chase Walker, Sumit Jha, Rickard Ewetz

ICLR 2025 • 1 citation

Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction

Giuseppe Cartella, Vittorio Cuculo, Alessandro D'Amelio et al.

ICCV 2025 • arXiv:2507.23021 • 1 citation

Morphing Tokens Draw Strong Masked Image Models

Taekyung Kim, Byeongho Heo, Dongyoon Han

ICLR 2025 • arXiv:2401.00254 • 3 citations

Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration

Hongkang Zhang, Shao-Lun Huang, Ercan KURUOGLU et al.

NEURIPS 2025

Mutual Effort for Efficiency: A Similarity-based Token Pruning for Vision Transformers in Self-Supervised Learning

Sheng Li, Qitao Tan, Yue Dai et al.

ICLR 2025

Normalize Filters! Classical Wisdom for Deep Vision

Gustavo Perez, Stella X. Yu

NEURIPS 2025 • arXiv:2506.04401

PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies

Mojtaba Nafez, Amirhossein Koochakian, Arad Maleki et al.

CVPR 2025 • arXiv:2506.09237 • 2 citations

PolaFormer: Polarity-aware Linear Attention for Vision Transformers

Weikang Meng, Yadan Luo, Xin Li et al.

ICLR 2025 • arXiv:2501.15061 • 42 citations

Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis

Arpita Chowdhury, Dipanjyoti Paul, Zheda Mai et al.

CVPR 2025 • arXiv:2501.09333 • 6 citations

Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2

Joel Valdivia Ortega, Lorenz Lamm, Franziska Eckardt et al.

NEURIPS 2025 • arXiv:2511.05509

Register and [CLS] tokens induce a decoupling of local and global features in large ViTs

Alexander Lappe, Martin Giese

NEURIPS 2025 • 3 citations

Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks

Giyeong Oh, Woohyun Cho, Siyeol Kim et al.

NEURIPS 2025 • arXiv:2505.11881

Scalable Neural Network Geometric Robustness Validation via Hölder Optimisation

Yanghao Zhang, Panagiotis Kouvaros, Alessio Lomuscio

NEURIPS 2025

Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers

Yunshan Zhong, Yuyao Zhou, Yuxin Zhang et al.

ICCV 2025 • arXiv:2412.16553

Sinusoidal Initialization, Time for a New Start

Alberto Fernandez-Hernandez, Jose Mestre, Manuel F. Dolz et al.

NEURIPS 2025 • arXiv:2505.12909 • 1 citation

SonoGym: High Performance Simulation for Challenging Surgical Tasks with Robotic Ultrasound

Yunke Ao, Masoud Moghani, Mayank Mittal et al.

NEURIPS 2025 • arXiv:2507.01152 • 1 citation

Spectral State Space Model for Rotation-Invariant Visual Representation Learning

Sahar Dastani, Ali Bahri, Moslem Yazdanpanah et al.

CVPR 2025 • arXiv:2503.06369 • 4 citations

Split Adaptation for Pre-trained Vision Transformers

Lixu Wang, Bingqi Shang, Yi Li et al.

CVPR 2025 • arXiv:2503.00441 • 2 citations

Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling

Cristian Rodriguez-Opazo, Ehsan Abbasnejad, Damien Teney et al.

ICLR 2025 • arXiv:2405.17139 • 1 citation

TRUST: Test-Time Refinement using Uncertainty-Guided SSM Traverses

Sahar Dastani, Ali Bahri, Gustavo Vargas Hakim et al.

NEURIPS 2025 • arXiv:2509.22813

Variance-Based Pruning for Accelerating and Compressing Trained Networks

Uranik Berisha, Jens Mehnert, Alexandru Condurache

ICCV 2025 • arXiv:2507.12988 • 1 citation

ViT-EnsembleAttack: Augmenting Ensemble Models for Stronger Adversarial Transferability in Vision Transformers

Hanwen Cao, Haobo Lu, Xiaosen Wang et al.

ICCV 2025 • arXiv:2508.12384 • 1 citation