"vision transformers" Papers

122 papers found • Page 3 of 3

PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference

Tanvir Mahmud, Burhaneddin Yaman, Chun-Hao Liu et al.

ECCV 2024 • arXiv:2403.16020 • 7 citations

PDiscoFormer: Relaxing Part Discovery Constraints with Vision Transformers

Ananthu Aniraj, Cassio F. Dantas, Dino Ienco et al.

ECCV 2024 • arXiv:2407.04538 • 6 citations

Phase Concentration and Shortcut Suppression for Weakly Supervised Semantic Segmentation

Hoyong Kwon, Jaeseok Jeong, Sung-Hoon Yoon et al.

ECCV 2024

Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models

Hengyi Wang, Shiwei Tan, Hao Wang

ICML 2024 • arXiv:2406.12649 • 9 citations

Removing Rows and Columns of Tokens in Vision Transformer enables Faster Dense Prediction without Retraining

Diwei Su, Cheng Fei, Jianxu Luo

ECCV 2024 • 2 citations

Revealing the Dark Secrets of Extremely Large Kernel ConvNets on Robustness

Honghao Chen, Yurong Zhang, Xiaokun Feng et al.

ICML 2024 • arXiv:2407.08972 • 10 citations

Robustness Tokens: Towards Adversarial Robustness of Transformers

Brian Pulfer, Yury Belousov, Slava Voloshynovskiy

ECCV 2024 • arXiv:2503.10191

Rotation-Agnostic Image Representation Learning for Digital Pathology

Saghir Alfasly, Abubakr Shafique, Peyman Nejat et al.

CVPR 2024 • arXiv:2311.08359 • 19 citations

Sample-specific Masks for Visual Reprogramming-based Prompting

Chengyi Cai, Zesheng Ye, Lei Feng et al.

ICML 2024 (spotlight) • arXiv:2406.03150 • 13 citations

SNP: Structured Neuron-level Pruning to Preserve Attention Scores

Kyunghwan Shim, Jaewoong Yun, Shinkook Choi

ECCV 2024 • arXiv:2404.11630 • 3 citations

Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications

Zixuan Hu, Yongxian Wei, Li Shen et al.

ICML 2024 • arXiv:2510.27186 • 8 citations

Spatial Transform Decoupling for Oriented Object Detection

Hongtian Yu, Yunjie Tian, Qixiang Ye et al.

AAAI 2024 • arXiv:2308.10561 • 52 citations

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization

Xixu Hu, Runkai Zheng, Jindong Wang et al.

ECCV 2024 • arXiv:2402.03317 • 5 citations

Stitched ViTs are Flexible Vision Backbones

Zizheng Pan, Jing Liu, Haoyu He et al.

ECCV 2024 • arXiv:2307.00154 • 4 citations

Sub-token ViT Embedding via Stochastic Resonance Transformers

Dong Lao, Yangchao Wu, Tian Yu Liu et al.

ICML 2024 • arXiv:2310.03967 • 7 citations

TOP-ReID: Multi-Spectral Object Re-identification with Token Permutation

Yuhao Wang, Xuehu Liu, Pingping Zhang et al.

AAAI 2024 • arXiv:2312.09612 • 45 citations

Unlocking the Potential of Prompt-Tuning in Bridging Generalized and Personalized Federated Learning

Wenlong Deng, Christos Thrampoulidis, Xiaoxiao Li

CVPR 2024 • arXiv:2310.18285 • 21 citations

VideoMAC: Video Masked Autoencoders Meet ConvNets

Gensheng Pei, Tao Chen, Xiruo Jiang et al.

CVPR 2024 • arXiv:2402.19082 • 21 citations

Vision Transformers as Probabilistic Expansion from Learngene

Qiufeng Wang, Xu Yang, Haokun Chen et al.

ICML 2024

ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining

Dezhi Peng, Chongyu Liu, Yuliang Liu et al.

AAAI 2024 • arXiv:2306.12106 • 18 citations

xT: Nested Tokenization for Larger Context in Large Images

Ritwik Gupta, Shufan Li, Tyler Zhu et al.

ICML 2024 • arXiv:2403.01915 • 8 citations

Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers

Hongjie Wang, Bhishma Dedhia, Niraj Jha

CVPR 2024 • arXiv:2305.17328 • 61 citations