"visual prompting" Papers

19 papers found

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model

Benlin Liu, Yuhao Dong, Yiqin Wang et al.

CVPR 2025, arXiv:2408.00754

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

Weifeng Lin, Xinyu Wei, Ruichuan An et al.

ICLR 2025, arXiv:2403.20271

Enhancing Visual Prompting through Expanded Transformation Space and Overfitting Mitigation

Shohei Enomoto

NeurIPS 2025, arXiv:2510.07823

Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms

Zhangheng Li, Keen You, Haotian Zhang et al.

ICLR 2025, arXiv:2410.18967

Selective Visual Prompting in Vision Mamba

Yifeng Yao, Zichen Liu, Zhenyu Cui et al.

AAAI 2025 (paper), arXiv:2412.08947

StyleKeeper: Prevent Content Leakage using Negative Visual Query Guidance

Jaeseok Jeong, Junho Kim, Youngjung Uh et al.

ICCV 2025, arXiv:2510.06827

WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning

Xiangyu Zhao, Zhiwang Zhou, Wenlong Zhang et al.

ICLR 2025 (oral), arXiv:2411.05420

Attention Prompting on Image for Large Vision-Language Models

Runpeng Yu, Weihao Yu, Xinchao Wang

ECCV 2024, arXiv:2409.17143

DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Jiaxin Zhang, Dezhi Peng, Chongyu Liu et al.

CVPR 2024, arXiv:2405.04408

Encapsulating Knowledge in One Prompt

Qi Li, Runpeng Yu, Xinchao Wang

ECCV 2024, arXiv:2407.11902

Exploring the Transferability of Visual Prompting for Multimodal Large Language Models

Yichi Zhang, Yinpeng Dong, Siyuan Zhang et al.

CVPR 2024 (highlight), arXiv:2404.11207

FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance

Jiedong Zhuang, Jiaqi Hu, Lianrui Mu et al.

ECCV 2024, arXiv:2407.05578

Finding Visual Task Vectors

Alberto Hojel, Yutong Bai, Trevor Darrell et al.

ECCV 2024, arXiv:2404.05729

Generative Multimodal Models are In-Context Learners

Quan Sun, Yufeng Cui, Xiaosong Zhang et al.

CVPR 2024, arXiv:2312.13286, 438 citations

Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning

Shibo Jie, Yehui Tang, Ning Ding et al.

ICML 2024, arXiv:2405.05615

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

Soroush Nasiriany, Fei Xia, Wenhao Yu et al.

ICML 2024, arXiv:2402.07872, 188 citations

Tokenize Anything via Prompting

Ting Pan, Lulu Tang, Xinlong Wang et al.

ECCV 2024, arXiv:2312.09128

Unifying Image Processing as Visual Prompting Question Answering

Yihao Liu, Xiangyu Chen, Xianzheng Ma et al.

ICML 2024, arXiv:2310.10513

Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach

Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal et al.

CVPR 2024, arXiv:2404.11732