Poster "zero-shot capabilities" Papers
7 papers found
Advancing Compositional Awareness in CLIP with Efficient Fine-Tuning
Amit Peleg, Naman Deep Singh, Matthias Hein
NeurIPS 2025, arXiv:2505.24424
2 citations
Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)
Subba Reddy Oota, Akshett Rai Jindal, Ishani Mondal et al.
ICLR 2025, arXiv:2505.20029
5 citations
ProSAM: Enhancing the Robustness of SAM-based Visual Reference Segmentation with Probabilistic Prompts
Xiaoqi Wang, Clint Sebastian, Wenbin He et al.
ICCV 2025, arXiv:2506.21835
ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object
Zhe Shan, Yang Liu, Lei Zhou et al.
CVPR 2025, arXiv:2503.12006
16 citations
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
Ziang Yan, Zhilin Li, Yinan He et al.
CVPR 2025, arXiv:2412.19326
20 citations
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
Tong Shao, Zhuotao Tian, Hang Zhao et al.
ECCV 2024, arXiv:2407.08268
47 citations
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli et al.
ECCV 2024, arXiv:2408.15660
6 citations