"visual-textual alignment" Papers
8 papers found
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang, Hang Zhang, Xin Li et al.
ICCV 2025, highlight, arXiv:2501.00958
5 citations
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?
Shouwei Ruan, Hanqing Liu, Yao Huang et al.
ICCV 2025, highlight, arXiv:2412.03002
2 citations
Anomize: Better Open Vocabulary Video Anomaly Detection
Fei Li, Wenxuan Liu, Jingjing Chen et al.
CVPR 2025, arXiv:2503.18094
5 citations
MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
Ylli Sadikaj, Hongkuan Zhou, Lavdim Halilaj et al.
ICCV 2025, arXiv:2504.06740
9 citations
Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering
Sheng Liu, Haotian Ye, James Y Zou
ICLR 2025
29 citations
Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection
Sung Jin Um, Dongjin Kim, Sangmin Lee et al.
AAAI 2025, paper, arXiv:2501.02504
4 citations
Visual Alignment Pre-training for Sign Language Translation
Peiqi Jiao, Yuecong Min, Xilin Chen
ECCV 2024
17 citations
X-Pose: Detecting Any Keypoints
Jie Yang, Ailing Zeng, Ruimao Zhang et al.
ECCV 2024, arXiv:2310.08530
14 citations