"cross-modal feature fusion" Papers
3 papers found
Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection
Yuguang Yang, Tongfei Chen, Haoyu Huang et al.
ICLR 2025, arXiv:2502.16223
2 citations
Towards Real-world Event-guided Low-light Video Enhancement and Deblurring
Taewoo Kim, Jaeseok Jeong, Hoonhee Cho et al.
ECCV 2024, arXiv:2408.14916
19 citations
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo, Jingwen Chen, Yehao Li et al.
ECCV 2024, arXiv:2501.00437
2 citations