"text-image alignment" Papers
13 papers found
Continuous Concepts Removal in Text-to-image Diffusion Models
Tingxu Han, Weisong Sun, Yanrong Hu et al.
NEURIPS 2025, arXiv:2412.00580
3 citations
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
Tao Zhang, Cheng Da, Kun Ding et al.
NEURIPS 2025, arXiv:2502.01051
16 citations
Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment
Ying Ba, Tianyu Zhang, Yalong Bai et al.
ICCV 2025, arXiv:2507.19002
6 citations
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Liming Jiang, Qing Yan, Yumin Jia et al.
ICCV 2025 (highlight), arXiv:2503.16418
29 citations
LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs
Jiarui Wang, Huiyu Duan, Yu Zhao et al.
ICCV 2025 (highlight), arXiv:2504.08358
16 citations
Multimodal Causal Reasoning for UAV Object Detection
Nianxin Li, Mao Ye, Lihua Zhou et al.
NEURIPS 2025
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
ICCV 2025, arXiv:2507.21391
7 citations
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lyu, Tianlin Pan, Chenyang Si et al.
ICCV 2025, arXiv:2506.07986
6 citations
SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation
Leigang Qu, Haochuan Li, Wenjie Wang et al.
CVPR 2025, arXiv:2412.05818
10 citations
DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning
Sikai Bai, Jie Zhang, Song Guo et al.
CVPR 2024, arXiv:2403.08506
28 citations
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
Chen Duan, Pei Fu, Shan Guo et al.
CVPR 2024, arXiv:2403.00303
16 citations
Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion
Xuantong Liu, Tianyang Hu, Wenjia Wang et al.
ICML 2024, arXiv:2402.16305
4 citations
Text-Image Alignment for Diffusion-Based Perception
Neehar Kondapaneni, Markus Marks, Manuel Knott et al.
CVPR 2024, arXiv:2310.00031
53 citations