"text-image alignment" Papers

13 papers found

Continuous Concepts Removal in Text-to-image Diffusion Models

Tingxu Han, Weisong Sun, Yanrong Hu et al.

NeurIPS 2025 | arXiv:2412.00580 | 3 citations

Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization

Tao Zhang, Cheng Da, Kun Ding et al.

NeurIPS 2025 | arXiv:2502.01051 | 16 citations

Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment

Ying Ba, Tianyu Zhang, Yalong Bai et al.

ICCV 2025 | arXiv:2507.19002 | 6 citations

InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

Liming Jiang, Qing Yan, Yumin Jia et al.

ICCV 2025 (Highlight) | arXiv:2503.16418 | 29 citations

LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs

Jiarui Wang, Huiyu Duan, Yu Zhao et al.

ICCV 2025 (Highlight) | arXiv:2504.08358 | 16 citations

Multimodal Causal Reasoning for UAV Object Detection

Nianxin Li, Mao Ye, Lihua Zhou et al.

NeurIPS 2025

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025 | arXiv:2507.21391 | 7 citations

Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Zhengyao Lyu, Tianlin Pan, Chenyang Si et al.

ICCV 2025 | arXiv:2506.07986 | 6 citations

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Leigang Qu, Haochuan Li, Wenjie Wang et al.

CVPR 2025 | arXiv:2412.05818 | 10 citations

DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning

Sikai Bai, Jie Zhang, Song Guo et al.

CVPR 2024 | arXiv:2403.08506 | 28 citations

ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting

Chen Duan, Pei Fu, Shan Guo et al.

CVPR 2024 | arXiv:2403.00303 | 16 citations

Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion

Xuantong Liu, Tianyang Hu, Wenjia Wang et al.

ICML 2024 | arXiv:2402.16305 | 4 citations

Text-Image Alignment for Diffusion-Based Perception

Neehar Kondapaneni, Markus Marks, Manuel Knott et al.

CVPR 2024 | arXiv:2310.00031 | 53 citations