"text-to-image generation" Papers

222 papers found

FineLIP: Extending CLIP’s Reach via Fine-Grained Alignment with Longer Text Inputs

Mothilal Asokan, Kebin Wu, Fatima Albreiki

CVPR 2025 • arXiv:2504.01916 • 15 citations

Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution

Qihao Liu, Xi Yin, Alan L. Yuille et al.

CVPR 2025 • highlight • arXiv:2412.15213 • 12 citations

Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation

Xiaoying Xing, Avinab Saha, Junfeng He et al.

CVPR 2025 • highlight • arXiv:2501.06481 • 4 citations

FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers

Yanbing Zhang, Zhe Wang, Qin Zhou et al.

ICCV 2025 • arXiv:2507.15249 • 1 citation

Free-Lunch Color-Texture Disentanglement for Stylized Image Generation

Jiang Qin, Alexandra Gomez-Villa, Senmao Li et al.

NeurIPS 2025 • arXiv:2503.14275 • 3 citations

From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging

Tao Liu, Dafeng Zhang, Gengchen Li et al.

NeurIPS 2025 • arXiv:2506.20977 • 1 citation

Goku: Flow Based Video Generative Foundation Models

Shoufa Chen, Chongjian Ge, Yuqi Zhang et al.

CVPR 2025 • highlight • arXiv:2502.04896 • 54 citations

Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models

Die Chen, Zhiwen Li, Mingyuan Fan et al.

ICLR 2025 • arXiv:2408.01014 • 8 citations

Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation

Mingyuan Zhou, Zhendong Wang, Huangjie Zheng et al.

ICLR 2025 • arXiv:2406.01561 • 4 citations

Halton Scheduler for Masked Generative Image Transformer

Victor Besnier, Mickael Chen, David Hurych et al.

ICLR 2025 • arXiv:2503.17076 • 24 citations

Hand1000: Generating Realistic Hands from Text with Only 1,000 Images

Haozhuo Zhang, Bin Zhu, Yu Cao et al.

AAAI 2025 • paper • arXiv:2408.15461 • 7 citations

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

Jiazi Bu, Pengyang Ling, Yujie Zhou et al.

NeurIPS 2025 • arXiv:2504.06232 • 8 citations

ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning

Jiaqi Liao, Zhengyuan Yang, Linjie Li et al.

ICCV 2025 • arXiv:2503.19312 • 22 citations

ImgEdit: A Unified Image Editing Dataset and Benchmark

Yang Ye, Xianyi He, Zongjian Li et al.

NeurIPS 2025 • arXiv:2505.20275 • 98 citations

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Melissa Hall, Michal Drozdzal, Oscar Mañas et al.

ICLR 2025 • arXiv:2403.17804 • 71 citations

Infinity∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Jian Han, Jinlai Liu, Yi Jiang et al.

CVPR 2025 • arXiv:2412.04431 • 201 citations

Information Theoretic Text-to-Image Alignment

Chao Wang, Giulio Franzese, Alessandro Finamore et al.

ICLR 2025 • arXiv:2405.20759 • 4 citations

Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

Sherry X. Chen, Misha Sra, Pradeep Sen

CVPR 2025 • arXiv:2503.18406 • 4 citations

Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning

Kaihang Pan, Yang Wu, Wendong Bu et al.

NeurIPS 2025 • arXiv:2506.01480 • 7 citations

Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP

Junsung Park, Jungbeom Lee, Jongyoon Song et al.

ICCV 2025 • arXiv:2501.10913 • 14 citations

Language-Guided Image Tokenization for Generation

Kaiwen Zha, Lijun Yu, Alireza Fathi et al.

CVPR 2025 • arXiv:2412.05796 • 25 citations

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Chaehun Shin, Jooyoung Choi, Heeseung Kim et al.

CVPR 2025 • arXiv:2411.15466 • 37 citations

LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending

Jian Jin, Zhenbo Yu, Yang Shen et al.

CVPR 2025 • highlight • arXiv:2503.06956 • 6 citations

LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

Yuyao Zhang, Jinghao Li, Yu-Wing Tai

NeurIPS 2025 • arXiv:2504.00010 • 7 citations

Learning Few-Step Diffusion Models by Trajectory Distribution Matching

Yihong Luo, Tianyang Hu, Jiacheng Sun et al.

ICCV 2025 • arXiv:2503.06674 • 13 citations

Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models

Lin Zhu, Xinbing Wang, Chenghu Zhou et al.

ICLR 2025 • arXiv:2502.07466

LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation

Mushui Liu, Yuhang Ma, Zhen Yang et al.

AAAI 2025 • paper • arXiv:2407.00737 • 33 citations

LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs

Jiarui Wang, Huiyu Duan, Yu Zhao et al.

ICCV 2025 • highlight • arXiv:2504.08358 • 16 citations

LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation

Farzad Farhadzadeh, Debasmit Das, Shubhankar Borse et al.

ICLR 2025 • arXiv:2501.16559 • 6 citations

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

Qi Qin, Le Zhuo, Yi Xin et al.

ICCV 2025 • arXiv:2503.21758 • 58 citations

Make It Count: Text-to-Image Generation with an Accurate Number of Objects

Lital Binyamin, Yoad Tewel, Hilit Segev et al.

CVPR 2025 • arXiv:2406.10210 • 33 citations

MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation

Mingcheng Li, Xiaolu Hou, Ziyang Liu et al.

CVPR 2025 • arXiv:2505.02648 • 12 citations

Measuring And Improving Engagement of Text-to-Image Generation Models

Varun Khurana, Yaman Singla, Jayakumar Subramanian et al.

ICLR 2025 • 2 citations

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025 • highlight • arXiv:2412.00782 • 6 citations

Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

Kunjun Li, Zigeng Chen, Cheng-Yen Yang et al.

NeurIPS 2025 • arXiv:2505.19602 • 9 citations

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Zhaorun Chen, Zichen Wen, Yichao Du et al.

NeurIPS 2025 • arXiv:2407.04842 • 60 citations

Multi-Group Proportional Representations for Text-to-Image Models

Sangwon Jung, Alex Oesterling, Claudio Mayrink Verdun et al.

CVPR 2025 • arXiv:2505.24023 • 2 citations

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025 • arXiv:2507.21391 • 7 citations

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025 • arXiv:2505.01428 • 5 citations

Neighboring Autoregressive Modeling for Efficient Visual Generation

Yefei He, Yuanyu He, Shaoxuan He et al.

ICCV 2025 • arXiv:2503.10696 • 19 citations

NL-Eye: Abductive NLI For Images

Mor Ventura, Michael Toker, Nitay Calderon et al.

ICLR 2025 • arXiv:2410.02613 • 3 citations

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

Yunhong Min, Daehyeon Choi, Kyeongmin Yeo et al.

NeurIPS 2025 • arXiv:2503.22194 • 3 citations

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 • arXiv:2501.12381 • 3 citations

Penalizing Boundary Activation for Object Completeness in Diffusion Models

Haoyang Xu, Tianhao Zhao, Sibei Yang et al.

ICCV 2025 • arXiv:2509.16968

Personalized Preference Fine-tuning of Diffusion Models

Meihua Dang, Anikait Singh, Linqi Zhou et al.

CVPR 2025 • arXiv:2501.06655 • 15 citations

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Kwanyoung Kim, Byeongsu Sim

ICCV 2025 • arXiv:2503.07677 • 1 citation

PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation

Ziyan Wang, Sizhe Wei, Xiaoming Huo et al.

NeurIPS 2025 • arXiv:2502.08106 • 1 citation

Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters

Yuan Wang, Ouxiang Li, Tingting Mu et al.

CVPR 2025 • arXiv:2412.06143 • 18 citations

Precise Parameter Localization for Textual Generation in Diffusion Models

Łukasz Staniszewski, Bartosz Cywiński, Franziska Boenisch et al.

ICLR 2025 • arXiv:2502.09935 • 4 citations

Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression

Dohyun Kim, Sehwan Park, GeonHee Han et al.

CVPR 2025 • arXiv:2504.02011 • 1 citation