DiffDoctor: Diagnosing Image Diffusion Models Before Treating

3citations

arXiv:2501.12382

citations

#875

in ICCV 2025

of 2701 papers

Top Authors

Data Points

Top Authors

Yiyang Wang Xi Chen Xiaogang Xu Sihui Ji Yu Liu Yujun Shen Hengshuang Zhao

Abstract

In spite of recent progress, image diffusion models still produce artifacts. A common solution is to leverage the feedback provided by quality assessment systems or human annotators to optimize the model, where images are generally rated in their entirety. In this work, we believe problem-solving starts with identification, yielding the request that the model should be aware of not just the presence of defects in an image, but their specific locations. Motivated by this, we propose DiffDoctor, a two-stage pipeline to assist image diffusion models in generating fewer artifacts. Concretely, the first stage targets developing a robust artifact detector, for which we collect a dataset of over 1M flawed synthesized images and set up an efficient human-in-the-loop annotation process, incorporating a carefully designed class-balance strategy. The learned artifact detector is then involved in the second stage to optimize the diffusion model by providing pixel-level feedback. Extensive experiments on text-to-image diffusion models demonstrate the effectiveness of our artifact detector as well as the soundness of our diagnose-then-treat design.

Citation History

Jan 24, 2026

Jan 26, 2026

Jan 28, 2026

Feb 13, 2026

3+3

Feb 13, 2026