DiffDoctor: Diagnosing Image Diffusion Models Before Treating

3
citations
#875
in ICCV 2025
of 2701 papers
7
Top Authors
7
Data Points

Abstract

In spite of recent progress, image diffusion models still produce artifacts. A common solution is to leverage the feedback provided by quality assessment systems or human annotators to optimize the model, where images are generally rated in their entirety. In this work, we believe problem-solving starts with identification, yielding the request that the model should be aware of not just the presence of defects in an image, but their specific locations. Motivated by this, we propose DiffDoctor, a two-stage pipeline to assist image diffusion models in generating fewer artifacts. Concretely, the first stage targets developing a robust artifact detector, for which we collect a dataset of over 1M flawed synthesized images and set up an efficient human-in-the-loop annotation process, incorporating a carefully designed class-balance strategy. The learned artifact detector is then involved in the second stage to optimize the diffusion model by providing pixel-level feedback. Extensive experiments on text-to-image diffusion models demonstrate the effectiveness of our artifact detector as well as the soundness of our diagnose-then-treat design.

Citation History

Jan 24, 2026
0
Jan 26, 2026
0
Jan 26, 2026
0
Jan 28, 2026
0
Feb 13, 2026
3+3
Feb 13, 2026
3
Feb 13, 2026
3