TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

5 citations

arXiv:2412.01137

#632 of 2701 papers in ICCV 2025

Top Authors

Xingsong Ye, Yongkun Du, Yunbo Tao, Zhineng Chen

Topics

scene text recognition, diffusion-based data synthesis, region-centric text generation, character-aware diffusion, synthetic training data, text image generation, instance-level text synthesis, position-glyph enhancement

Abstract

Scene text recognition (STR) is held back either by synthetic training data that lacks realism or by the difficulty of collecting enough high-quality real-world data, which limits the effectiveness of trained models. Meanwhile, despite producing holistically appealing text images, diffusion-based visual text generation methods struggle to synthesize accurate and realistic instance-level text at scale. To tackle this, we introduce TextSSR, a novel pipeline for Synthesizing Scene Text Recognition training data. TextSSR targets three key properties of synthesis: accuracy, realism, and scalability. It achieves accuracy through region-centric text generation with position-glyph enhancement, which ensures proper character placement. It maintains realism by guiding style and appearance generation using contextual hints from surrounding text or background. This character-aware diffusion architecture offers precise character-level control and preserves semantic coherence without relying on natural language prompts. As a result, TextSSR supports large-scale generation through combinatorial text permutations. Building on this pipeline, we present TextSSR-F, a dataset of 3.55 million quality-screened text instances. Extensive experiments show that STR models trained on TextSSR-F outperform those trained on existing synthetic datasets by clear margins on common benchmarks, and further improvements are observed when mixed with real-world training data. Code is available at https://github.com/YesianRohn/TextSSR.
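
The scalability claim in the abstract rests on combinatorial text permutations: because generation is conditioned on characters rather than on a natural language prompt, one annotated region can be paired with many target strings. The sketch below only illustrates that counting argument and is not the TextSSR implementation. The names `candidate_texts`, `sample_targets`, and `plan_synthesis`, the region dictionary layout, and the choice to keep the target the same length as the source text are all assumptions made for illustration.

```python
import itertools
import random


def candidate_texts(charset, length):
    """Lazily yield every string of `length` characters drawn from `charset`."""
    for chars in itertools.product(charset, repeat=length):
        yield "".join(chars)


def sample_targets(charset, length, k, seed=0):
    """Draw up to k distinct strings without enumerating the whole space."""
    rng = random.Random(seed)
    total = len(charset) ** length      # size of the combinatorial space
    k = min(k, total)                   # cannot draw more than exist
    seen = set()
    while len(seen) < k:
        seen.add("".join(rng.choice(charset) for _ in range(length)))
    return sorted(seen)


def plan_synthesis(regions, charset, per_region, seed=0):
    """Pair each annotated region with new target strings.

    Assumption: the target keeps the source text's length so it fits the box.
    """
    plan = []
    for i, region in enumerate(regions):
        length = len(region["text"])
        for text in sample_targets(charset, length, per_region, seed + i):
            plan.append(
                {"box": region["box"], "source": region["text"], "target": text}
            )
    return plan


if __name__ == "__main__":
    # Hypothetical annotations: a bounding box and the text it contains.
    regions = [
        {"box": (10, 20, 90, 40), "text": "OPEN"},
        {"box": (15, 60, 70, 80), "text": "24h"},
    ]
    for job in plan_synthesis(regions, "abcdefghij0123456789", per_region=3):
        print(job)
```

With a charset of size c and a text of length n there are c^n candidate strings per region, so even a modest set of annotated regions yields a very large pool of synthesis jobs. Any real pipeline would still need to filter the generated images for quality, as the abstract's mention of quality screening suggests.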

Citation History

Jan 24, 2026

Jan 27, 2026

Feb 3, 2026: 3 (+1)

Feb 13, 2026: 5 (+2)