Poster "cross-attention mechanisms" Papers
15 papers found
$\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
Junseo Park, Hyeryung Jang
ICLR 2025
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
Hyogon Ryu, NaHyeon Park, Hyunjung Shim
ICLR 2025, arXiv:2501.04304
7 citations
Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning
Yu Zhang, Jialei Zhou, Xinchen Li et al.
NeurIPS 2025, arXiv:2505.19261
7 citations
Fair Generation without Unfair Distortions: Debiasing Text-to-Image Generation with Entanglement-Free Attention
Jeonghoon Park, Juyoung Lee, Chaeyeon Chung et al.
ICCV 2025, arXiv:2506.13298
3 citations
Grounding Continuous Representations in Geometry: Equivariant Neural Fields
David Wessels, David Knigge, Riccardo Valperga et al.
ICLR 2025, arXiv:2406.05753
13 citations
Improving Editability in Image Generation with Layer-wise Memory
Daneul Kim, Jaeah Lee, Jaesik Park
CVPR 2025, arXiv:2505.01079
1 citation
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing
Seokhyeon Hong, Chaelin Kim, Serin Yoon et al.
CVPR 2025, arXiv:2503.13836
14 citations
ViLU: Learning Vision-Language Uncertainties for Failure Prediction
Marc Lafon, Yannis Karmim, Julio Silva-Rodríguez et al.
ICCV 2025, arXiv:2507.07620
2 citations
AugDETR: Improving Multi-scale Learning for Detection Transformer
Jinpeng Dong, Yutong Lin, Chen Li et al.
ECCV 2024
4 citations
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
Danni Yang, Ruohan Dong, Jiayi Ji et al.
ECCV 2024, arXiv:2407.05352
9 citations
Revealing Vision-Language Integration in the Brain with Multimodal Networks
Vighnesh Subramaniam, Colin Conwell, Christopher Wang et al.
ICML 2024, arXiv:2406.14481
18 citations
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.
ECCV 2024, arXiv:2312.11897
24 citations
Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
Junyan Wang, Zhenhong Sun, Stewart Tan et al.
CVPR 2024, arXiv:2403.05239
18 citations
Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention
Jie Ren, Yaxin Li, Shenglai Zeng et al.
ECCV 2024, arXiv:2403.11052
52 citations
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
Marianna Ohanyan, Hayk Manukyan, Zhangyang Wang et al.
CVPR 2024, arXiv:2406.04032
10 citations