Poster "latent diffusion models" Papers
56 papers found
$\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
Junseo Park, Hyeryung Jang
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
Xinghui Li, Qichao Sun, Pengze Zhang et al.
Boosting Latent Diffusion with Perceptual Objectives
Tariq Berrada, Pietro Astolfi, Melissa Hall et al.
CADMorph: Geometry-Driven Parametric CAD Editing via a Plan–Generate–Verify Loop
Weijian Ma, Shizhao Sun, Ruiyu Wang et al.
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang, Qiuyu Huang, Junjie Liu et al.
DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation
Mu Chen, Liulei Li, Wenguan Wang et al.
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
Keon Lee, Dong Won Kim, Jaehyeon Kim et al.
Dual Prompting Image Restoration with Diffusion Transformers
Dehong Kong, Fan Li, Zhixin Wang et al.
FaceShot: Bring Any Character into Life
Junyao Gao, Yanan Sun, Fei Shen et al.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji et al.
FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing
Yufan Ren, Zicong Jiang, Tong Zhang et al.
FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models
Haokun Chen, Hang Li, Yao Zhang et al.
FICGen: Frequency-Inspired Contextual Disentanglement for Layout-driven Degraded Image Generation
Wenzhuang Wang, Yifan Zhao, Mingcan Ma et al.
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent, Kyle Hsu, Justin Johnson et al.
GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration
Sudarshan Rajagopalan, Nithin Gopalakrishnan Nair, Jay Paranjape et al.
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis
Tao Han, Wanghan Xu, Junchao Gong et al.
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Sherry X. Chen, Misha Sra, Pradeep Sen
LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization
Alessio Spagnoletti, Jean Prost, Andres Almansa et al.
LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling
Li Huaqiu, Yong Wang, Tongwen Huang et al.
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
François Rozet, Ruben Ohana, Michael McCabe et al.
MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction
Yunkee Chae, Kyogu Lee
Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
Jiaqi Liu, Jichao Zhang, Paolo Rota et al.
Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation
Akshay Krishnan, Xinchen Yan, Vincent Casser et al.
OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates
Jinpei Guo, Yifei Ji, Zheng Chen et al.
Projection-Manifold Regularized Latent Diffusion for Robust General Image Fusion
Lei Cao, Hao Zhang, Chunyu Li et al.
Promptable 3-D Object Localization with Latent Diffusion Models
Cheng-Yao Hong, Li-Heng Wang, Tyng-Luh Liu
REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
Xingjian Leng, Jaskirat Singh, Yunzhong Hou et al.
Reward Guided Latent Consistency Distillation
William Wang, Jiachen Li, Weixi Feng et al.
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing
Seokhyeon Hong, Chaelin Kim, Serin Yoon et al.
Seeds of Structure: Patch PCA Reveals Universal Compositional Cues in Diffusion Models
Qingsong Wang, Zhengchao Wan, Misha Belkin et al.
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Yuan Gan, Jiaxu Miao, Yunze Wang et al.
SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models
Zilan Wang, Junfeng Guo, Jiacheng Zhu et al.
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
Zhihao Li, Yufei Wang, Heliang Zheng et al.
StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models
Haoxin Yang, Bangzhen Liu, Xuemiao Xu et al.
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
Yihong Luo, Xiaolong Chen, Xinghua Qu et al.
Your Text Encoder Can Be An Object-Level Watermarking Controller
Naresh Kumar Devulapally, Mingzhen Huang, Vishal Asnani et al.
Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu, Tianhui Song, Weixin Feng et al.
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen, Puyuan Peng, Ami Baid et al.
AFreeCA: Annotation-Free Counting for All
Adriano D'Alessandro, Ali Mahdavi-Amiri, Ghassan Hamarneh
Data Augmentation via Latent Diffusion for Saliency Prediction
Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang et al.
Dataset Enhancement with Instance-Level Augmentations
Orest Kupyn, Christian Rupprecht
DiffBIR: Toward Blind Image Restoration with Generative Diffusion Prior
Xinqi Lin, Jingwen He, Ziyan Chen et al.
DiffusionPen: Towards Controlling the Style of Handwritten Text Generation
Konstantina Nikolaidou, George Retsinas, Giorgos Sfikas et al.
Disguised Copyright Infringement of Latent Diffusion Models
Yiwei Lu, Matthew Yang, Zuoqiu Liu et al.
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion
Kiran Chhatre, Radek Danecek, Nikos Athanasiou et al.
FMBoost: Boosting Latent Diffusion with Flow Matching
Johannes Schusterbauer-Fischer, Ming Gui, Pingchuan Ma et al.
GIVT: Generative Infinite-Vocabulary Transformers
Michael Tschannen, Cian Eastwood, Fabian Mentzer
Hyperbolic Geometric Latent Diffusion Model for Graph Generation
Xingcheng Fu, Yisen Gao, Yuecen Wei et al.
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
Yuanhao Zhai, Kevin Lin, Linjie Li et al.
Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance
Reyhane Askari Hemmat, Melissa Hall, Alicia Yi Sun et al.