"latent diffusion models" Papers
72 papers found
$\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps
Junseo Park, Hyeryung Jang
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Shuai Tan, Biao Gong, Xiang Wang et al.
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models
Xinghui Li, Qichao Sun, Pengze Zhang et al.
Boosting Latent Diffusion with Perceptual Objectives
Tariq Berrada, Pietro Astolfi, Melissa Hall et al.
CADMorph: Geometry‑Driven Parametric CAD Editing via a Plan–Generate–Verify Loop
Weijian Ma, Shizhao Sun, Ruiyu Wang et al.
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang, Qiuyu Huang, Junjie Liu et al.
Diffusion Models for Attribution
Xiongren Chen, Jiuyong Li, Jixue Liu et al.
DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation
Mu Chen, Liulei Li, Wenguan Wang et al.
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
Keon Lee, Dong Won Kim, Jaehyeon Kim et al.
Dual Prompting Image Restoration with Diffusion Transformers
Dehong Kong, Fan Li, Zhixin Wang et al.
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang, Xiangtai Li, Henghui Ding et al.
FaceShot: Bring Any Character into Life
Junyao Gao, Yanan Sun, Fei Shen et al.
FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion
Haosen Yang, Adrian Bulat, Isma Hadji et al.
FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing
Yufan Ren, Zicong Jiang, Tong Zhang et al.
FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models
Haokun Chen, Hang Li, Yao Zhang et al.
FICGen: Frequency-Inspired Contextual Disentanglement for Layout-driven Degraded Image Generation
Wenzhuang Wang, Yifan Zhao, Mingcan Ma et al.
Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization
Kyle Sargent, Kyle Hsu, Justin Johnson et al.
GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration
Sudarshan Rajagopalan, Nithin Gopalakrishnan Nair, Jay Paranjape et al.
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis
Tao Han, Wanghan Xu, Junchao Gong et al.
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Sherry X. Chen, Misha Sra, Pradeep Sen
LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization
Alessio Spagnoletti, Jean Prost, Andres Almansa et al.
LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling
Li Huaqiu, Yong Wang, Tongwen Huang et al.
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
François Rozet, Ruben Ohana, Michael McCabe et al.
MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction
Yunkee Chae, Kyogu Lee
Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
Jiaqi Liu, Jichao Zhang, Paolo Rota et al.
Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation
Akshay Krishnan, Xinchen Yan, Vincent Casser et al.
OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates
Jinpei Guo, Yifei Ji, Zheng Chen et al.
Pixel Is Not a Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models
Chun-Yen Shih, Li-Xuan Peng, Jia-Wei Liao et al.
Projection-Manifold Regularized Latent Diffusion for Robust General Image Fusion
Lei Cao, Hao Zhang, Chunyu Li et al.
Promptable 3-D Object Localization with Latent Diffusion Models
Cheng-Yao Hong, Li-Heng Wang, Tyng-Luh Liu
REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
Xingjian Leng, Jaskirat Singh, Yunzhong Hou et al.
RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation
Boyuan Cao, Jiaxin Ye, Yujie Wei et al.
Reward Guided Latent Consistency Distillation
William Wang, Jiachen Li, Weixi Feng et al.
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing
Seokhyeon Hong, Chaelin Kim, Serin Yoon et al.
Seeds of Structure: Patch PCA Reveals Universal Compositional Cues in Diffusion Models
Qingsong Wang, Zhengchao Wan, Misha Belkin et al.
Silence is Golden: Leveraging Adversarial Examples to Nullify Audio Control in LDM-based Talking-Head Generation
Yuan Gan, Jiaxu Miao, Yunze Wang et al.
SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models
Zilan Wang, Junfeng Guo, Jiacheng Zhu et al.
Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
Zhihao Li, Yufei Wang, Heliang Zheng et al.
StableGuard: Towards Unified Copyright Protection and Tamper Localization in Latent Diffusion Models
Haoxin Yang, Bangzhen Liu, Xuemiao Xu et al.
Text2PDE: Latent Diffusion Models for Accessible Physics Simulation
Anthony Zhou, Zijie Li, Michael Schneier et al.
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Tianxiong Zhong, Xingye Tian, Boyuan Jiang et al.
VVRec: Reconstruction Attacks on DL-based Volumetric Video Upstreaming via Latent Diffusion Model with Gamma Distribution
Rui Lu, Bihai Zhang, Dan Wang
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
Yihong Luo, Xiaolong Chen, Xinghua Qu et al.
Your Text Encoder Can Be An Object-Level Watermarking Controller
Naresh Kumar Devulapally, Mingzhen Huang, Vishal Asnani et al.
Accelerating Image Generation with Sub-path Linear Approximation Model
Chen Xu, Tianhui Song, Weixin Feng et al.
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen, Puyuan Peng, Ami Baid et al.
Adapt and Diffuse: Sample-adaptive Reconstruction via Latent Diffusion Models
Zalan Fabian, Berk Tinaz, Mahdi Soltanolkotabi
AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error
Jonas Ricker, Denis Lukovnikov, Asja Fischer
AFreeCA: Annotation-Free Counting for All
Adriano D'Alessandro, Ali Mahdavi-Amiri, Ghassan Hamarneh
Data Augmentation via Latent Diffusion for Saliency Prediction
Bahar Aydemir, Deblina Bhattacharjee, Tong Zhang et al.