Poster "video generation" Papers

73 papers found • Page 1 of 2

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

Xiao Fu, Xian Liu, Xintao Wang et al.

ICLR 2025 arXiv:2412.07759
41 citations

Accelerating Diffusion Transformers with Token-wise Feature Caching

Chang Zou, Xuyang Liu, Ting Liu et al.

ICLR 2025 arXiv:2410.05317
69 citations

AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction

Zhen Xing, Qi Dai, Zejia Weng et al.

ICCV 2025 arXiv:2406.06465
24 citations

AnyPortal: Zero-Shot Consistent Video Background Replacement

Wenshuo Gao, Xicheng Lan, Shuai Yang

ICCV 2025 arXiv:2509.07472
2 citations

AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion

Mingzhen Sun, Weining Wang, Li et al.

CVPR 2025 arXiv:2503.07418
27 citations

Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video

Xiao Li, Qi Chen, Xiulian Peng et al.

ICCV 2025 arXiv:2509.08376
1 citation

CHORDS: Diffusion Sampling Accelerator with Multi-core Hierarchical ODE Solvers

Jiaqi Han, Haotian Ye, Puheng Li et al.

ICCV 2025 arXiv:2507.15260

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer

Jiayi Gao, Zijin Yin, Changcheng Hua et al.

CVPR 2025 arXiv:2504.02451
8 citations

CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation

Gaojie Lin, Jianwen Jiang, Chao Liang et al.

ICLR 2025
17 citations

DIVE: Taming DINO for Subject-Driven Video Editing

Yi Huang, Wei Xiong, He Zhang et al.

ICCV 2025 arXiv:2412.03347
9 citations

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation

Yuying Ge, Yizhuo Li, Yixiao Ge et al.

CVPR 2025 arXiv:2412.04432
9 citations

Dreamweaver: Learning Compositional World Models from Pixels

Junyeob Baek, Yi-Fu Wu, Gautam Singh et al.

ICLR 2025 arXiv:2501.14174
3 citations

DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference

Chong Wu, Jiawang Cao, Renjie Xu et al.

NeurIPS 2025

ECHOPulse: ECG Controlled Echocardio-gram Video Generation

Yiwei Li, Sekeun Kim, Zihao Wu et al.

ICLR 2025 arXiv:2410.03143
5 citations

Epona: Autoregressive Diffusion World Model for Autonomous Driving

Kaiwen Zhang, Zhenyu Tang, Xiaotao Hu et al.

ICCV 2025 arXiv:2506.24113
30 citations

Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling

Aram Davtyan, Leello Dadi, Volkan Cevher et al.

ICLR 2025
5 citations

Framer: Interactive Frame Interpolation

Wen Wang, Qiuyu Wang, Kecheng Zheng et al.

ICLR 2025 arXiv:2410.18978
20 citations

Free-Form Motion Control: Controlling the 6D Poses of Camera and Objects in Video Generation

Xincheng Shuai, Henghui Ding, Zhenyuan Qin et al.

ICCV 2025 arXiv:2501.01425
4 citations

Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks

Bhishma Dedhia, David Bourgin, Krishna Kumar Singh et al.

ICCV 2025 arXiv:2503.17539
1 citation

Hierarchical Flow Diffusion for Efficient Frame Interpolation

Yang Hai, Guo Wang, Tan Su et al.

CVPR 2025 arXiv:2504.00380
2 citations

Importance-Based Token Merging for Efficient Image and Video Generation

Haoyu Wu, Jingyi Xu, Hieu Le et al.

ICCV 2025 arXiv:2411.16720
7 citations

Improved Video VAE for Latent Video Diffusion Model

Pingyu Wu, Kai Zhu, Yu Liu et al.

CVPR 2025 arXiv:2411.06449
20 citations

Improving Video Generation with Human Feedback

Jie Liu, Gongye Liu, Jiajun Liang et al.

NeurIPS 2025 arXiv:2501.13918
127 citations

Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Jaihoon Kim, Taehoon Yoon, Jisung Hwang et al.

NeurIPS 2025 arXiv:2503.19385
24 citations

InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction

Yuhui Wu, Liyi Chen, Ruibin Li et al.

ICCV 2025 arXiv:2503.20287
20 citations

IRASim: A Fine-Grained World Model for Robot Manipulation

Fangqi Zhu, Hongtao Wu, Song Guo et al.

ICCV 2025 arXiv:2406.14540
22 citations

LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models

Yu Cheng, Fajie Yuan

ICCV 2025 arXiv:2503.14325
6 citations

Long Context Tuning for Video Generation

Yuwei Guo, Ceyuan Yang, Ziyan Yang et al.

ICCV 2025 arXiv:2503.10589
60 citations

MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

Quanhao Li, Zhen Xing, Rui Wang et al.

ICCV 2025 arXiv:2503.16421
20 citations

MET3R: Measuring Multi-View Consistency in Generated Images

Mohammad Asim, Christopher Wewer, Thomas Wimmer et al.

CVPR 2025 arXiv:2501.06336
44 citations

Mind the Time: Temporally-Controlled Multi-Event Video Generation

Ziyi Wu, Aliaksandr Siarohin, Willi Menapace et al.

CVPR 2025 arXiv:2412.05263
22 citations

MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent

Xinyao Liao, Xianfang Zeng, Liao Wang et al.

ICCV 2025 arXiv:2502.03207
7 citations

MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences

Canyu Zhao, Mingyu Liu, Wen Wang et al.

ICLR 2025 arXiv:2407.16655
65 citations

Neighboring Autoregressive Modeling for Efficient Visual Generation

Yefei He, Yuanyu He, Shaoxuan He et al.

ICCV 2025 arXiv:2503.10696
19 citations

One-Minute Video Generation with Test-Time Training

Jiarui Xu, Shihao Han, Karan Dalal et al.

CVPR 2025 arXiv:2504.05298
67 citations

REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents

Rui Tian, Qi Dai, Jianmin Bao et al.

ICCV 2025 arXiv:2411.13552
7 citations

RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation

Tianyi Yan, Wencheng Han, Xia Zhou et al.

NeurIPS 2025 arXiv:2509.16500
4 citations

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

Teng Hu, Jiangning Zhang, Ran Yi et al.

ICLR 2025 arXiv:2409.06633
1 citation

Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists

Bojia Zi, Penghui Ruan, Marco Chen et al.

NeurIPS 2025 arXiv:2502.06734
27 citations

ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

Hongbo Liu, Jingwen He, Yi Jin et al.

NeurIPS 2025 arXiv:2506.21356
7 citations

StableAnimator: High-Quality Identity-Preserving Human Image Animation

Shuyuan Tu, Zhen Xing, Xintong Han et al.

CVPR 2025 arXiv:2411.17697
64 citations

Stable Cinemetrics: Structured Taxonomy and Evaluation for Professional Video Generation

Agneet Chatterjee, Rahim Entezari, Maksym Zhuravinskyi et al.

NeurIPS 2025 arXiv:2509.26555

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Jensen Zhou, Hang Gao, Vikram Voleti et al.

ICCV 2025 arXiv:2503.14489
87 citations

STDD: Spatio-Temporal Dual Diffusion for Video Generation

Shuaizhen Yao, Xiaoya Zhang, Xin Liu et al.

CVPR 2025
2 citations

SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering

Byeongjun Park, Hyojun Go, Hyelin Nam et al.

ICCV 2025 arXiv:2503.12024
5 citations

SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization

Zhentao Tan, Ben Xue, Jian Jia et al.

ICCV 2025 arXiv:2412.10443
6 citations

Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation

Shuling Zhao, Fa-Ting Hong, Xiaoshui Huang et al.

CVPR 2025 arXiv:2412.00719
7 citations

Taming Teacher Forcing for Masked Autoregressive Video Generation

Deyu Zhou, Quan Sun, Yuang Peng et al.

CVPR 2025 arXiv:2501.12389
20 citations

TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation

Hongxiang Zhao, Xingchen Liu, Mutian Xu et al.

CVPR 2025 arXiv:2503.11423
22 citations

Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Zhenghao Zhang, Junchao Liao, Menghao Li et al.

CVPR 2025 arXiv:2407.21705
115 citations