"temporal coherence" Papers

34 papers found

AnimateAnything: Consistent and Controllable Animation for Video Generation

Guojun Lei, Chi Wang, Rong Zhang et al.

CVPR 2025 · arXiv:2411.10836
24 citations

AnyPortal: Zero-Shot Consistent Video Background Replacement

Wenshuo Gao, Xicheng Lan, Shuai Yang

ICCV 2025 · arXiv:2509.07472
2 citations

AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion

Mingzhen Sun, Weining Wang, Li et al.

CVPR 2025 · arXiv:2503.07418
27 citations

Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations

Xiang Xu, Lingdong Kong, Song Wang et al.

ICCV 2025 · arXiv:2507.05260
7 citations

CoMotion: Concurrent Multi-person 3D Motion

Alejandro Newell, Peiyun Hu, Lahav Lipson et al.

ICLR 2025 (oral) · arXiv:2504.12186
2 citations

Diffusion-based 3D Hand Motion Recovery with Intuitive Physics

Yufei Zhang, Zijun Cui, Jeffrey Kephart et al.

ICCV 2025 · arXiv:2508.01835
1 citation

DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs

Jiahe Zhao, Rongkun Zheng, Yi Wang et al.

ICCV 2025 · arXiv:2507.10302

Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration

Lianxin Xie, Bingbing Zheng, Wen Xue et al.

AAAI 2025 · arXiv:2501.09960

DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion

Maksim Siniukov, Di Chang, Minh Tran et al.

ICCV 2025 · arXiv:2504.04010
3 citations

Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction

Huiwon Jang, Sihyun Yu, Jinwoo Shin et al.

CVPR 2025 · arXiv:2411.14762
4 citations

FADE: Frequency-Aware Diffusion Model Factorization for Video Editing

Yixuan Zhu, Haolin Wang, Shilin Ma et al.

CVPR 2025 · arXiv:2506.05934
3 citations

Frame In-N-Out: Unbounded Controllable Image-to-Video Generation

Boyang Wang, Xuweiyi Chen, Matheus Gadelha et al.

NeurIPS 2025 (oral) · arXiv:2505.21491
5 citations

GAS: Generative Avatar Synthesis from a Single Image

Yixing Lu, Junting Dong, YoungJoong Kwon et al.

ICCV 2025 · arXiv:2502.06957
9 citations

Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation

Xiaojuan Wang, Boyang Zhou, Brian Curless et al.

ICLR 2025 · arXiv:2408.15239
30 citations

GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

Tian-Xing Xu, Xiangjun Gao, Wenbo Hu et al.

ICCV 2025 · arXiv:2504.01016
20 citations

GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection

Xiaocan Chen, Qilin Yin, Jiarui Liu et al.

AAAI 2025 · arXiv:2412.13656
1 citation

Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise

Ryan Burgert, Yuancheng Xu, Wenqi Xian et al.

CVPR 2025 · arXiv:2501.08331
62 citations

Improve Temporal Reasoning in Multimodal Large Language Models via Video Contrastive Decoding

Daiqing Qi, Dongliang Guo, Hanzhang Yuan et al.

NeurIPS 2025 (oral)

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025
8 citations

NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors

Yanrui Bin, Wenbo Hu, Haoyuan Wang et al.

ICCV 2025 · arXiv:2504.11427
9 citations

Smooth Regularization for Efficient Video Recognition

Gil Goldman, Raja Giryes, Mahadev Satyanarayanan

NeurIPS 2025 (oral) · arXiv:2511.20928

StateSpaceDiffuser: Bringing Long Context to Diffusion World Models

Nedko Savov, Naser Kazemi, Deheng Zhang et al.

NeurIPS 2025 (oral) · arXiv:2505.22246
8 citations

SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation

Hao Du, Bo Wu, Yan Lu et al.

CVPR 2025 · arXiv:2504.05925
1 citation

SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models

Hung Nguyen, Quang Qui-Vinh Nguyen, Khoi Nguyen et al.

AAAI 2025 · arXiv:2412.10178
7 citations

TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer

Yang Liu, Chuanchen Luo, Zimo Tang et al.

NeurIPS 2025 (oral) · arXiv:2506.18904
5 citations

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

Guy Yariv, Yuval Kirstain, Amit Zohar et al.

CVPR 2025 · arXiv:2501.03059
14 citations

Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation

Hyeonho Jeong, Chun-Hao P. Huang, Jong Chul Ye et al.

CVPR 2025 · arXiv:2412.06016
33 citations

VLM4D: Towards Spatiotemporal Awareness in Vision Language Models

Shijie Zhou, Alexander Vilesov, Xuehai He et al.

ICCV 2025 · arXiv:2508.02095
16 citations

DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video

Huiqiang Sun, Xingyi Li, Liao Shen et al.

CVPR 2024 · arXiv:2403.10103
18 citations

FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio

Chao Xu, Yang Liu, Jiazheng Xing et al.

CVPR 2024 · arXiv:2403.01901
18 citations

FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving

Xingtai Gui, Tengteng Huang, Haonan Shao et al.

ECCV 2024 · arXiv:2404.12867
7 citations

Object-Centric Diffusion for Efficient Video Editing

Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.

ECCV 2024 · arXiv:2401.05735
23 citations

Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement

Lingyu Zhu, Wenhan Yang, Baoliang Chen et al.

ECCV 2024 · arXiv:2408.12316
10 citations

WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models

Zijian He, Peixin Chen, Guangrun Wang et al.

ECCV 2024 · arXiv:2407.10625
21 citations