"temporal coherence" Papers
34 papers found
AnimateAnything: Consistent and Controllable Animation for Video Generation
Guojun Lei, Chi Wang, Rong Zhang et al.
AnyPortal: Zero-Shot Consistent Video Background Replacement
Wenshuo Gao, Xicheng Lan, Shuai Yang
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Mingzhen Sun, Weining Wang, Li et al.
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations
Xiang Xu, Lingdong Kong, Song Wang et al.
CoMotion: Concurrent Multi-person 3D Motion
Alejandro Newell, Peiyun Hu, Lahav Lipson et al.
Diffusion-based 3D Hand Motion Recovery with Intuitive Physics
Yufei Zhang, Zijun Cui, Jeffrey Kephart et al.
DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs
Jiahe Zhao, Rongkun Zheng, Yi Wang et al.
Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration
Lianxin Xie, Bingbing Zheng, Wen Xue et al.
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov, Di Chang, Minh Tran et al.
Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction
Huiwon Jang, Sihyun Yu, Jinwoo Shin et al.
FADE: Frequency-Aware Diffusion Model Factorization for Video Editing
Yixuan Zhu, Haolin Wang, Shilin Ma et al.
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
Boyang Wang, Xuweiyi Chen, Matheus Gadelha et al.
GAS: Generative Avatar Synthesis from a Single Image
Yixing Lu, Junting Dong, YoungJoong Kwon et al.
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
Xiaojuan Wang, Boyang Zhou, Brian Curless et al.
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
Tian-Xing Xu, Xiangjun Gao, Wenbo Hu et al.
GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection
Xiaocan Chen, Qilin Yin, Jiarui Liu et al.
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise
Ryan Burgert, Yuancheng Xu, Wenqi Xian et al.
Improve Temporal Reasoning in Multimodal Large Language Models via Video Contrastive Decoding
Daiqing Qi, Dongliang Guo, Hanzhang Yuan et al.
Mask²DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors
Yanrui Bin, Wenbo Hu, Haoyuan Wang et al.
Smooth Regularization for Efficient Video Recognition
Gil Goldman, Raja Giryes, Mahadev Satyanarayanan
StateSpaceDiffuser: Bringing Long Context to Diffusion World Models
Nedko Savov, Naser Kazemi, Deheng Zhang et al.
SVLTA: Benchmarking Vision-Language Temporal Alignment via Synthetic Video Situation
Hao Du, Bo Wu, Yan Lu et al.
SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models
Hung Nguyen, Quang Qui-Vinh Nguyen, Khoi Nguyen et al.
TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer
Yang Liu, Chuanchen Luo, Zimo Tang et al.
Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation
Guy Yariv, Yuval Kirstain, Amit Zohar et al.
Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation
Hyeonho Jeong, Chun-Hao P. Huang, Jong Chul Ye et al.
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
Shijie Zhou, Alexander Vilesov, Xuehai He et al.
DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video
Huiqiang Sun, Xingyi Li, Liao Shen et al.
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
Chao Xu, Yang Liu, Jiazheng Xing et al.
FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving
Xingtai Gui, Tengteng Huang, Haonan Shao et al.
Object-Centric Diffusion for Efficient Video Editing
Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.
Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement
Lingyu Zhu, Wenhan Yang, Baoliang Chen et al.
WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models
Zijian He, Peixin Chen, Guangrun Wang et al.