"text-to-video models" Papers
12 papers found (Conference)
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
Zongyi Li, Shujie Hu, Shujie Liu et al.
ICLR 2025 (oral) · arXiv:2410.20502
28 citations
FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation
Ariel Shaulov, Itay Hazan, Lior Wolf et al.
NeurIPS 2025 (oral) · arXiv:2506.01144
8 citations
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences
Zhikai Li, Xuewen Liu, Dongrong Joe Fu et al.
CVPR 2025 · arXiv:2408.14468
10 citations
PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms
Yifei Xia, Shuchen Weng, Siqi Yang et al.
NeurIPS 2025
5 citations
Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM
Yatai Ji, Jiacheng Zhang, Jie Wu et al.
ICCV 2025 · arXiv:2412.15156
10 citations
ReNeg: Learning Negative Embedding with Reward Guidance
Xiaomin Li, Yixuan Liu, Takashi Isobe et al.
CVPR 2025 (highlight) · arXiv:2412.19637
6 citations
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
Rui Xie, Yinhong Liu, Penghao Zhou et al.
ICCV 2025 · arXiv:2501.02976
27 citations
VideoPhy: Evaluating Physical Commonsense for Video Generation
Hritik Bansal, Zongyu Lin, Tianyi Xie et al.
ICLR 2025 · arXiv:2406.03520
106 citations
VIRES: Video Instance Repainting via Sketch and Text Guided Generation
Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.
CVPR 2025 · arXiv:2411.16199
1 citation
Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals
Camilo Fosco, Benjamin Lahner, Bowen Pan et al.
ECCV 2024
9 citations
MEVG: Multi-event Video Generation with Text-to-Video Models
Gyeongrok Oh, Jaehwan Jeong, Sieun Kim et al.
ECCV 2024 · arXiv:2312.04086
37 citations
RoboDreamer: Learning Compositional World Models for Robot Imagination
Siyuan Zhou, Yilun Du, Jiaben Chen et al.
ICML 2024 · arXiv:2404.12377
107 citations