"text-to-video models" Papers

12 papers found

ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation

Zongyi Li, Shujie Hu, Shujie Liu et al.

ICLR 2025 (oral) · arXiv:2410.20502
28 citations

FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation

Ariel Shaulov, Itay Hazan, Lior Wolf et al.

NeurIPS 2025 (oral) · arXiv:2506.01144
8 citations

K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Zhikai Li, Xuewen Liu, Dongrong Joe Fu et al.

CVPR 2025 · arXiv:2408.14468
10 citations

PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms

Yifei Xia, Shuchen Weng, Siqi Yang et al.

NeurIPS 2025
5 citations

Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

Yatai Ji, Jiacheng Zhang, Jie Wu et al.

ICCV 2025 · arXiv:2412.15156
10 citations

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, Yixuan Liu, Takashi Isobe et al.

CVPR 2025 (highlight) · arXiv:2412.19637
6 citations

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Rui Xie, Yinhong Liu, Penghao Zhou et al.

ICCV 2025 · arXiv:2501.02976
27 citations

VideoPhy: Evaluating Physical Commonsense for Video Generation

Hritik Bansal, Zongyu Lin, Tianyi Xie et al.

ICLR 2025 · arXiv:2406.03520
106 citations

VIRES: Video Instance Repainting via Sketch and Text Guided Generation

Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.

CVPR 2025 · arXiv:2411.16199
1 citation

Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals

Camilo Fosco, Benjamin Lahner, Bowen Pan et al.

ECCV 2024
9 citations

MEVG: Multi-event Video Generation with Text-to-Video Models

Gyeongrok Oh, Jaehwan Jeong, Sieun Kim et al.

ECCV 2024 · arXiv:2312.04086
37 citations

RoboDreamer: Learning Compositional World Models for Robot Imagination

Siyuan Zhou, Yilun Du, Jiaben Chen et al.

ICML 2024 · arXiv:2404.12377
107 citations