"multimodal joint modeling" Papers
2 papers found
Conference
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning
Le Zhuo, Liangbing Zhao, Sayak Paul et al.
ICCV 2025
arXiv:2504.16080
32 citations
VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
Yichao Shen, Fangyun Wei, Zhiying Du et al.
NeurIPS 2025
arXiv:2512.06963
5 citations