Oral "vision language models" Papers
3 papers found
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.
NeurIPS 2025 | oral | arXiv:2506.03517
14 citations
MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation
Bohan Zhou, Yi Zhan, Zhongbin Zhang et al.
NeurIPS 2025 | oral | arXiv:2505.16602
3 citations
Vision Language Models are In-Context Value Learners
Yecheng Jason Ma, Joey Hejna, Chuyuan Fu et al.
ICLR 2025 | oral | arXiv:2411.04549
49 citations