"video-and-language understanding" Papers
2 papers found
MuLTI: Efficient Video-and-Language Understanding with Text-Guided MultiWay-Sampler and Multiple Choice Modeling
Jiaqi Xu, Bo Liu, Yunkuo Chen et al.
AAAI 2024 | arXiv:2303.05707
2 citations
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
Tongtong Yuan, Xuange Zhang, Kun Liu et al.
CVPR 2024 | arXiv:2309.13925
45 citations