Jiasen Lu
Affiliations (1)
Allen Institute of AI
11 papers, 1,326 total citations
papers (11)
12-in-1: Multi-Task Vision and Language Representation Learning
CVPR 2020 (arXiv), 500 citations

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action
CVPR 2024 (arXiv), 280 citations

MERLOT Reserve: Neural Script Knowledge Through Vision and Language and Sound
CVPR 2022 (arXiv), 241 citations

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025 (arXiv), 111 citations

Spatially Aware Multimodal Transformers for TextVQA
ECCV 2020 (arXiv), 95 citations

One Diffusion to Generate Them All
CVPR 2025 (arXiv), 34 citations

STIV: Scalable Text and Image Conditioned Video Generation
ICCV 2025 (arXiv), 21 citations

SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding
COLM 2025 (arXiv), 17 citations

Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data
NeurIPS 2020 (arXiv), 15 citations

MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
ICLR 2025, 12 citations

Container: Context Aggregation Networks
NeurIPS 2021, 0 citations