Devi Parikh
19 papers
2,988 total citations
papers (19)
Make-a-Scene: Scene-Based Text-to-Image Generation with Human Priors
ECCV 2022, arXiv, 608 citations
12-in-1: Multi-Task Vision and Language Representation Learning
CVPR 2020, arXiv, 500 citations
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
ECCV 2022, arXiv, 276 citations
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
ECCV 2020, arXiv, 262 citations
SpaText: Spatio-Textual Representation for Controllable Image Generation
CVPR 2023, arXiv, 253 citations
Emu Edit: Precise Image Editing via Recognition and Generation Tasks
CVPR 2024, arXiv, 250 citations
KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA
CVPR 2021, arXiv, 231 citations
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
ECCV 2020, arXiv, 121 citations
Spatially Aware Multimodal Transformers for TextVQA
ECCV 2020, arXiv, 95 citations
Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
CVPR 2021, arXiv, 74 citations
Human-Adversarial Visual Question Answering
NeurIPS 2021, arXiv, 69 citations
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
ECCV 2020, arXiv, 62 citations
Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation
ICCV 2023, arXiv, 61 citations
Episodic Memory Question Answering
CVPR 2022, arXiv, 36 citations
Video Editing via Factorized Diffusion Distillation
ECCV 2024, arXiv, 29 citations
MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration
ECCV 2022, arXiv, 27 citations
Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data
NeurIPS 2020, arXiv, 15 citations
SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions
CVPR 2020, arXiv, 14 citations
Contrast and Classify: Training Robust VQA Models
ICCV 2021, arXiv, 5 citations