Rohit Girdhar
Affiliations
Facebook AI Research
20 papers
9,559 total citations
Papers (20)
Masked-Attention Mask Transformer for Universal Image Segmentation (CVPR 2022, arXiv), 3,418 citations
Ego4D: Around the World in 3,000 Hours of Egocentric Video (CVPR 2022, arXiv), 1,511 citations
ImageBind: One Embedding Space To Bind Them All (CVPR 2023, arXiv), 1,348 citations
Detecting Twenty-Thousand Classes Using Image-Level Supervision (ECCV 2022, arXiv), 768 citations
An End-to-End Transformer Model for 3D Object Detection (ICCV 2021, arXiv), 580 citations
Self-Supervised Pretraining of 3D Features on Any Point-Cloud (ICCV 2021, arXiv), 320 citations
Omnivore: A Single Model for Many Visual Modalities (CVPR 2022, arXiv), 287 citations
Anticipative Video Transformer (ICCV 2021, arXiv), 254 citations
Cut and Learn for Unsupervised Object Detection and Instance Segmentation (CVPR 2023, arXiv), 244 citations
Learning Video Representations From Large Language Models (CVPR 2023, arXiv), 234 citations
InstanceDiffusion: Instance-level Control for Image Generation (CVPR 2024, arXiv), 180 citations
OmniMAE: Single Model Masked Pretraining on Images and Videos (CVPR 2023, arXiv), 120 citations
The Effectiveness of MAE Pre-Pretraining for Billion-Scale Pretraining (ICCV 2023, arXiv), 86 citations
HierVL: Learning Hierarchical Video-Language Embeddings (CVPR 2023, arXiv), 74 citations
3D Spatial Recognition Without Spatially Labeled 3D (CVPR 2021, arXiv), 66 citations
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation (CVPR 2024, arXiv), 37 citations
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos (CVPR 2024, arXiv), 12 citations
MotiF: Making Text Count in Image Animation with Motion Focal Loss (CVPR 2025, arXiv), 7 citations
Generating Illustrated Instructions (CVPR 2024, arXiv), 7 citations
LLMs can see and hear without any training (ICML 2025, arXiv), 6 citations