Yuankai Qi
24 papers, 1,387 total citations
papers (24)
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
CVPR 2020
arXiv
433
citations
Language and Visual Entity Relationship Graph for Agent Navigation
NeurIPS 2020
arXiv
149
citations
Object-and-Action Aware Model for Visual Language Navigation
ECCV 2020
arXiv
128
citations
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection
CVPR 2023
arXiv
102
citations
AerialVLN: Vision-and-Language Navigation for UAVs
ICCV 2023
arXiv
99
citations
HOP: History-and-Order Aware Pre-Training for Vision-and-Language Navigation
CVPR 2022
arXiv
93
citations
Hierarchical Modular Network for Video Captioning
CVPR 2022
arXiv
92
citations
The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation
ICCV 2021
arXiv
80
citations
March in Chat: Interactive Prompting for Remote Embodied Referring Expression
ICCV 2023
arXiv
47
citations
Learning To Dub Movies via Hierarchical Prosody Models
CVPR 2023
arXiv
40
citations
V2C: Visual Voice Cloning
CVPR 2022
arXiv
33
citations
Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework
CVPR 2024
arXiv
29
citations
Augmented Commonsense Knowledge for Remote Object Grounding
AAAI 2024
arXiv
19
citations
EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing
CVPR 2025
arXiv
16
citations
Generating Content for HDR Deghosting from Frequency View
CVPR 2024
arXiv
16
citations
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
CVPR 2025
arXiv
5
citations
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
ICCV 2025
arXiv
2
citations
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering
CVPR 2025
1
citations
Incomplete Multi-View Multi-Label Classification via Diffusion-Guided Redundancy Removal
AAAI 2025
1
citations
Medusa: A Multi-Scale High-order Contrastive Dual-Diffusion Approach for Multi-View Clustering
CVPR 2025
1
citations
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
CVPR 2025
arXiv
1
citations
Weakly Supervised Video Individual Counting
CVPR 2024
0
citations
VLN↻BERT: A Recurrent Vision-and-Language BERT for Navigation
CVPR 2021
0
citations
Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval
AAAI 2025
0
citations