Di Hu
18 papers, 1,354 total citations

Papers (18)
Balanced Multimodal Learning via On-the-Fly Gradient Modulation (CVPR 2022), 350 citations
Learning To Answer Questions in Dynamic Audio-Visual Scenarios (CVPR 2022), 221 citations
Multiple Sound Sources Localization from Coarse to Fine (ECCV 2020), 179 citations
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching (NeurIPS 2020), 149 citations
Unsupervised Multi-Source Domain Adaptation for Person Re-Identification (CVPR 2021), 93 citations
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation (CVPR 2021), 92 citations
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance (ICML 2024), 64 citations
Enhancing Multimodal Cooperation via Sample-level Modality Valuation (CVPR 2024), 52 citations
Diagnosing and Re-learning for Balanced Multimodal Learning (ECCV 2024), 38 citations
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer (AAAI 2024), 38 citations
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes (ECCV 2024), 25 citations
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation (CVPR 2025), 12 citations
Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction (CVPR 2025), 11 citations
Towards Inadequately Pre-trained Models in Transfer Learning (ICCV 2023), 9 citations
Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition (CVPR 2025), 8 citations
Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception (CVPR 2025), 6 citations
Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition (ECCV 2020), 6 citations
MokA: Multimodal Low-Rank Adaptation for MLLMs (NeurIPS 2025), 1 citation