David Harwath
9 papers, 424 total citations
Papers (9)
Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval
CVPR 2022, arXiv, 157 citations

Multimodal Clustering Networks for Self-Supervised Learning From Unlabeled Videos
ICCV 2021, arXiv, 97 citations

Spoken Moments: Learning Joint Audio-Visual Representations From Video Descriptions
CVPR 2021, arXiv, 68 citations

BAT: Learning to Reason about Spatial Sounds with Large Language Models
ICML 2024, arXiv, 40 citations

SyllableLM: Learning Coarse Semantic Units for Speech Language Models
ICLR 2025, arXiv, 23 citations

Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
ECCV 2024, arXiv, 21 citations

SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos
CVPR 2024, arXiv, 12 citations

VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models
ICCV 2025, arXiv, 6 citations

Rhapsody: A Dataset for Highlight Detection in Podcasts
COLM 2025, arXiv, 0 citations