Yong Man Ro
Affiliations (1): KAIST
23 papers
509 total citations
papers (23)
Video Prediction Recalling Long-Term Motion Context via Memory Alignment Learning
CVPR 2021, arXiv, 125 citations

Lip to Speech Synthesis with Visual Context Attentional GAN
NeurIPS 2021, arXiv, 67 citations

Watch or Listen: Robust Audio-Visual Speech Recognition With Visual Corruption Modeling and Reliability Scoring
CVPR 2023, arXiv, 54 citations

Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck
NeurIPS 2021, arXiv, 52 citations

DiffV2S: Diffusion-Based Video-to-Speech Synthesis with Vision-Guided Speaker Embedding
ICCV 2023, arXiv, 32 citations

Speaker-Adaptive Lip Reading with User-Dependent Padding
ECCV 2022, arXiv, 29 citations

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
ICCV 2023, arXiv, 28 citations

Masking Adversarial Damage: Finding Adversarial Saliency for Robust and Sparse Network
CVPR 2022, arXiv, 25 citations

Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression
CVPR 2023, arXiv, 24 citations

AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
CVPR 2024, arXiv, 16 citations

Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection
CVPR 2024, arXiv, 16 citations

Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning
ICCV 2023, arXiv, 11 citations

VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
ECCV 2022, arXiv, 8 citations

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
CVPR 2025, arXiv, 7 citations

Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations
ICCV 2025, arXiv, 5 citations

Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language
AAAI 2025, arXiv, 5 citations

SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis
CVPR 2025, arXiv, 5 citations

Multi-Modality Associative Bridging Through Memory: Speech Sound Recollected From Face Video
ICCV 2021, 0 citations

Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment
ECCV 2022, 0 citations

Weakly Paired Associative Learning for Sound and Image Representations via Bimodal Associative Memory
CVPR 2022, 0 citations

Robust Small-Scale Pedestrian Detection With Cued Recall via Memory Learning
ICCV 2021, 0 citations

SACA Net: Cybersickness Assessment of Individual Viewers for VR Content via Graph-based Symptom Relation Embedding
ECCV 2020, 0 citations

Structure Boundary Preserving Segmentation for Medical Image With Ambiguous Boundary
CVPR 2020, 0 citations