Tae-Hyun Oh
25 papers, 885 total citations
Papers (25)
Listen to Look: Action Recognition by Previewing Audio (CVPR 2020, arXiv), 285 citations
Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers (ECCV 2022, arXiv), 147 citations
Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering (CVPR 2024, arXiv), 78 citations
Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment (CVPR 2023, arXiv), 55 citations
CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes (ECCV 2022, arXiv), 54 citations
HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields (ECCV 2022, arXiv), 48 citations
Scratching Visual Transformer's Back with Uniform Attention (ICCV 2023, arXiv), 37 citations
Monocular Reconstruction of Neural Face Reflectance Fields (CVPR 2021, arXiv), 32 citations
Sound Source Localization is All about Cross-Modal Alignment (ICCV 2023, arXiv), 31 citations
Noise Map Guidance: Inversion with Spatial Context for Real Image Editing (ICLR 2024, arXiv), 25 citations
Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration (CVPR 2025, arXiv), 24 citations
TextManiA: Enriching Visual Feature by Text-driven Manifold Augmentation (ICCV 2023, arXiv), 12 citations
BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models (ECCV 2024, arXiv), 10 citations
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding (ICCV 2025, arXiv), 7 citations
FPRF: Feed-Forward Photorealistic Style Transfer of Large-Scale 3D Neural Radiance Fields (AAAI 2024, arXiv), 7 citations
Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics (CVPR 2025, arXiv), 7 citations
VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models (ICCV 2025, arXiv), 6 citations
Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild (CVPR 2025, arXiv), 6 citations
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers (ICCV 2025, arXiv), 4 citations
SoundBrush: Sound as a Brush for Visual Scene Editing (AAAI 2025, arXiv), 3 citations
Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior (AAAI 2025, arXiv), 3 citations
VSC: Visual Search Compositional Text-to-Image Diffusion Model (ICCV 2025, arXiv), 2 citations
Learning-based Axial Video Motion Magnification (ECCV 2024, arXiv), 2 citations
CDS: Cross-Domain Self-Supervised Pre-Training (ICCV 2021), 0 citations
Distilling Global and Local Logits With Densely Connected Relations (ICCV 2021), 0 citations