Xu Tan

Affiliations

Microsoft Research Asia

3,792 total citations

papers (18)

MPNet: Masked and Permuted Pre-training for Language Understanding

NEURIPS 2020, arXiv

1,506 citations

HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face

NEURIPS 2023, arXiv

1,267 citations

NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

ICML 2024, arXiv

306 citations

Semi-Supervised Neural Architecture Search

NEURIPS 2020, arXiv

AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models

NEURIPS 2023, arXiv

Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation

NEURIPS 2022, arXiv

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

NEURIPS 2022, arXiv

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

NEURIPS 2021, arXiv

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

AAAI 2025, arXiv

PromptTTS 2: Describing and Generating Voices with Text Prompt

ICLR 2024, arXiv

GAIA: Zero-shot Talking Avatar Generation

ICLR 2024, arXiv

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling

CVPR 2025, arXiv

HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details

ICCV 2023, arXiv

InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

AAAI 2025, arXiv

Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling

NEURIPS 2022, arXiv

The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation

ICCV 2025, arXiv

UniAudio: Towards Universal Audio Generation with Large Language Models

ICML 2024

Speech-T: Transducer for Text to Speech and Beyond

NEURIPS 2021