Kevin Lin
2
Affiliations
Microsoft
University of Washington
18
papers
3,646
total citations
papers (18)
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
ICML 2024
arXiv
1,066
citations
End-to-End Human Pose and Mesh Reconstruction with Transformers
CVPR 2021
arXiv
737
citations
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
ICLR 2024
arXiv
422
citations
Mesh Graphormer
ICCV 2021
arXiv
383
citations
SwinBERT: End-to-End Transformers With Sparse Attention for Video Captioning
CVPR 2022
arXiv
309
citations
ReCo: Region-Controlled Text-to-Image Generation
CVPR 2023
arXiv
194
citations
DisCo: Disentangled Control for Realistic Human Dance Generation
CVPR 2024
arXiv
139
citations
LAVENDER: Unifying Video-Language Understanding As Masked Language Modeling
CVPR 2023
arXiv
94
citations
An Empirical Study of End-to-End Video-Language Transformers With Masked Visual Modeling
CVPR 2023
arXiv
83
citations
Equivariant Similarity for Vision-Language Foundation Models
ICCV 2023
arXiv
63
citations
MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning
CVPR 2024
arXiv
50
citations
Cross-Modal Representation Learning for Zero-Shot Action Recognition
CVPR 2022
arXiv
31
citations
ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning
ICCV 2025
arXiv
22
citations
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension
ICCV 2025
arXiv
19
citations
Adaptive Human Matting for Dynamic Videos
CVPR 2023
arXiv
13
citations
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation
CVPR 2025
arXiv
10
citations
Neural Voting Field for Camera-Space 3D Hand Pose Estimation
CVPR 2023
arXiv
7
citations
LiVOS: Light Video Object Segmentation with Gated Linear Matching
CVPR 2025
arXiv
4
citations