Kevin Qinghong Lin
16 papers | 1,681 total citations
Papers (16)
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | ICLR 2025 | arXiv | 483 citations
Egocentric Video-Language Pretraining | NeurIPS 2022 | arXiv | 254 citations
All in One: Exploring Unified Video-Language Pre-Training | CVPR 2023 | arXiv | 239 citations
UniVTG: Towards Unified Video-Language Temporal Grounding | ICCV 2023 | arXiv | 195 citations
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone | ICCV 2023 | arXiv | 138 citations
ShowUI: One Vision-Language-Action Model for GUI Visual Agent | CVPR 2025 | arXiv | 131 citations
VideoLLM-online: Online Video Large Language Model for Streaming Video | CVPR 2024 | arXiv | 116 citations
Affordance Grounding From Demonstration Video To Target Image | CVPR 2023 | arXiv | 45 citations
Too Large; Data Reduction for Vision-Language Pre-Training | ICCV 2023 | arXiv | 32 citations
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation | CVPR 2025 | arXiv | 14 citations
Learning Video Context as Interleaved Multimodal Sequences | ECCV 2024 | arXiv | 12 citations
Learning Visual Prior via Generative Pre-Training | NeurIPS 2023 | arXiv | 9 citations
ROICtrl: Boosting Instance Control for Visual Generation | CVPR 2025 | arXiv | 7 citations
VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting | AAAI 2025 | arXiv | 3 citations
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary | CVPR 2025 | arXiv | 3 citations
Bootstrapping SparseFormers from Vision Foundation Models | CVPR 2024 | arXiv | 0 citations