Yiyi Zhou
22
papers
1,111
total citations
papers (22)
Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
CVPR 2020
arXiv
352
citations
SeqTR: A Simple Yet Universal Network for Visual Grounding
ECCV 2022
arXiv
212
citations
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models
NeurIPS 2023
arXiv
134
citations
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
ICLR 2025
arXiv
102
citations
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
NeurIPS 2022
arXiv
89
citations
Active Teacher for Semi-Supervised Object Detection
CVPR 2022
arXiv
77
citations
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models
AAAI 2025
arXiv
64
citations
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks
AAAI 2024
arXiv
20
citations
What Kind of Visual Tokens Do We Need? Training-Free Visual Token Pruning for Multi-Modal Large Language Models from the Perspective of Graph
AAAI 2025
arXiv
20
citations
Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
NeurIPS 2023
arXiv
11
citations
Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization
ICML 2024
arXiv
7
citations
Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models
ICLR 2025
7
citations
PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation
ECCV 2022
arXiv
6
citations
Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings
NeurIPS 2025
arXiv
5
citations
FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression
CVPR 2025
arXiv
4
citations
SVFR: A Unified Framework for Generalized Video Face Restoration
CVPR 2025
arXiv
1
citations
DIFNet: Boosting Visual Information Flow for Image Captioning
CVPR 2022
0
citations
TRAR: Routing the Attention Spans in Transformer for Visual Question Answering
ICCV 2021
0
citations
RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension
CVPR 2023
0
citations
RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension
CVPR 2023
0
citations
RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words
CVPR 2021
0
citations
DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension
CVPR 2025
0
citations