Wenwei Zhang
24 papers, 2,665 total citations
Papers (24)
K-Net: Towards Unified Image Segmentation
NeurIPS 2021
arXiv
448
citations
Seesaw Loss for Long-Tailed Instance Segmentation
CVPR 2021
arXiv
274
citations
Dense Distinct Query for End-to-End Object Detection
CVPR 2023
arXiv
223
citations
Aligning Bag of Regions for Open-Vocabulary Object Detection
CVPR 2023
arXiv
156
citations
Side-Aware Boundary Localization for More Precise Object Detection
ECCV 2020
arXiv
153
citations
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models
NeurIPS 2023
arXiv
140
citations
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
ICCV 2023
arXiv
134
citations
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2024
arXiv
130
citations
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities
ICCV 2025
127
citations
EcoNAS: Finding Proxies for Economical Neural Architecture Search
CVPR 2020
arXiv
125
citations
Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
CVPR 2022
arXiv
111
citations
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
ICLR 2024
arXiv
110
citations
OMG-Seg: Is One Model Good Enough For All Segmentation?
CVPR 2024
arXiv
108
citations
Unified Human-Scene Interaction via Prompted Chain-of-Contacts
ICLR 2024
arXiv
101
citations
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
ICCV 2025
arXiv
74
citations
Can AI Assistants Know What They Don't Know?
ICML 2024
arXiv
43
citations
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
CVPR 2023
arXiv
37
citations
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
ICCV 2025
arXiv
37
citations
OV-PARTS: Towards Open-Vocabulary Part Segmentation
NeurIPS 2023
arXiv
36
citations
Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation
ICCV 2023
arXiv
28
citations
CLIM: Contrastive Language-Image Mosaic for Region Representation
AAAI 2024
arXiv
25
citations
F-LMM: Grounding Frozen Large Multimodal Models
CVPR 2025
arXiv
22
citations
Dense Siamese Network for Dense Unsupervised Learning
ECCV 2022
arXiv
16
citations
Rethinking Verification for LLM Code Generation: From Generation to Testing
NeurIPS 2025
arXiv
7
citations