Yukang Chen
25
papers
2,911
total citations
papers (25)
LISA: Reasoning Segmentation via Large Language Model
CVPR 2024
arXiv
742
citations
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
CVPR 2023
arXiv
388
citations
Focal Sparse Convolutional Networks for 3D Object Detection
CVPR 2022
arXiv
280
citations
Spherical Transformer for LiDAR-Based 3D Recognition
CVPR 2023
arXiv
208
citations
Learning Dynamic Routing for Semantic Segmentation
CVPR 2020
arXiv
184
citations
NVILA: Efficient Frontier Visual Language Models
CVPR 2025
arXiv
157
citations
LargeKernel3D: Scaling Up Kernels in 3D Sparse CNNs
CVPR 2023
arXiv
125
citations
VisionZip: Longer is Better but Not Necessary in Vision Language Models
CVPR 2025
arXiv
123
citations
Voxel Field Fusion for 3D Object Detection
CVPR 2022
arXiv
103
citations
FocalFormer3D: Focusing on Hard Instance for 3D Object Detection
ICCV 2023
arXiv
92
citations
Multi-Scale Aligned Distillation for Low-Resolution Detection
CVPR 2021
arXiv
65
citations
Data Pruning via Moving-one-Sample-out
NeurIPS 2023
arXiv
64
citations
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
CVPR 2024
arXiv
61
citations
Scale-Aware Automatic Augmentation for Object Detection
CVPR 2021
arXiv
58
citations
IST-Net: Prior-Free Category-Level Pose Estimation with Implicit Space Transformation
ICCV 2023
arXiv
54
citations
Mask-Attention-Free Transformer for 3D Instance Segmentation
ICCV 2023
arXiv
48
citations
Spatial Pruned Sparse Convolution for Efficient 3D Object Detection
NeurIPS 2022
arXiv
47
citations
WorldModelBench: Judging Video Generation Models As World Models
NeurIPS 2025
arXiv
37
citations
Denoising Diffusion Step-aware Models
ICLR 2024
arXiv
25
citations
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
NeurIPS 2025
arXiv
20
citations
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition
ICCV 2025
arXiv
19
citations
SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training
CVPR 2024
10
citations
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference
ICCV 2025
arXiv
1
citation
Mixture-of-Scores: Robust Image-Text Data Valuation via Three Lines of Code
ICCV 2025
0
citations
Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs
CVPR 2024
0
citations