Ziyu Guo
20 papers, 2,136 total citations
Papers (20)
PointCLIP: Point Cloud Understanding by CLIP
CVPR 2022
arXiv
587
citations
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
NEURIPS 2022
arXiv
355
citations
Personalize Segment Anything Model with One Shot
ICLR 2024
arXiv
301
citations
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning
ICCV 2023
arXiv
225
citations
MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection
ICCV 2023
arXiv
152
citations
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding
AAAI 2025
arXiv
116
citations
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
NEURIPS 2025
arXiv
100
citations
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
ICML 2025
arXiv
94
citations
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
AAAI 2024
arXiv
58
citations
UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens
NEURIPS 2025
arXiv
38
citations
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
NEURIPS 2025
arXiv
32
citations
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
CVPR 2024
arXiv
27
citations
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
NEURIPS 2025
arXiv
26
citations
Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
NEURIPS 2025
arXiv
14
citations
MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
ICLR 2025
7
citations
StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion
ICCV 2025
arXiv
3
citations
MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding
AAAI 2025
arXiv
1
citations
EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights
CVPR 2025
0
citations
Let's Verify and Reinforce Image Generation Step by Step
CVPR 2025
0
citations
Less is More: Improving Motion Diffusion Models with Sparse Keyframes
ICCV 2025
arXiv
0
citations