Xiaohan Wang
26 papers
846 total citations
papers (26)
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
CVPR 2021
arXiv
215
citations
Bird's-Eye-View Scene Graph for Vision-Language Navigation
ICCV 2023
arXiv
88
citations
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition With Pre-Trained Vision-Language Models
CVPR 2023
arXiv
80
citations
Global-to-Local Modeling for Video-Based 3D Human Pose and Shape Estimation
CVPR 2023
arXiv
61
citations
LANA: A Language-Capable Navigator for Instruction Following and Generation
CVPR 2023
arXiv
57
citations
Apollo: An Exploration of Video Understanding in Large Multimodal Models
CVPR 2025
arXiv
55
citations
Describing Differences in Image Sets with Natural Language
CVPR 2024
arXiv
52
citations
Clustering based Point Cloud Representation Learning for 3D Analysis
ICCV 2023
arXiv
45
citations
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
AAAI 2024
arXiv
45
citations
Action Sensitivity Learning for Temporal Action Localization
ICCV 2023
arXiv
42
citations
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature
CVPR 2025
arXiv
26
citations
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation
CVPR 2025
arXiv
23
citations
Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration
ICCV 2025
arXiv
21
citations
JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery
ICCV 2023
arXiv
20
citations
PR-RRN: Pairwise-Regularized Residual-Recursive Networks for Non-Rigid Structure-From-Motion
ICCV 2021
arXiv
11
citations
Cross-Sentence Gloss Consistency for Continuous Sign Language Recognition
AAAI 2024
5
citations
CaMP: Causal Multi-policy Planning for Interactive Navigation in Multi-room Scenes
NeurIPS 2023
0
citations
A Category Agnostic Model for Visual Rearrangment
CVPR 2024
0
citations
MAAL: Multimodality-Aware Autoencoder-Based Affordance Learning for 3D Articulated Objects
ICCV 2023
0
citations
Interpretable3D: An Ad
AAAI 2024
0
citations
Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation
CVPR 2024
0
citations
An Interactive Navigation Method with Effect-oriented Affordance
CVPR 2024
0
citations
Interactive Prototype Learning for Egocentric Action Recognition
ICCV 2021
0
citations
Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark
CVPR 2022
0
citations
A Simple Episodic Linear Probe Improves Visual Recognition in the Wild
CVPR 2022
0
citations
Adversarially Masking Synthetic To Mimic Real: Adaptive Noise Injection for Point Cloud Segmentation Adaptation
CVPR 2023
0
citations