"action recognition" Papers

25 papers found

ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition

Joseph Fioresi, Ishan Rajendrakumar Dave, Mubarak Shah

ICLR 2025 · arXiv:2502.00156 · 3 citations

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception

Sanjoy Chowdhury, Subrata Biswas, Sayan Nag et al.

ICCV 2025 · arXiv:2506.21080 · 2 citations

EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding

Ege Özsoy, Arda Mamur, Felix Tristram et al.

NeurIPS 2025 · arXiv:2505.24287 · 5 citations

Enhancing Temporal Understanding in Video-LLMs through Stacked Temporal Attention in Vision Encoders

Ali Rasekh, Erfan Soula, Omid Daliran et al.

NeurIPS 2025 (oral) · arXiv:2510.26027 · 1 citation

From Image to Video: An Empirical Study of Diffusion Representations

Pedro Vélez, Luisa Polania Cabrera, Yi Yang et al.

ICCV 2025 (highlight) · arXiv:2502.07001 · 4 citations

H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Zhanbo Huang, Xiaoming Liu, Yu Kong

CVPR 2025 (highlight) · arXiv:2504.10676 · 4 citations

Kronecker Mask and Interpretive Prompts are Language-Action Video Learners

Jingyi Yang, Zitong Yu, Nixiuming et al.

ICLR 2025 (oral) · arXiv:2502.03549 · 3 citations

MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval

Huaying Yuan, Jian Ni, Zheng Liu et al.

NeurIPS 2025 · arXiv:2502.12558 · 3 citations

OSKAR: Omnimodal Self-supervised Knowledge Abstraction and Representation

Mohamed Abdelfattah, Kaouther Messaoud, Alexandre Alahi

NeurIPS 2025

PASS: Path-selective State Space Model for Event-based Recognition

Jiazhou Zhou, Kanghao Chen, Lei Zhang et al.

NeurIPS 2025 (oral) · arXiv:2409.16953 · 1 citation

Sound Bridge: Associating Egocentric and Exocentric Videos via Audio Cues

Sihong Huang, Jiaxin Wu, Xiaoyong Wei et al.

CVPR 2025 · 2 citations

TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition

Yilong Wang, Zilin Gao, Qilong Wang et al.

CVPR 2025 · arXiv:2411.19041 · 3 citations

VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention

Jiangning Wei, Lixiong Qin, Bo Yu et al.

AAAI 2025 · arXiv:2503.11004 · 5 citations

DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

Qi Wang, Zhou Xu, Yuming Lin et al.

ECCV 2024 · arXiv:2407.05106 · 16 citations

Data Collection-free Masked Video Modeling

Yuchi Ishikawa, Masayoshi Kondo, Yoshimitsu Aoki

ECCV 2024 · arXiv:2409.06665 · 1 citation

Disentangled Pre-training for Human-Object Interaction Detection

Zhuolong Li, Xingao Li, Changxing Ding et al.

CVPR 2024 · arXiv:2404.01725 · 11 citations

Generative Model-Based Feature Knowledge Distillation for Action Recognition

Guiqin Wang, Peng Zhao, Yanjiang Shi et al.

AAAI 2024 · arXiv:2312.08644 · 7 citations

Koala: Key Frame-Conditioned Long Video-LLM

Reuben Tan, Ximeng Sun, Ping Hu et al.

CVPR 2024 (highlight) · arXiv:2404.04346 · 64 citations

Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data

Shufan Li, Aditya Grover, Harkanwar Singh

ECCV 2024 · arXiv:2402.05892 · 106 citations

Nymeria: A Massive Collection of Egocentric Multi-modal Human Motion in the Wild

Lingni Ma, Yuting Ye, Rowan Postyeni et al.

ECCV 2024

Rethinking Image-to-Video Adaptation: An Object-centric Perspective

Rui Qian, Shuangrui Ding, Dahua Lin

ECCV 2024 · arXiv:2407.06871 · 8 citations

Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics

Xueyuan Yang, Chao Yao, Xiaojuan Ban

AAAI 2024 · arXiv:2401.05412 · 4 citations

Taylor Videos for Action Recognition

Lei Wang, Xiuyuan Yuan, Tom Gedeon et al.

ICML 2024 (oral) · arXiv:2402.03019 · 13 citations

Text-Guided Video Masked Autoencoder

David Fan, Jue Wang, Shuai Liao et al.

ECCV 2024 · arXiv:2408.00759 · 7 citations

X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer

Linglin Jing, Ying Xue, Xu Yan et al.

AAAI 2024 · arXiv:2312.07378 · 11 citations