"object detection" Papers

132 papers found • Page 1 of 3

Adversarial Robustness of Discriminative Self-Supervised Learning in Vision

Ömer Veysel Çağatan, Ömer TAL, M. Emre Gursoy

ICCV 2025arXiv:2503.06361

AngleRoCL: Angle-Robust Concept Learning for Physically View-Invariant Adversarial Patches

Wenjun Ji, Yuxiang Fu, Luyang Ying et al.

NEURIPS 2025

Assessing Pre-Trained Models for Transfer Learning Through Distribution of Spectral Components

Tengxue Zhang, Yang Shu, Xinyang Chen et al.

AAAI 2025paperarXiv:2412.19085
7
citations

Availability-aware Sensor Fusion via Unified Canonical Space

Dong-Hee Paek, SEUNG-HYUN KONG

NEURIPS 2025arXiv:2503.07029
2
citations

Boosting Domain Generalized and Adaptive Detection with Diffusion Models: Fitness, Generalization, and Transferability

Boyong He, Yuxiang Ji, Zhuoyue Tan et al.

ICCV 2025highlightarXiv:2506.21042
1
citations

Correlated Low-Rank Adaptation for ConvNets

Wu Ran, Weijia Zhang, ShuYang Pang et al.

NEURIPS 2025

COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts

Jiansheng Li, Xingxuan Zhang, Hao Zou et al.

CVPR 2025highlightarXiv:2504.10158
1
citations

DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

Junjie Wang, BIN CHEN, Yulin Li et al.

CVPR 2025arXiv:2505.04410
9
citations

Delving into Cascaded Instability: A Lipschitz Continuity View on Image Restoration and Object Detection Synergy

Qing Zhao, Weijian Deng, Pengxu Wei et al.

NEURIPS 2025arXiv:2510.24232

Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle

Miroslav Purkrabek, Jiri Matas

ICCV 2025arXiv:2412.01562
3
citations

DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding

Zixuan Liu, Siavash H. Khajavi, Guangkai Jiang

NEURIPS 2025arXiv:2511.02495

DuSA: Fast and Accurate Dual-Stage Sparse Attention Mechanism Accelerating Both Training and Inference

Chong Wu, Jiawang Cao, Renjie Xu et al.

NEURIPS 2025

EA-KD: Entropy-based Adaptive Knowledge Distillation

Chi-Ping Su, Ching-Hsun Tseng, Bin Pu et al.

ICCV 2025arXiv:2311.13621
3
citations

End-to-End Low-Light Enhancement for Object Detection with Learned Metadata from RAWs

Xuelin Shen, Haifeng Jiao, Yitong Wang et al.

NEURIPS 2025

Enhancing Transformers Through Conditioned Embedded Tokens

Hemanth Saratchandran, Simon Lucey

ICCV 2025arXiv:2505.12789
2
citations

EvRT-DETR: Latent Space Adaptation of Image Detectors for Event-based Vision

Dmitrii Torbunov, Yihui Ren, Animesh Ghose et al.

ICCV 2025arXiv:2412.02890
1
citations

FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies

Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.

NEURIPS 2025oralarXiv:2412.06708
5
citations

FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network

Fangtong Sun, Congyu Li, Ke Yang et al.

NEURIPS 2025arXiv:2510.23444

Frequency Dynamic Convolution for Dense Image Prediction

Linwei Chen, Lin Gu, Liang Li et al.

CVPR 2025arXiv:2503.18783
27
citations

Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection

Boyong He, Yuxiang Ji, Qianwen Ye et al.

CVPR 2025arXiv:2503.02101
5
citations

GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers

Guang Liang, Xinyao Liu, Jianxin Wu

NEURIPS 2025arXiv:2506.11784
4
citations

Harnessing Massive Satellite Imagery with Efficient Masked Image Modeling

Fengxiang Wang, Hongzhen Wang, Di Wang et al.

ICCV 2025arXiv:2406.11933
11
citations

HumorDB: Can AI understand graphical humor?

Vedaant V Jain, Gabriel Kreiman, Felipe Feitosa

ICCV 2025arXiv:2406.13564
1
citations

IFORMER: INTEGRATING CONVNET AND TRANSFORMER FOR MOBILE APPLICATION

Chuanyang Zheng

ICLR 2025arXiv:2501.15369
5
citations

LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal

Shr-Ruei Tsai, Wei-Cheng Chang, Jie-Ying Lee et al.

ICCV 2025arXiv:2510.15868
5
citations

MambaOut: Do We Really Need Mamba for Vision?

Weihao Yu, Xinchao Wang

CVPR 2025arXiv:2405.07992
193
citations

MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism

Zhixiong Nan, Xianghong Li, Tao Xiang et al.

CVPR 2025arXiv:2503.01463
7
citations

MobileODE: An Extra Lightweight Network

Le Yu, Jun Wu, Bo Gou et al.

NEURIPS 2025

Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration

Hongkang Zhang, Shao-Lun Huang, Ercan KURUOGLU et al.

NEURIPS 2025

Multiple Object Tracking as ID Prediction

Ruopeng Gao, Ji Qi, Limin Wang

CVPR 2025arXiv:2403.16848
55
citations

NumbOD: A Spatial-Frequency Fusion Attack Against Object Detectors

Ziqi Zhou, Bowen Li, Yufei Song et al.

AAAI 2025paperarXiv:2412.16955
15
citations

OW-OVD: Unified Open World and Open Vocabulary Object Detection

Xing Xi, Yangyang Huang, Ronghua Luo et al.

CVPR 2025
3
citations

PBCAT: Patch-Based Composite Adversarial Training against Physically Realizable Attacks on Object Detection

Xiao Li, Yiming Zhu, Yifan Huang et al.

ICCV 2025arXiv:2506.23581
3
citations

PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection

Xiaoran Xu, Jiangang Yang, Wenhui Shi et al.

AAAI 2025paperarXiv:2412.11807
6
citations

Polyline Path Masked Attention for Vision Transformer

Zhongchen Zhao, Chaodong Xiao, Hui LIN et al.

NEURIPS 2025spotlightarXiv:2506.15940

R2Det: Exploring Relaxed Rotation Equivariance in 2D Object Detection

Zhiqiang Wu, Yingjie Liu, Hanlin Dong et al.

ICLR 2025arXiv:2408.11760
1
citations

Real-Time Scene-Adaptive Tone Mapping for High-Dynamic Range Object Detection

Gongzhe Li, Linwei Qiu, Peibei Cao et al.

NEURIPS 2025

ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection

Haowei Zhu, Tianxiang Pan, Rui Qin et al.

NEURIPS 2025spotlightarXiv:2510.15783
1
citations

Rectifying Magnitude Neglect in Linear Attention

Qihang Fan, Huaibo Huang, Yuang Ai et al.

ICCV 2025highlightarXiv:2507.00698
11
citations

ReRAW: RGB-to-RAW Image Reconstruction via Stratified Sampling for Efficient Object Detection on the Edge

Radu Berdan, Beril Besbinar, Christoph Reinders et al.

CVPR 2025arXiv:2503.03782
4
citations

R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception

Jonas Mirlach, Lei Wan, Andreas Wiedholz et al.

ICCV 2025arXiv:2503.17122
6
citations

ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models

Heng Yin, Yuqiang Ren, Ke Yan et al.

CVPR 2025
8
citations

Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

ICCV 2025arXiv:2405.13337
3
citations

SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning

Seokju Yun, Seunghye Chae, Dongheon Lee et al.

CVPR 2025highlightarXiv:2412.04077
10
citations

SpikePack: Enhanced Information Flow in Spiking Neural Networks with High Hardware Compatibility

Guobin Shen, Jindong Li, Tenglong Li et al.

ICCV 2025arXiv:2501.14484
1
citations

Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection

Zihao Zhang, Aming Wu, Yahong Han

CVPR 2025highlightarXiv:2503.09968
1
citations

TARS: Traffic-Aware Radar Scene Flow Estimation

Jialong Wu, Marco Braun, Dominic Spata et al.

ICCV 2025arXiv:2503.10210

Task-driven Image Fusion with Learnable Fusion Loss

Haowen Bai, Jiangshe Zhang, Zixiang Zhao et al.

CVPR 2025highlightarXiv:2412.03240
21
citations

Task-Specific Zero-shot Quantization-Aware Training for Object Detection

Changhao Li, Xinrui Chen, Ji Wang et al.

ICCV 2025arXiv:2507.16782
1
citations

TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras

Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski

ICCV 2025arXiv:2508.00913
PreviousNext