"benchmark dataset" Papers

25 papers found

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Junli Liu, Qizhi Chen, Zhigang Wang et al.

ICCV 2025, arXiv:2504.07836
6 citations

AIComposer: Any Style and Content Image Composition via Feature Integration

Haowen Li, Zhenfeng Fan, Zhang Wen et al.

ICCV 2025, arXiv:2507.20721
1 citation

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives

Shaoyuan Xie, Lingdong Kong, Yuhao Dong et al.

ICCV 2025, arXiv:2501.04003
71 citations

BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics

Lukas Rauch, Raphael Schwinger, Moritz Wirth et al.

ICLR 2025, arXiv:2403.10380
18 citations

Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?

Egor Zverev, Sahar Abdelnabi, Soroush Tabesh et al.

ICLR 2025, arXiv:2403.06833
45 citations

ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models

Veeramakali Vignesh Manivannan, Yasaman Jafari, Srikar Eranky et al.

ICLR 2025, arXiv:2410.16701
3 citations

COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts

Jiansheng Li, Xingxuan Zhang, Hao Zou et al.

CVPR 2025 (highlight), arXiv:2504.10158
1 citation

CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset

Xiao Wang, Fuling Wang, Yuehang Li et al.

CVPR 2025, arXiv:2410.00379
19 citations

DataSIR: A Benchmark Dataset for Sensitive Information Recognition

Fan Mo, Bo Liu, Yuan Fan et al.

NeurIPS 2025

DISTA-Net: Dynamic Closely-Spaced Infrared Small Target Unmixing

Shengdong Han, Shangdong Yang, Yuxuan Li et al.

ICCV 2025, arXiv:2505.19148
1 citation

Do Large Language Models Truly Understand Geometric Structures?

Xiaofeng Wang, Yiming Wang, Wenhong Zhu et al.

ICLR 2025, arXiv:2501.13773
9 citations

From Imitation to Innovation: The Emergence of AI's Unique Artistic Styles and the Challenge of Copyright Protection

Zexi Jia, Chuanwei Huang, Hongyan Fei et al.

ICCV 2025, arXiv:2507.04769
3 citations

GSOT3D: Towards Generic 3D Single Object Tracking in the Wild

Yifan Jiao, Yunhao Li, Junhua Ding et al.

ICCV 2025, arXiv:2412.02129

Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline

Junlong Cheng, Bin Fu, Jin Ye et al.

CVPR 2025, arXiv:2411.12814
27 citations

LawShift: Benchmarking Legal Judgment Prediction Under Statute Shifts

Zhuo Han, Yi Yang, Yi Feng et al.

NeurIPS 2025

MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions

Jian Wu, Linyi Yang, Dongyuan Li et al.

ICLR 2025
23 citations

OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?

Junjielong Xu, Qinan Zhang, Zhiqing Zhong et al.

ICLR 2025
21 citations

RoomEditor: High-Fidelity Furniture Synthesis with Parameter-Sharing U-Net

Zhenyi Lin, Xiaofan Ming, Qilong Wang et al.

NeurIPS 2025

Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos

Mingfei Han, Linjie Yang, Xiaojun Chang et al.

ICLR 2025, arXiv:2312.10300
46 citations

SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios

Kai Li, Wendi Sang, Chang Zeng et al.

ICLR 2025, arXiv:2410.01481
8 citations

A New Benchmark and Model for Challenging Image Manipulation Detection

Zhenfei Zhang, Mingyang Li, Ming-Ching Chang

AAAI 2024, arXiv:2311.14218

DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

Qi Wang, Zhou Xu, Yuming Lin et al.

ECCV 2024, arXiv:2407.05106
14 citations

EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval

Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu et al.

ECCV 2024, arXiv:2407.16658
18 citations

HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors

Xiao Wang, Zongzhen Wu, Bo Jiang et al.

AAAI 2024, arXiv:2211.09648
69 citations

Towards More Practical Group Activity Detection: A New Benchmark and Model

Dongkeun Kim, Youngkil Song, Minsu Cho et al.

ECCV 2024, arXiv:2312.02878
10 citations