"large-scale datasets" Papers

19 papers found

A Decade's Battle on Dataset Bias: Are We There Yet?

Zhuang Liu, Kaiming He

ICLR 2025 | arXiv:2403.08632 | 53 citations

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Zhiqiang Shen, Ammar Sherif, Zeyuan Yin et al.

CVPR 2025 | arXiv:2411.19946 | 11 citations

DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation

Donglin Di, He Feng, Wenzhang Sun et al.

ICCV 2025 | arXiv:2410.07151 | 5 citations

Enhancing Noise-Robust Losses for Large-Scale Noisy Data Learning

Max Staats, Matthias Thamm, Bernd Rosenow

AAAI 2025 (paper) | arXiv:2306.05497 | 2 citations

New Algorithms for the Learning-Augmented k-means Problem

Junyu Huang, Qilong Feng, Ziyun Huang et al.

ICLR 2025 | 1 citation

OneBatchPAM: A Fast and Frugal K-Medoids Algorithm

Antoine de Mathelin, Nicolas Enrique Cecchi, François Deheeger et al.

AAAI 2025 (paper) | arXiv:2501.19285 | 2 citations

Scale Efficient Training for Large Datasets

Qing Zhou, Junyu Gao, Qi Wang

CVPR 2025 | arXiv:2503.13385 | 3 citations

Scaling Image Geo-Localization to Continent Level

Philipp Lindenberger, Paul-Edouard Sarlin, Jan Hosang et al.

NeurIPS 2025 | arXiv:2510.26795 | 2 citations

UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset

Chen Zhao, En Ci, Yunzhe Xu et al.

NeurIPS 2025 | arXiv:2510.20661 | 9 citations

Variational Bayesian Pseudo-Coreset

Hyungi Lee, Seungyoo Lee, Juho Lee

ICLR 2025 | arXiv:2502.21143

Data Filtering Networks

Alex Fang, Albin Madappally Jose, Amit Jain et al.

ICLR 2024 | arXiv:2309.17425 | 222 citations

Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation

Yue Xu, Yong-Lu Li, Kaitong Cui et al.

ECCV 2024 | arXiv:2305.18381 | 8 citations

Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching

Shitong Shao, Zeyuan Yin, Muxin Zhou et al.

CVPR 2024 (highlight) | arXiv:2311.17950 | 57 citations

Large-Scale Multi-Hypotheses Cell Tracking Using Ultrametric Contours Maps

Jordao Bragantini, Merlin Lange, Loïc A Royer

ECCV 2024 | arXiv:2308.04526 | 15 citations

On Pretraining Data Diversity for Self-Supervised Learning

Hasan Abed El Kader Hammoud, Tuhin Das, Fabio Pizzati et al.

ECCV 2024 | arXiv:2403.13808 | 9 citations

Retrieval Robust to Object Motion Blur

Rong Zou, Marc Pollefeys, Denys Rozumnyi

ECCV 2024 | arXiv:2404.18025

Scalable Multiple Kernel Clustering: Learning Clustering Structure from Expectation

Weixuan Liang, En Zhu, Shengju Yu et al.

ICML 2024

Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching

Ruonan Yu, Songhua Liu, Jingwen Ye et al.

ECCV 2024 | arXiv:2410.07579 | 13 citations

Wukong: Towards a Scaling Law for Large-Scale Recommendation

Buyun Zhang, Liang Luo, Yuxin Chen et al.

ICML 2024 | arXiv:2403.02545 | 78 citations