"data filtering" Papers
9 papers found
D3: A Dataset for Training Code LMs to Act Diff-by-Diff
Ulyana Piterbarg, Kanishk Gandhi, Lerrel Pinto et al.
COLM 2025
Distillation Robustifies Unlearning
Bruce W. Lee, Addie Foote, Alex Infanger et al.
NEURIPS 2025 (spotlight) | arXiv:2506.06278 | 6 citations
Joint Out-of-Distribution Filtering and Data Discovery Active Learning
Sebastian Schmidt, Leonard Schenk, Leo Schwinn et al.
CVPR 2025 | arXiv:2503.02491 | 8 citations
Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models
Thao Nguyen, Yang Li, Olga Golovneva et al.
COLM 2025 | arXiv:2506.04689 | 13 citations
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams
Chris Dongjoo Kim, Jihwan Moon, Sangwoo Moon et al.
CVPR 2025 | arXiv:2504.14875 | 1 citation
Understanding the Gain from Data Filtering in Multimodal Contrastive Learning
Divyansh Pareek, Sewoong Oh, Simon Du
NEURIPS 2025 | arXiv:2512.14230
Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning
Xiaoyu Wen, Chenjia Bai, Kang Xu et al.
ICML 2024 | arXiv:2405.06192 | 10 citations
DUEL: Duplicate Elimination on Active Memory for Self-Supervised Class-Imbalanced Learning
Won-Seok Choi, Hyundo Lee, Dong-Sig Han et al.
AAAI 2024 | arXiv:2402.08963 | 4 citations
Learning and Forgetting Unsafe Examples in Large Language Models
Jiachen Zhao, Zhun Deng, David Madras et al.
ICML 2024 (oral) | arXiv:2312.12736 | 25 citations