"model unlearning" Papers

9 papers found

Concept Bottleneck Large Language Models

Chung-En Sun, Tuomas Oikarinen, Berk Ustun et al.

ICLR 2025, arXiv:2412.07992

ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning

Ruchika Chavhan, Da Li, Timothy Hospedales

ICLR 2025, arXiv:2405.19237

Distillation Robustifies Unlearning

Bruce W. Lee, Addie Foote, Alex Infanger et al.

NEURIPS 2025, spotlight, arXiv:2506.06278

Explainable Reinforcement Learning from Human Feedback to Improve Alignment

Shicheng Liu, Siyuan Xu, Wenjie Qiu et al.

NEURIPS 2025, arXiv:2512.13837

Exploring and Leveraging Class Vectors for Classifier Editing

Jaeik Kim, Jaeyoung Do

NEURIPS 2025, arXiv:2510.11268

On Effects of Steering Latent Representation for Large Language Model Unlearning

Huu-Tien Dang, Tin Pham, Hoang Thanh-Tung et al.

AAAI 2025, paper, arXiv:2408.06223

AND: Audio Network Dissection for Interpreting Deep Acoustic Models

Tung-Yu Wu, Yu-Xiang Lin, Lily Weng

ICML 2024, arXiv:2406.16990

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models

Samuele Poppi, Tobia Poppi, Federico Cocchi et al.

ECCV 2024, arXiv:2311.16254

The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning

Nathaniel Li, Alexander Pan, Anjali Gopal et al.

ICML 2024, arXiv:2403.03218