Can Private Machine Learning Be Fair?

1 citation
#1733 of 3028 papers in AAAI 2025
2 Top Authors
2 Data Points

Abstract

We show that current state-of-the-art (SOTA) methods for privately and fairly training models are unreliable in many practical scenarios. Specifically, we (1) introduce a new type of adversarial attack that seeks to introduce unfairness into private model training, and (2) demonstrate that training methods for private data that are robust to adversarial attacks often produce unfair models, regardless of whether fairness-enhancing training methods are also used. This creates a dilemma when attempting to train fair models on private data: either (A) we use a robust training method, which may itself introduce unfairness into the model, or (B) we train models that are vulnerable to adversarial attacks that introduce unfairness. This paper highlights flaws in robust learning methods when training fair models, yielding a new perspective on the design of robust and private learning systems.

Citation History

Jan 27, 2026: 1
Feb 13, 2026: 1