There’s a version of AI fairness that makes for good press releases: your model has demographic parity. Except there are three standard fairness definitions, and it’s been mathematically proven that you cannot satisfy all three simultaneously in general. You have to choose, and that choice is a policy decision, not a technical one.

Three Definitions, One Impossibility

Demographic parity (statistical parity): the positive prediction rate is equal across groups. P(Ŷ=1 | A=0) = P(Ŷ=1 | A=1).
Equal opportunity (Hardt et al., 2016): equal true positive rates across groups. P(Ŷ=1 | Y=1, A=0) = P(Ŷ=1 | Y=1, A=1). Among people whose true outcome is positive, the chance of a positive prediction is the same regardless of group.
Calibration: scores mean the same thing across groups. P(Y=1 | Ŷ=s, A=0) = P(Y=1 | Ŷ=s, A=1). A risk score of 70% corresponds to 70% actual risk, for both groups.
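These three definitions are straightforward to check empirically. A minimal sketch on made-up data (every value below is illustrative, not from any real system; for calibration, the binary-prediction analogue is predictive parity, P(Y=1 | Ŷ=1, A=g)):

```python
# yhat = binary predictions, y = true outcomes, a = group membership.
# All data here is synthetic and illustrative.

def rate(pred, mask):
    """Fraction of 1s in pred, restricted to rows where mask is True."""
    sel = [p for p, m in zip(pred, mask) if m]
    return sum(sel) / len(sel)

y    = [1, 0, 1, 1, 0, 0, 1, 0]
yhat = [1, 0, 1, 0, 1, 0, 1, 0]
a    = [0, 0, 0, 0, 1, 1, 1, 1]

# Demographic parity: P(Yhat=1 | A=g)
dp = {g: rate(yhat, [ai == g for ai in a]) for g in (0, 1)}

# Equal opportunity: P(Yhat=1 | Y=1, A=g), the per-group true positive rate
eo = {g: rate(yhat, [ai == g and yi == 1 for ai, yi in zip(a, y)])
      for g in (0, 1)}

# Predictive parity (binary analogue of calibration): P(Y=1 | Yhat=1, A=g)
ppv = {g: rate(y, [ai == g and pi == 1 for ai, pi in zip(a, yhat)])
       for g in (0, 1)}

print(dp, eo, ppv)
```

On this toy data the groups match on demographic parity but diverge on the other two, which is exactly the kind of disagreement the next section is about.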

Chouldechova (2017) and Kleinberg et al. (2016) proved that when base rates differ across groups, no classifier can satisfy all three definitions at once (outside degenerate cases such as a perfect predictor). This is mathematics, not a failure of effort.
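Chouldechova's paper makes the tension concrete with an identity relating the false positive rate to the base rate p: FPR = p/(1−p) · (1−PPV)/PPV · TPR. Holding calibration (PPV) and equal opportunity (TPR) fixed across groups, the FPR must move with the base rate. A quick numeric sketch (the metric values are made up for illustration):

```python
# Chouldechova (2017): FPR = p/(1-p) * (1-PPV)/PPV * TPR.
# Fix PPV (calibration) and TPR (equal opportunity) across two groups
# and vary only the base rate p; the FPR is forced apart.

def fpr(p, ppv, tpr):
    """False positive rate implied by base rate p, precision PPV, recall TPR."""
    return p / (1 - p) * (1 - ppv) / ppv * tpr

ppv, tpr = 0.7, 0.8          # illustrative values, held equal for both groups
print(fpr(0.3, ppv, tpr))    # group with base rate 0.3
print(fpr(0.5, ppv, tpr))    # group with base rate 0.5: higher FPR
```

Equalizing the FPRs would require letting PPV (and hence calibration) differ between the groups.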

The COMPAS Case

In 2016, ProPublica accused COMPAS (a criminal recidivism prediction tool) of bias against Black defendants. Northpointe (the vendor) responded that COMPAS is calibrated. Both were correct — they were measuring different things.

COMPAS satisfies calibration: a risk score of 7 predicts similar recidivism rates regardless of race. It fails equal opportunity: the false positive rate (predicted high risk but didn’t reoffend) is about twice as high for Black defendants. This isn’t fixable without sacrificing calibration when base rates differ across groups.

The lesson: before you evaluate fairness, you have to decide which fairness you care about. That’s a values question, not an engineering question.

Equity vs. Equality

The distinction matters for deployment. Equality gives everyone the same treatment. Equity gives people what they need to reach the same outcome — which might mean different treatments. A decision maker committed to equity might design different interventions for different groups, intentionally violating demographic parity.
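One post-processing version of group-dependent treatment, in the spirit of Hardt et al. (2016), is to pick a separate score threshold per group so both groups reach the same true positive rate. A rough sketch (the grid search, scores, and labels are all illustrative):

```python
# Sketch: per-group thresholds chosen to equalize true positive rates
# (equal opportunity), intentionally giving groups different treatment.

def tpr_at(scores, labels, thresh):
    """P(score >= thresh | Y=1): true positive rate at this threshold."""
    pos = [s for s, yi in zip(scores, labels) if yi == 1]
    return sum(s >= thresh for s in pos) / len(pos)

def pick_threshold(scores, labels, target_tpr):
    """Highest threshold on a coarse grid that still meets the target TPR."""
    best = 0.0
    for t in (i / 100 for i in range(101)):
        if tpr_at(scores, labels, t) >= target_tpr:
            best = t  # keep raising the bar while the target is met
    return best

scores_a, labels_a = [0.9, 0.8, 0.6, 0.3], [1, 1, 0, 0]
scores_b, labels_b = [0.7, 0.5, 0.4, 0.2], [1, 1, 1, 0]

# To give both groups TPR = 1.0, group B needs a lower bar than group A:
t_a = pick_threshold(scores_a, labels_a, 1.0)
t_b = pick_threshold(scores_b, labels_b, 1.0)
print(t_a, t_b)
```

The different thresholds are the point: treating the groups identically here would deny qualified members of group B the positive outcome at the rate group A enjoys.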

Rethinking Prediction in Decision Making

When predictions feed consequential decisions, accuracy metrics alone are not enough. A prediction need not be the direct input to a decision; it can instead inform a human decision maker who also weighs context, error rates, and values. Separating prediction from decision is one of the key practical insights in ML fairness.
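That separation can be sketched as a thin decision layer sitting on top of a model's risk score. The cost values and the human-override hook below are hypothetical illustrations, not a prescribed scheme:

```python
# Sketch: the model predicts risk; a separate, reviewable decision layer
# applies explicit policy. Costs and the override hook are hypothetical.

def decide(risk, cost_fp=1.0, cost_fn=5.0, override=None):
    """Act iff the expected cost of not acting exceeds the cost of acting."""
    if override is not None:      # a human decision maker can overrule
        return override
    return cost_fn * risk > cost_fp * (1 - risk)

print(decide(0.3))               # asymmetric costs make 0.3 risk actionable
print(decide(0.3, cost_fn=1.0))  # symmetric costs: 0.3 < 0.5, do not act
```

The benefit is that the values question (how bad is a false negative relative to a false positive, and who gets to overrule the model) lives in the decision layer, where it can be debated, rather than being baked invisibly into a single accuracy-optimized threshold.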

NeurIPS 2024: Fairness in Language Models

NeurIPS 2024 included work on multilingual fairness: MAGNET improves language model tokenization fairness across languages using adaptive gradient-based tokenization. This addresses a less-discussed fairness axis — models trained predominantly on English have worse performance and worse calibration for speakers of other languages. That’s fairness at the infrastructure level, not just the prediction level.

A separate NeurIPS 2024 paper covered group-robust preference optimization in RLHF: ensuring that alignment training doesn’t improve performance for majority groups at the expense of minority groups.


References: Chouldechova (2017), “Fair Prediction with Disparate Impact”; Kleinberg, Mullainathan & Raghavan (2016), “Inherent Trade-Offs in the Fair Determination of Risk Scores”; Hardt, Price & Srebro (2016), “Equality of Opportunity in Supervised Learning”, NeurIPS; Angwin et al. (2016), “Machine Bias”, ProPublica.