Differential Privacy — Privacy by Adding Noise
Takeaway
Differential privacy (DP) limits how much any single individual can affect an output by adding calibrated noise; guarantees compose gracefully across analyses.
The problem (before → after)
- Before: Anonymization fails under linkage attacks; even repeated aggregate queries can leak information about individuals.
- After: DP provides provable privacy guarantees, parameterized by (ε, δ), by adding noise calibrated to each query's sensitivity.
Mental model first
Imagine whispering answers in a noisy room: each person’s voice is masked by controlled static so you can hear the crowd, not any one person.
Just-in-time concepts
- (ε, δ)-DP: A mechanism's output distributions on neighboring datasets (differing in one record) are close, up to a multiplicative factor e^ε and an additive slack δ (formalized below).
- Sensitivity: The maximum change in a query's output when one record is added or removed.
- Mechanisms: Laplace and Gaussian noise; advanced composition and privacy accounting for repeated queries.
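For reference, the standard definitions behind these bullets: a randomized mechanism M satisfies (ε, δ)-DP if, for every pair of neighboring datasets D, D′ and every set of outputs S,

```latex
\Pr[M(D) \in S] \le e^{\varepsilon}\,\Pr[M(D') \in S] + \delta
```

and the Laplace mechanism achieves pure ε-DP (δ = 0) by adding noise scaled to the L1 sensitivity:

```latex
\Delta f = \max_{D \sim D'} \lVert f(D) - f(D') \rVert_1,
\qquad
M(D) = f(D) + \mathrm{Lap}\!\left(\tfrac{\Delta f}{\varepsilon}\right)
```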
First-pass solution
Choose a query f; compute its sensitivity Δ; add Laplace noise with scale Δ/ε; track cumulative privacy loss over multiple queries against a total budget.
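A minimal sketch of this loop in Python, assuming a counting query (sensitivity 1) and basic sequential composition; laplace_mechanism and PrivacyAccountant are illustrative names, not the API of any particular DP library.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float,
                      rng: np.random.Generator) -> float:
    """Release true_value with Laplace noise of scale sensitivity / epsilon."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

class PrivacyAccountant:
    """Tracks cumulative epsilon under basic (sequential) composition."""
    def __init__(self, total_budget: float):
        self.total_budget = total_budget
        self.spent = 0.0

    def spend(self, epsilon: float) -> None:
        if self.spent + epsilon > self.total_budget:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

# Example: privately release a count (one person changes a count by at most 1, so Δ = 1).
rng = np.random.default_rng(0)
accountant = PrivacyAccountant(total_budget=1.0)

ages = [23, 45, 31, 67, 52]                      # hypothetical records
true_count = sum(1 for age in ages if age > 40)

epsilon = 0.5
accountant.spend(epsilon)
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=epsilon, rng=rng)
print(f"noisy count: {noisy_count:.2f}, epsilon spent so far: {accountant.spent}")
```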
Iterative refinement
- Rényi DP (RDP) and zero-concentrated DP (zCDP) for tighter composition accounting.
- Local DP: noise is added on the client before data leaves the device, so no trusted curator is required.
- Private training: DP-SGD for deep learning, combining per-example gradient clipping with Gaussian noise (sketched below).
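A schematic of a single DP-SGD step in NumPy, assuming per-example gradients are already computed; the shapes, hyperparameters, and dp_sgd_step helper are illustrative, and real training would rely on a DP library with a proper RDP/moments accountant.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update: clip each example's gradient, sum, add Gaussian noise, average.

    per_example_grads has shape (batch_size, num_params): one gradient row per example.
    The noise standard deviation is noise_multiplier * clip_norm, per the DP-SGD recipe.
    """
    # 1. Clip each per-example gradient to L2 norm <= clip_norm (this bounds sensitivity).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # 2. Sum the clipped gradients, add Gaussian noise calibrated to the clip norm, average.
    batch_size = per_example_grads.shape[0]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / batch_size

    # 3. Take an ordinary gradient step on the noisy, clipped gradient.
    return params - lr * noisy_grad

# Toy usage with random numbers standing in for a real model's per-example gradients.
rng = np.random.default_rng(0)
params = np.zeros(4)
fake_grads = rng.normal(size=(8, 4))             # batch of 8 per-example gradients
params = dp_sgd_step(params, fake_grads, clip_norm=1.0,
                     noise_multiplier=1.1, lr=0.1, rng=rng)
```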
Principles, not prescriptions
- Set a total privacy budget across analyses; report ε (and δ) alongside results.
- Match the mechanism to the query type and its sensitivity: Laplace for L1-sensitive queries under pure ε-DP, Gaussian for L2-sensitive queries under (ε, δ)-DP (see the sketch below).
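For the Gaussian case, one common calibration (valid for ε < 1) sets σ = Δ₂ · √(2 ln(1.25/δ)) / ε; a small sketch, where gaussian_mechanism is an illustrative helper, not a library function:

```python
import numpy as np

def gaussian_mechanism(true_value: float, l2_sensitivity: float,
                       epsilon: float, delta: float,
                       rng: np.random.Generator) -> float:
    """Release true_value under (epsilon, delta)-DP with the classic Gaussian mechanism.

    Uses sigma = l2_sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon (valid for epsilon < 1).
    """
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return true_value + rng.normal(loc=0.0, scale=sigma)

# Example: a bounded mean with known L2 sensitivity 0.1, released under (0.5, 1e-5)-DP.
rng = np.random.default_rng(0)
print(gaussian_mechanism(42.0, l2_sensitivity=0.1, epsilon=0.5, delta=1e-5, rng=rng))
```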
Common pitfalls
- Underestimating sensitivity and composition effects.
- Reporting results without the privacy budget and utility trade-offs.
Connections and contrasts
- See also: [/blog/secure-multiparty-computation], [/blog/zero-knowledge-proofs], [/blog/information-theory].
Quick checks
- Why sensitivity? — Scales noise to bound individual influence.
- What is ε? — Privacy loss parameter; smaller is more private.
- How to train DP models? — DP-SGD: clip gradients and add Gaussian noise.
Further reading
- Dwork, McSherry, Nissim, Smith, "Calibrating Noise to Sensitivity in Private Data Analysis" (2006).
- Abadi et al., "Deep Learning with Differential Privacy" (DP-SGD, 2016).