Simpson’s Paradox
Takeaway
Aggregated data can reverse trends present within groups; stratification and causal reasoning resolve the apparent contradiction.
The problem (before → after)
- Before: A treatment seems worse overall but better in every subgroup.
- After: The arms have different subgroup mixes, typically because a confounder influences both who gets treated and the outcome, and that explains the reversal; analyze within strata or adjust using causal criteria.
Mental model first
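Mixing apples and oranges: each arm's overall rate is a weighted average of its subgroup rates, with the weights set by that arm's own subgroup mix. When the two arms have different mixes, the aggregate comparison can point the opposite way from every within-group comparison. The cure is to compare apples with apples: like-for-like.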
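To make the weighted-mix picture concrete, here is a minimal Python sketch. The counts are illustrative (they follow the frequently cited kidney-stone example), and the names `counts`, `stratum_rates`, `stratum_weights`, and `aggregate_rate` are assumptions introduced for this illustration only.

```python
# Illustrative (successes, patients) counts per arm and stratum.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def stratum_rates(arm):
    """Success rate inside each stratum for one arm."""
    return {s: k / n for s, (k, n) in counts[arm].items()}

def stratum_weights(arm):
    """Share of the arm's patients that falls in each stratum."""
    total = sum(n for _, n in counts[arm].values())
    return {s: n / total for s, (_, n) in counts[arm].items()}

def aggregate_rate(arm):
    """Overall rate written explicitly as a weighted average of stratum rates."""
    rates, weights = stratum_rates(arm), stratum_weights(arm)
    return sum(rates[s] * weights[s] for s in rates)

for arm in counts:
    print(arm, stratum_rates(arm), stratum_weights(arm), aggregate_rate(arm))
```

In these counts, arm A has the higher rate in both strata, yet most of its patients sit in the harder "large" stratum, so its aggregate rate comes out lower than B's.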
Just-in-time concepts
- Stratification and weighted averages.
- Confounding and collider bias.
- Causal graphs to decide adjustment sets.
First-pass solution
Compute within-strata effects; compare to aggregate; inspect group weights; if warranted, adjust using backdoor sets or standardization.
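The first pass described above can be sketched in plain Python as follows. This is one possible arrangement, not a definitive implementation: the counts are illustrative, the helper names are hypothetical, and the standardization step assumes the stratum variable is the only confounder (a valid backdoor set), which a real analysis would need to justify.

```python
# Illustrative (successes, patients) counts per arm and stratum.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}
strata = ["small", "large"]

def rate(successes, patients):
    return successes / patients

# Step 1: within-stratum effects (risk difference, A minus B).
within = {s: rate(*counts["A"][s]) - rate(*counts["B"][s]) for s in strata}

# Step 2: aggregate (crude) effect, ignoring strata.
def crude_rate(arm):
    successes = sum(k for k, _ in counts[arm].values())
    patients = sum(n for _, n in counts[arm].values())
    return successes / patients

crude = crude_rate("A") - crude_rate("B")

# Step 3: group weights, i.e. each arm's share of patients per stratum.
def mix(arm):
    total = sum(n for _, n in counts[arm].values())
    return {s: counts[arm][s][1] / total for s in strata}

# Step 4: direct standardization to the pooled stratum distribution,
# so both arms are evaluated on the same mix of strata.
pooled_n = {s: counts["A"][s][1] + counts["B"][s][1] for s in strata}
pooled_total = sum(pooled_n.values())
pooled_weights = {s: pooled_n[s] / pooled_total for s in strata}

def standardized_rate(arm):
    return sum(pooled_weights[s] * rate(*counts[arm][s]) for s in strata)

adjusted = standardized_rate("A") - standardized_rate("B")

print("within-stratum differences:", within)
print("crude difference:", crude)
print("arm mixes:", mix("A"), mix("B"))
print("standardized difference:", adjusted)
```

With these counts the crude difference is negative while the within-stratum and standardized differences are positive, which is the reversal the stratified analysis is meant to expose.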
Iterative refinement
- Sensitivity analysis for unmeasured confounding.
- Transportability to new populations.
- Presentation: Show both stratified and aggregate views.
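For the transportability point, one simple check is to re-standardize the stratum-specific rates to a different target population. The sketch below assumes the stratum rates carry over unchanged to the new population, which is a strong assumption; the rates are illustrative and `standardized_difference` is a hypothetical helper.

```python
# Illustrative stratum-specific success rates for two arms.
rates = {
    "A": {"small": 81 / 87, "large": 192 / 263},
    "B": {"small": 234 / 270, "large": 55 / 80},
}

def standardized_difference(share_small):
    """Risk difference (A minus B) in a target population in which
    `share_small` of patients belong to the 'small' stratum."""
    weights = {"small": share_small, "large": 1.0 - share_small}
    rate_a = sum(weights[s] * rates["A"][s] for s in weights)
    rate_b = sum(weights[s] * rates["B"][s] for s in weights)
    return rate_a - rate_b

for share in (0.1, 0.5, 0.9):
    print(share, standardized_difference(share))
```

Because both arms are weighted by the same target mix, the sign of the difference matches the within-stratum comparisons here; only its size changes with the population.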
Principles, not prescriptions
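- Stratify on confounders identified by causal reasoning, not on every available variable; treat an aggregate with suspicion when the groups being compared differ in composition.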
- Use causal diagrams to choose valid adjustments.
Common pitfalls
- Adjusting for colliders introduces bias and new paradoxes.
- Ignoring base-rate differences across groups.
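The collider pitfall can be seen in a small simulation. This is a sketch on synthetic data: `treatment` and `outcome` are generated independently, and selection depends on both (the collider). The threshold of 1.0, the seed, and the helper `pearson` are arbitrary choices made for this illustration.

```python
import random

random.seed(0)
n = 20000

# Treatment and outcome are independent by construction.
treatment = [random.gauss(0, 1) for _ in range(n)]
outcome = [random.gauss(0, 1) for _ in range(n)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Collider: a unit is selected when treatment + outcome exceeds a threshold,
# so selection is caused by both variables.
selected = [(t, y) for t, y in zip(treatment, outcome) if t + y > 1.0]

corr_all = pearson(treatment, outcome)
corr_selected = pearson([t for t, _ in selected], [y for _, y in selected])

print("correlation, everyone:", corr_all)
print("correlation, selected only:", corr_selected)
```

In the full sample the correlation is close to zero, while within the selected group it is clearly negative: conditioning on the collider manufactures an association that the data-generating process does not contain.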
Connections and contrasts
- See also: [/blog/causal-inference-do-calculus], [/blog/causal-trees], [/blog/multi-armed-bandits].
Quick checks
- Why can averages flip? — Different group weights skew the aggregate.
- How to resolve? — Stratify or adjust using causal criteria.
- When not to adjust? — Avoid colliders and post-treatment variables.
Further reading
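- Simpson, E. H. (1951), "The Interpretation of Interaction in Contingency Tables", Journal of the Royal Statistical Society, Series B; textbooks on causal inference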