Takeaway

Causal trees partition the covariate space to estimate conditional average treatment effects (CATE), using honest sample splitting to avoid overfitting.

The problem (before → after)

  • Before: Average treatment effects hide who benefits; naive trees overfit.
  • After: Honest splitting and cross-fitting produce CATE estimates with valid uncertainty quantification.

Mental model first

Think of gardening: you divide a field into plots (leaves) where the same fertilizer (treatment) has similar impact. Using separate samples for splitting and estimation avoids fooling yourself.

Just-in-time concepts

  • CATE: τ(x) = E[Y(1) − Y(0) | X = x], the expected treatment effect for units with covariates x.
  • Honesty: use one sample to choose splits and a disjoint sample to estimate leaf effects.
  • Splitting criterion: reward splits that increase treatment-effect heterogeneity across leaves while penalizing within-leaf estimation variance (see the leaf-estimate sketch after this list).
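
In the simplest case, the estimate behind τ(x) within a leaf is a difference in treated and control means, with a variance of the kind the honest splitting criterion penalizes. A minimal sketch (the helper name is illustrative, not from any library):

```python
import numpy as np

def leaf_cate(y, w):
    """Plug-in CATE estimate inside one leaf: difference in mean outcomes
    between treated (w == 1) and control (w == 0) units, plus the variance
    of that difference for a normal-approximation confidence interval."""
    y, w = np.asarray(y, dtype=float), np.asarray(w, dtype=int)
    y1, y0 = y[w == 1], y[w == 0]
    tau_hat = y1.mean() - y0.mean()
    var_hat = y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0)
    return tau_hat, var_hat

# Toy leaf with a true effect of 2.0
rng = np.random.default_rng(0)
w = rng.integers(0, 2, size=200)
y = 2.0 * w + rng.normal(size=200)
tau_hat, var_hat = leaf_cate(y, w)   # roughly (2.0, 0.02)
```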

First-pass solution

Grow a tree on one sample to find the partition; estimate τ̂ within each leaf on a disjoint estimation sample; prune via cross-validation; report uncertainty via leaf-level variances.
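
A minimal sketch of this first pass, assuming a randomized experiment with a known treatment probability p. It learns the partition with the transformed-outcome trick rather than the exact Athey & Imbens criterion, then estimates effects honestly on a disjoint sample; the function name and defaults are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

def honest_tree_cate(X, y, w, p=0.5, max_leaf_nodes=8, seed=0):
    """X, y, w are numpy arrays; w is a 0/1 treatment indicator.
    Step 1: learn the partition on one half via the transformed outcome
            y* = y * (w - p) / (p * (1 - p)), which satisfies E[y* | X] = tau(X).
    Step 2: estimate tau in each leaf on the other, disjoint half."""
    Xs, Xe, ys, ye, ws, we = train_test_split(X, y, w, test_size=0.5, random_state=seed)
    y_star = ys * (ws - p) / (p * (1 - p))
    tree = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes,
                                 min_samples_leaf=25,
                                 random_state=seed).fit(Xs, y_star)
    leaves = tree.apply(Xe)                    # leaf id for each estimation-sample unit
    tau_by_leaf = {}
    for leaf in np.unique(leaves):
        in_leaf = leaves == leaf
        y1, y0 = ye[in_leaf & (we == 1)], ye[in_leaf & (we == 0)]
        if len(y1) >= 2 and len(y0) >= 2:      # skip leaves missing a treatment arm
            tau_by_leaf[leaf] = y1.mean() - y0.mean()
    return tree, tau_by_leaf
```

Cross-validating max_leaf_nodes stands in for pruning here, and the leaf-level variance from the earlier sketch supplies the uncertainty estimate.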

Iterative refinement

  1. Causal forests average many honest causal trees for stability (see the sketch after this list).
  2. Doubly robust / double-ML scores improve efficiency and add robustness to nuisance misspecification.
  3. Policy learning selects treatments to maximize outcomes subject to constraints.
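
These refinements are available off the shelf; below is a sketch assuming the econml package's CausalForestDML, which pairs an honest forest (item 1) with double-ML residualization (related to item 2). Exact API details may differ across econml versions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from econml.dml import CausalForestDML   # assumes `pip install econml`

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
T = rng.integers(0, 2, size=n)                            # randomized binary treatment
Y = (1.0 + X[:, 0]) * T + X[:, 1] + rng.normal(size=n)    # true CATE = 1 + X[:, 0]

cf = CausalForestDML(model_y=RandomForestRegressor(),     # nuisance model for E[Y | X]
                     model_t=RandomForestClassifier(),    # nuisance model for E[T | X]
                     discrete_treatment=True,
                     n_estimators=500,
                     random_state=0)
cf.fit(Y, T, X=X)
tau_hat = cf.effect(X)                      # pointwise CATE estimates
lo, hi = cf.effect_interval(X, alpha=0.05)  # forest-based confidence intervals
```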

Principles, not prescriptions

  • Separate model selection from estimation to maintain validity.
  • Prefer simple, interpretable partitions when stakes are high.

Common pitfalls

  • Data leakage between the split and estimation samples breaks honesty (see the guardrail sketch below).
  • Sparse leaves inflate variance; enforce minimum treated and control counts per leaf and prune.
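
Two illustrative guardrails for these pitfalls; the helper names are hypothetical, not from any library:

```python
import numpy as np

def check_honest_split(idx_split, idx_est):
    """Fail loudly if the split-selection and effect-estimation samples overlap."""
    overlap = np.intersect1d(idx_split, idx_est)
    assert overlap.size == 0, f"{overlap.size} units appear in both samples"

def leaf_is_usable(w_leaf, min_per_arm=10):
    """Only report tau-hat for leaves with enough treated and control units."""
    w_leaf = np.asarray(w_leaf)
    return (w_leaf == 1).sum() >= min_per_arm and (w_leaf == 0).sum() >= min_per_arm
```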

Connections and contrasts

  • See also: [/blog/double-ml], [/blog/multi-armed-bandits], [/blog/simpsons-paradox].

Quick checks

  1. Why honesty? — Prevents adaptive overfitting of effects.
  2. What to split on? — Criteria targeting heterogeneity with variance control.
  3. Why forests? — Reduce variance by averaging many honest trees.

Further reading

  • Athey & Imbens (2016), Recursive Partitioning for Heterogeneous Causal Effects, PNAS (source above)
  • Wager & Athey (2018), Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests, JASA (causal forests)