Heterogeneous Treatment Effects with Causal Trees
Takeaway
Causal trees partition the covariate space to estimate conditional average treatment effects (CATE), using honest sample splitting to avoid overfitting.
The problem (before → after)
- Before: Average treatment effects hide who benefits and who is harmed; naive trees fit noise in the estimated effects.
- After: Honest splitting and cross-fitting produce valid CATE estimates with quantified uncertainty.
Mental model first
Think of gardening: you divide a field into plots (leaves) where the same fertilizer (treatment) has a similar effect. Using one sample to choose the plot boundaries and a separate sample to measure the effect keeps you from fooling yourself.
Just-in-time concepts
- CATE τ(x) = E[Y(1) − Y(0) | X=x].
- Honesty: Use one sample to choose splits, a disjoint sample to estimate effects.
- Splitting criteria: Maximize treatment-effect heterogeneity while controlling variance (a concrete criterion follows this list).
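For concreteness, the honest splitting criterion in Athey & Imbens (2016) makes this trade-off explicit. Paraphrasing their notation, with splitting sample S^tr of size N^tr, estimation sample size N^est, partition Π, treated share p, and within-leaf sample variances S², the quantity to maximize is:

```latex
-\widehat{\mathrm{EMSE}}_{\tau}(\Pi)
  = \frac{1}{N^{\mathrm{tr}}} \sum_{i \in S^{\mathrm{tr}}} \hat{\tau}^{2}(X_i;\, S^{\mathrm{tr}}, \Pi)
  - \left(\frac{1}{N^{\mathrm{tr}}} + \frac{1}{N^{\mathrm{est}}}\right)
    \sum_{\ell \in \Pi} \left( \frac{S^{2}_{\mathrm{treat}}(\ell)}{p} + \frac{S^{2}_{\mathrm{control}}(\ell)}{1 - p} \right)
```

The first term rewards heterogeneity in the estimated effects across leaves; the second penalizes leaves whose effect estimates will be noisy on the estimation sample.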
First-pass solution
Grow a tree on a splitting sample to find the partition; estimate τ̂ within each leaf on a disjoint estimation sample; prune via cross-validation; report uncertainty via leaf-level variance (see the sketch below).
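A minimal sketch of this recipe, assuming a randomized treatment with known propensity p = 0.5 and using the transformed-outcome trick (a regression tree fit to Y* = Y(T − p)/(p(1 − p)), whose conditional mean is τ(x)) as a stand-in for the dedicated causal-tree splitting criterion; this is not Athey & Imbens' exact algorithm:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic randomized experiment: the true effect depends on X[:, 0].
n = 4000
X = rng.normal(size=(n, 5))
T = rng.binomial(1, 0.5, size=n)           # known propensity p = 0.5
tau = np.where(X[:, 0] > 0, 2.0, 0.0)      # true heterogeneous effect
Y = X[:, 1] + tau * T + rng.normal(size=n)

# Honesty: one half chooses the partition, the other estimates effects.
X_sp, X_est, T_sp, T_est, Y_sp, Y_est = train_test_split(
    X, T, Y, test_size=0.5, random_state=0)

# Transformed outcome Y* = Y * (T - p) / (p * (1 - p)) satisfies
# E[Y* | X] = tau(X), so a regression tree on Y* targets heterogeneity.
p = 0.5
Y_star = Y_sp * (T_sp - p) / (p * (1 - p))
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=100, random_state=0)
tree.fit(X_sp, Y_star)

# Re-estimate tau in each leaf on the held-out half: difference in means,
# with a standard error from the two-sample variance formula.
leaves = tree.apply(X_est)
for leaf in np.unique(leaves):
    m = leaves == leaf
    y1, y0 = Y_est[m & (T_est == 1)], Y_est[m & (T_est == 0)]
    if len(y1) < 2 or len(y0) < 2:
        continue  # skip degenerate leaves; variance needs >= 2 per arm
    tau_hat = y1.mean() - y0.mean()
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    print(f"leaf {leaf}: tau_hat = {tau_hat:+.2f} (se {se:.2f}, n = {m.sum()})")
```

The honest step is the last loop: the tree only supplies leaf membership, and every number reported comes from data it never saw during splitting.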
Iterative refinement
- Causal forests average many honest trees for stability (see the sketch after this list).
- Doubly robust estimation improves efficiency.
- Policy learning selects treatments to maximize outcomes subject to constraints.
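As a sketch of the forest refinement, assuming the econml package is available (its CausalForestDML combines honest forests with cross-fitted nuisance models in the spirit of double ML):

```python
import numpy as np
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
T = rng.binomial(1, 0.5, size=n)
Y = X[:, 1] + np.where(X[:, 0] > 0, 2.0, 0.0) * T + rng.normal(size=n)

# Honest causal forest; model_y and model_t are cross-fitted nuisances.
est = CausalForestDML(
    model_y=RandomForestRegressor(min_samples_leaf=20, random_state=0),
    model_t=RandomForestClassifier(min_samples_leaf=20, random_state=0),
    discrete_treatment=True,
    n_estimators=500,
    random_state=0,
)
est.fit(Y, T, X=X)

tau_hat = est.effect(X)                       # pointwise CATE estimates
lb, ub = est.effect_interval(X, alpha=0.05)   # 95% intervals
print(tau_hat[:5], lb[:5], ub[:5])
```

The interval output is what averaging buys you: single honest trees give noisy leaf estimates, while the forest supports pointwise confidence intervals.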
Principles, not prescriptions
- Separate model selection from estimation to maintain validity.
- Prefer simple, interpretable partitions when stakes are high.
Common pitfalls
- Data leakage between split and estimate sets.
- Sparse leaves inflate variance; enforce minimum leaf sizes and prune aggressively (see the formula below).
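To see why, the standard two-sample variance estimate for a leaf ℓ with n₁ treated and n₀ control units is:

```latex
\widehat{\mathrm{Var}}\big(\hat{\tau}(\ell)\big)
  = \frac{S^{2}_{\mathrm{treat}}(\ell)}{n_{1}(\ell)} + \frac{S^{2}_{\mathrm{control}}(\ell)}{n_{0}(\ell)}
```

A leaf with only a handful of treated or control units can have a standard error larger than any plausible effect.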
Connections and contrasts
- See also: [/blog/double-ml], [/blog/multi-armed-bandits], [/blog/simpsons-paradox].
Quick checks
- Why honesty? — Prevents adaptive overfitting of effects.
- What to split on? — Criteria targeting heterogeneity with variance control.
- Why forests? — Reduce variance by averaging many honest trees.
Further reading
- Athey & Imbens, 2016 (source above)
- Wager & Athey, causal forests