Takeaway

Under standard assumptions (causal sufficiency, faithfulness, acyclicity), causal discovery algorithms can recover aspects of causal structure from purely observational data, either by testing conditional independencies or by optimizing a model-fit score.

The problem (before → after)

  • Before: Correlation alone reveals neither the direction of an effect nor whether a hidden confounder produced it (see the simulation below).
  • After: Patterns of conditional independence narrow the set of compatible DAGs; scores, functional-form asymmetries, and interventions refine the orientation further.
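
A minimal sketch of the "before" problem, assuming a toy linear-Gaussian setup (variable names and coefficients are purely illustrative): a hidden common cause U produces a strong X-Y correlation even though neither variable causes the other, and the correlation vanishes once U is conditioned on.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 5_000
u = rng.normal(size=n)             # hidden confounder
x = u + 0.5 * rng.normal(size=n)   # U -> X
y = u + 0.5 * rng.normal(size=n)   # U -> Y  (no X -> Y edge at all)

# Raw correlation looks like a causal signal...
print("corr(X, Y)     = %+.3f" % stats.pearsonr(x, y)[0])    # ~ +0.8

# ...but it disappears once the confounder is partialled out.
rx = x - np.polyval(np.polyfit(u, x, 1), u)
ry = y - np.polyval(np.polyfit(u, y, 1), u)
print("corr(X, Y | U) = %+.3f" % stats.pearsonr(rx, ry)[0])   # ~ 0
```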

Mental model first

It’s detective work: alibis (independencies) rule out suspects (edges); remaining orientations follow from logic plus minimality assumptions.

Just-in-time concepts

  • PC and FCI (constraint-based); GES (score-based); LiNGAM (linear non-Gaussian).
  • Markov equivalence classes, represented as CPDAGs (and as PAGs once latent confounding is allowed); the collider demo after this list shows why some orientations are identifiable and others are not.
  • Interventions and invariance across environments strengthen identification.
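
The asymmetry behind Markov equivalence: a chain or fork makes X and Y independent once you condition on the middle variable, while a collider X → Z ← Y does the opposite, creating dependence where there was none. A small sketch mirroring the confounder example above (names and coefficients again illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 5_000
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + 0.5 * rng.normal(size=n)   # collider: X -> Z <- Y

# Marginally, X and Y are independent...
print("corr(X, Y)     = %+.3f" % stats.pearsonr(x, y)[0])    # ~ 0

# ...but conditioning on the collider Z induces a strong dependence,
# which is exactly what lets constraint-based methods orient v-structures.
rx = x - np.polyval(np.polyfit(z, x, 1), z)
ry = y - np.polyval(np.polyfit(z, y, 1), z)
print("corr(X, Y | Z) = %+.3f" % stats.pearsonr(rx, ry)[0])   # ~ -0.8
```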

First-pass solution

Constraint-based route: test conditional independencies, build the undirected skeleton, then orient edges via v-structures and the remaining orientation (Meek) rules. Score-based route: search over DAGs (or equivalence classes) for the structure that maximizes a penalized score such as BIC.
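
A minimal sketch of the skeleton phase, assuming jointly Gaussian data so that partial correlation with a Fisher-z test can serve as the conditional independence test; `partial_corr_pvalue` and `pc_skeleton` are illustrative names, not a library API, and real implementations handle larger conditioning sets and the orientation phase as well.

```python
from itertools import combinations

import numpy as np
from scipy import stats

def partial_corr_pvalue(data, i, j, cond):
    """Fisher-z test of corr(X_i, X_j | X_cond), assuming joint Gaussianity."""
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(corr)
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    z = 0.5 * np.log((1 + r) / (1 - r))
    stat = np.sqrt(data.shape[0] - len(cond) - 3) * abs(z)
    return 2 * (1 - stats.norm.cdf(stat))

def pc_skeleton(data, alpha=0.01, max_cond=1):
    """Drop edges whose endpoints are independent given some small conditioning set."""
    d = data.shape[1]
    edges = {(i, j) for i, j in combinations(range(d), 2)}
    sepsets = {}
    for size in range(max_cond + 1):
        for i, j in sorted(edges):
            others = [k for k in range(d) if k not in (i, j)]
            for cond in combinations(others, size):
                if partial_corr_pvalue(data, i, j, cond) > alpha:
                    edges.discard((i, j))
                    sepsets[(i, j)] = set(cond)   # kept for the orientation step
                    break
    return edges, sepsets

# Toy check on a chain X0 -> X1 -> X2: the spurious X0-X2 edge disappears
# once we condition on X1, and {1} is recorded as the separating set.
rng = np.random.default_rng(0)
n = 4_000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
x2 = x1 + rng.normal(size=n)
edges, seps = pc_skeleton(np.column_stack([x0, x1, x2]))
print(sorted(edges))   # expect [(0, 1), (1, 2)]
print(seps)            # expect {(0, 2): {1}}
```

Orientation then follows: every unshielded triple i - k - j whose separating set excludes k becomes a v-structure i → k ← j, and the remaining edges are propagated with the orientation rules.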

Iterative refinement

  1. Latent confounders: FCI and related algorithms, whose output is a partial ancestral graph (PAG) rather than a CPDAG.
  2. Functional-form asymmetries: additive noise models (ANM) and linear non-Gaussian models (LiNGAM) can identify edge directions; see the sketch after this list.
  3. Invariant causal prediction: pool data from multiple environments and keep the predictor sets whose relationship to the target stays invariant.
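
A sketch of the direction asymmetry exploited by LiNGAM-style methods (an illustrative toy model, not the published algorithm): with non-Gaussian noise, regressing in the true causal direction leaves a residual independent of the regressor, while the reverse regression does not. A crude binned mutual-information estimate makes the asymmetry visible; `binned_mi` and `residual` are ad-hoc helper names.

```python
import numpy as np

def binned_mi(a, b, bins=12):
    """Plug-in mutual information (in nats) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def residual(target, regressor):
    """Residual of a least-squares line fit of target on regressor."""
    slope, intercept = np.polyfit(regressor, target, 1)
    return target - (slope * regressor + intercept)

rng = np.random.default_rng(1)
n = 20_000
x = rng.uniform(-1, 1, n)             # non-Gaussian cause
y = 0.8 * x + rng.uniform(-1, 1, n)   # non-Gaussian noise; true model X -> Y

# Correct direction: the residual carries (almost) no information about X.
print("MI(residual, X) for X -> Y:", round(binned_mi(residual(y, x), x), 3))
# Reverse direction: the residual is clearly dependent on Y, betraying the
# wrong orientation even though it is uncorrelated with Y by construction.
print("MI(residual, Y) for Y -> X:", round(binned_mi(residual(x, y), y), 3))
```

With a Gaussian cause and Gaussian noise both directions would look equally good, which is exactly why the non-Gaussianity assumption buys identifiability.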

Principles, not prescriptions

  • Combine multiple sources: independence, asymmetries, interventions, and invariance.
  • Beware finite-sample errors in conditional independence tests; an early mistaken deletion or retention propagates into wrong orientations downstream.

Common pitfalls

  • Violated assumptions (e.g., hidden confounding when causal sufficiency is assumed) can silently mislead discovery.
  • Overconfidence: the output is typically an equivalence class (a CPDAG or PAG), not a single DAG.

Connections and contrasts

  • See also: [/blog/causal-inference-do-calculus], [/blog/causal-trees], [/blog/double-ml].

Quick checks

  1. What's a CPDAG? A completed partially directed acyclic graph: one object representing all DAGs that imply the same conditional independencies.
  2. Why does non-Gaussianity help? It breaks the symmetry between X → Y and Y → X: only the true direction leaves residuals independent of the regressor.
  3. How to validate? Interventional experiments, or checking that the claimed causal relationships stay invariant across environments (see the sketch below).
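
A toy sketch of the invariance check (assumed setup and helper names; this is the idea behind invariant causal prediction, not the published procedure): residuals of Y regressed on its true causal parents look the same in every environment, while residuals from a non-causal predictor set shift when the environment changes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def make_env(n, x_shift):
    """One environment: an intervention shifts X, the X -> Y mechanism is fixed."""
    x = rng.normal(loc=x_shift, size=n)   # intervened-on cause
    y = 2.0 * x + rng.normal(size=n)      # invariant mechanism X -> Y
    z = y + rng.normal(size=n)            # effect of Y, not a cause of it
    return x, y, z

envs = [make_env(2_000, 0.0), make_env(2_000, 2.0)]

def env_residuals(candidate_idx):
    """Regress Y on the candidate predictors (pooled), return per-environment residuals."""
    X = np.concatenate([np.column_stack([e[i] for i in candidate_idx]) for e in envs])
    y = np.concatenate([e[1] for e in envs])
    design = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return np.split(y - design @ coef, [len(envs[0][1])])

for name, idx in [("candidate parents {X}", [0]), ("candidate parents {Z}", [2])]:
    r0, r1 = env_residuals(idx)
    p_mean = stats.ttest_ind(r0, r1).pvalue   # do residual means match across environments?
    p_var = stats.levene(r0, r1).pvalue       # do residual spreads match across environments?
    print(f"{name}: mean-shift p={p_mean:.3f}, variance-shift p={p_var:.3f}")
```

Only the true parent set {X} should look invariant; {Z} fails the mean-shift test because the intervention on X propagates to Y and Z differently, changing the Y-on-Z relationship across environments.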

Further reading

  • Spirtes, Glymour & Scheines, Causation, Prediction, and Search; Peters, Janzing & Schölkopf, Elements of Causal Inference.