Takeaway

Flows transform a simple base distribution into a complex target using a sequence of invertible maps; log-likelihoods are exact via the change-of-variables formula.

The problem (before → after)

  • Before: Powerful generative models often lack tractable likelihoods or exact sampling.
  • After: Invertible, differentiable layers yield both exact densities and efficient sampling.

Mental model first

Like kneading dough: each fold and stretch reshapes the density while preserving total mass; track how local volumes expand or shrink with the Jacobian determinant.

Just-in-time concepts

  • Change of variables: log p_X(x) = log p_Z(f(x)) + log |det J_f(x)| (a numeric check follows this list).
  • Coupling and autoregressive layers: triangular Jacobians make det cheap.
  • Expressivity: depth and permutations mix dimensions; architectural constraints keep each map invertible.
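
To make the formula concrete, here is a tiny sanity check on a 1-D affine map (purely illustrative, not from the source): with z = f(x) = (x − b)/a and z ∼ N(0, 1), x is distributed as N(b, a²) and |det J_f| = 1/|a|.

import torch

a, b = 2.0, 1.0
base = torch.distributions.Normal(0.0, 1.0)
x = torch.tensor(0.3)
z = (x - b) / a                                              # f(x)
log_px = base.log_prob(z) - torch.log(torch.tensor(abs(a)))  # + log|det J_f|
# agrees with the density of N(b, a^2) evaluated directly at x
assert torch.allclose(log_px, torch.distributions.Normal(b, a).log_prob(x))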

First-pass solution

Stack affine coupling layers with permutations; train by maximizing exact log-likelihood; sample by applying inverses to base noise.
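
A minimal sketch of that objective, assuming a flow callable that maps data x to latents z and returns the accumulated log-determinant (names here are hypothetical):

import torch

def nll_loss(flow, x):
    z, logdet = flow(x)                           # x -> z, plus summed per-layer log-dets
    base = torch.distributions.Normal(0.0, 1.0)   # standard-normal base density
    log_px = base.log_prob(z).sum(dim=1) + logdet
    return -log_px.mean()                         # minimizing NLL maximizes exact likelihood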

Iterative refinement

  1. Continuous flows (Neural ODEs) trade per-layer determinants for ODE solves, integrating the trace of the Jacobian instead.
  2. Dequantization for discrete data (see the sketch after this list); multiscale architectures for images.
  3. Hybrid models: flows for posteriors in VI or decoders in VAEs.
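
For item 2, one common choice (an assumption here, not spelled out above) is uniform dequantization: add U[0, 1) noise to integer pixels so a continuous density is well defined.

import torch

def dequantize(x_int):
    # 8-bit pixels in {0, ..., 255} -> continuous values in [0, 1)
    x = x_int.float() + torch.rand(x_int.shape)   # add U[0, 1) noise
    x = x / 256.0
    # the rescaling contributes a constant -D*log(256) per example to the log-likelihood
    return x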

Code as a byproduct (affine coupling log-det)

import torch

def affine_coupling(x, s, t, mask):
    # x: (batch, dim); mask: binary, 1 marks the pass-through dims
    x1, x2 = x * mask, x * (1 - mask)
    scale = torch.tanh(s(x1))            # bounded log-scale for stability
    shift = t(x1)
    # transform only the unmasked half; masked dims pass through unchanged
    y2 = (1 - mask) * (x2 * torch.exp(scale) + shift)
    y = x1 + y2
    # triangular Jacobian: log-det is the sum of the active log-scales
    logdet = ((1 - mask) * scale).sum(dim=1)
    return y, logdet
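
Sampling needs the inverse. Because the scale and shift depend only on the pass-through half, inversion is exact and cheap; a sketch matching the layer above:

def affine_coupling_inverse(y, s, t, mask):
    # the masked half is unchanged, so s and t see the same input as in the forward pass
    y1 = y * mask
    scale = torch.tanh(s(y1))
    shift = t(y1)
    x = y1 + (1 - mask) * ((y - shift) * torch.exp(-scale))
    return x

To sample, draw z from the base distribution and apply the layer inverses in reverse order.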

Principles, not prescriptions

  • Design layers with cheap Jacobians and stable inverses.
  • Mix dimensions aggressively to avoid factorized bottlenecks.

Common pitfalls

  • Numerical issues computing log-dets; stabilize scales (one option sketched after this list).
  • Limited expressivity if permutations/partitions are fixed and shallow.
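
One way to stabilize scales (a common trick, offered as an assumption rather than prescribed by the source): soft-clamp the raw log-scale so exp(scale) stays in a safe range.

import torch

def soft_clamp(raw, limit=3.0):
    # smooth bound: outputs lie in (-limit, limit), avoiding overflow in exp()
    return limit * torch.tanh(raw / limit)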

Connections and contrasts

  • See also: [/blog/variational-inference], [/blog/diffusion-models], [/blog/gans].

Quick checks

  1. Why triangular Jacobians? — Determinant becomes the product of diagonal entries → cheap.
  2. How to sample? — Draw z ∼ base and invert the flow.
  3. Why flows in VI? — To make posteriors more expressive.

Further reading

  • RealNVP, Glow, Neural ODEs
  • Original flow paper (source above)