Takeaway

Paxos gets a cluster to agree on a value despite crashes and message delays by using quorums and epochs (proposal numbers) to ensure safety.

The problem (before → after)

  • Before: Nodes crash and messages arrive out of order; naive broadcasts can diverge.
  • After: Majority quorums and monotonic proposal numbers enforce a single chosen value that never changes once decided.

Mental model first

Think of a committee voting with numbered ballots: later ballots supersede earlier ones, but any decision needs a majority; overlapping majorities ensure consistency.

Just-in-time concepts

  • Roles: Proposers, acceptors, learners.
  • Phases: Prepare/Promise (phase 1), Accept/Accepted (phase 2).
  • Quorums: Any two majorities intersect.

First-pass solution

Proposer picks n, asks acceptors to promise not to accept < n; collects highest prior accepted value; proposes that value (or its own if none); majority accepts → chosen.

Iterative refinement

  1. Multi-Paxos amortizes phase 1 for a stream of values (a leader).
  2. Failure handling: timeouts, re-election, stable storage for promises.
  3. Replicated state machines apply a log of chosen values.

Principles, not prescriptions

  • Safety over liveness: never decide two different values.
  • Use majority intersections to preserve history across failures.

Common pitfalls

  • Ignoring persistent storage of promises/accepts.
  • Leader thrashing without backoff.

Connections and contrasts

  • See also: [/blog/raft-consensus], [/blog/byzantine-fault-tolerance], [/blog/crdts], [/blog/cap-theorem].

Quick checks

  1. Why majorities? — Intersections carry forward chosen values.
  2. Why two phases? — Prevents late proposals from overwriting decisions.
  3. Why Multi-Paxos? — Efficiency for many decisions.

Further reading

  • Lamport’s “Paxos Made Simple” (source above)