Takeaway

PBFT tolerates arbitrary (including malicious) faults by requiring 3f+1 replicas and 2f+1 quorums with authenticated messages across pre-prepare, prepare, and commit phases.

The problem (before → after)

  • Before: Crash-fault protocols fail under equivocation or lies.
  • After: Quorum certificates and message authentication detect and outvote Byzantine behavior.

Mental model first

Imagine traitors in an army: orders are accepted only when confirmed by enough independent messengers carrying signed copies.

Just-in-time concepts

  • Quorum sizes: 3f+1 nodes tolerate f faulty; 2f+1 signatures commit.
  • View changes for leader replacement.
  • Deterministic state machine replication.

First-pass solution

Leader proposes; replicas exchange prepares and commits; once 2f+1 commits, execute and reply; on faulty leader evidence or timeout, trigger view change.

Iterative refinement

  1. Optimizations: batching, speculative execution.
  2. Cryptographic choices: MACs vs signatures.
  3. BFT in WANs and permissionless settings (tendermint, hotstuff variants).

Principles, not prescriptions

  • Authenticate messages; require overlapping quorums.
  • Separate normal-case fast path from view change.

Common pitfalls

  • Inadequate view-change safety checks.
  • Non-determinism in application logic breaks replication.

Connections and contrasts

  • See also: [/blog/raft-consensus], crash fault-tolerant protocols, blockchain consensus.

Quick checks

  1. Why 3f+1? — To ensure 2f+1 honest nodes remain for quorums.
  2. Why three phases? — To prevent equivocation and ensure agreement.
  3. When view change? — Suspected faulty or slow leader.

Further reading

  • Castro & Liskov (source above); HotStuff, Tendermint