Consensus

Paxos, Raft, and all their flavors, variations, and alternatives.

Blog Posts

It’s fine to not understand every part of every post. The goal is to understand more at the end of a post than you did at the beginning, and if you achieve that on average, then induction says that if you keep reading, you’ll get there.

Exiting this, aim to understand that:

Consensus requires a quorum of votes.
Raft and Multi-Paxos extend the algorithm from one consensus decision to a log of consensus decisions.

There are technical differences between Raft and Multi-Paxos, but those are irrelevant for now. They accomplish the same goal, in mostly the same way.
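To make the quorum point concrete, here is a minimal sketch, assuming majority quorums over a fixed set of replicas. The replica names and the `majority` and `quorums` helpers are made up for illustration and are not from any of the linked posts. It checks the property that everything else rests on: any two majorities overlap in at least one replica.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of votes that forms a majority quorum of n replicas."""
    return n // 2 + 1


def quorums(replicas: list[str]) -> list[frozenset[str]]:
    """Every subset of replicas that is exactly one majority in size."""
    size = majority(len(replicas))
    return [frozenset(q) for q in combinations(replicas, size)]


# Hypothetical five-replica cluster.
replicas = ["a", "b", "c", "d", "e"]
qs = quorums(replicas)

# Any two majority quorums share at least one replica, so a later quorum
# always contains someone who saw what an earlier quorum voted for.
all_intersect = all(q1 & q2 for q1 in qs for q2 in qs)
```

With five replicas a quorum is three votes, so the cluster can lose any two replicas and still decide. Majorities are only one way to get intersecting quorums; they are used here because they are the simplest.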

Textbooks

Introduction to Reliable and Secure Distributed Programming

Lectures:

Implementing Replicated Logs with Paxos presented by John Ousterhout
Dr. TLA+ Series: Paxos presented by Andrew Helwer

Exiting this, aim to have a decent understanding of how the basic Paxos algorithm works.

I personally really like the explanations of Paxos that evolve 2PC into something more reliable.
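In that spirit, here is a rough sketch of single-decree Paxos, assuming everything runs in one process with no message loss, no crashes, and ballot numbers handed in by the caller. The `Acceptor` and `propose` names are illustrative, not taken from the lectures. It has two phases like 2PC, but each phase needs only a quorum of replies rather than every participant, and a proposer adopts any value it learns was already accepted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Acceptor:
    promised: int = -1                    # highest ballot promised so far
    accepted_ballot: int = -1             # ballot of the accepted value, if any
    accepted_value: Optional[str] = None  # value accepted so far, if any

    def prepare(self, ballot: int):
        """Phase 1: promise to ignore lower ballots, report any accepted value."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted_ballot, self.accepted_value
        return False, -1, None

    def accept(self, ballot: int, value: str) -> bool:
        """Phase 2: accept unless a higher ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False


def propose(acceptors: list[Acceptor], ballot: int, value: str) -> Optional[str]:
    """Run one proposer round; return the chosen value, or None if it failed."""
    quorum = len(acceptors) // 2 + 1

    # Phase 1: collect promises from a quorum.
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [(b, v) for ok, b, v in replies if ok]
    if len(promises) < quorum:
        return None

    # If some acceptor already accepted a value, adopt the one with the
    # highest ballot instead of our own value.
    _, prev_value = max(promises, key=lambda p: p[0])
    if prev_value is not None:
        value = prev_value

    # Phase 2: ask acceptors to accept; the value is chosen on a quorum of acks.
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= quorum else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, ballot=1, value="x")   # "x" is chosen
second = propose(acceptors, ballot=2, value="y")  # still "x": the decision sticks
stale = propose(acceptors, ballot=1, value="z")   # None: ballot 1 is now too low
```

The second proposer wanted "y" but ends up re-proposing "x". That is the part 2PC lacks: a new coordinator can safely finish a decision it did not start.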

Survey Papers

Survey papers give context on what work exists and how it fits together:

Exiting this, aim to have an understanding of what major different variants of consensus exist, and roughly how they work.

Publications

And then dig into any of the papers listed on the Distributed Consensus Reading List.

But as always, for deeper understanding I’d recommend reading related papers together rather than choosing randomly.

Side Commentary

There are a few interesting tricks in consensus implementations to be aware of:

The "Megastore optimization", where leaderless consensus is used, but each proposal submitted by a replica also nominates that replica as the leader for the next round. This means that if client write traffic is sticky to a single replica, the system gets the one-round efficiency of Multi-Paxos without an explicit leader lease.
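As a toy model of why this helps, the sketch below only counts quorum round trips per log slot. It does not run Paxos and is not Megastore's actual implementation. The `Entry` and `Log` names and the cost model are assumptions: one round trip when the proposer was nominated by the previous slot and can skip the prepare phase, two otherwise.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    value: str
    next_leader: str  # replica nominated to lead the following slot


@dataclass
class Log:
    entries: list[Entry] = field(default_factory=list)
    round_trips: int = 0  # total quorum round trips spent so far

    def append(self, proposer: str, value: str) -> int:
        """Decide the next slot for `proposer`; return the round trips used."""
        nominated: Optional[str] = (
            self.entries[-1].next_leader if self.entries else None
        )
        # The nominated replica may skip the prepare phase (one round trip);
        # anyone else pays for prepare plus accept (two round trips).
        cost = 1 if proposer == nominated else 2
        # Each proposal nominates its own proposer as the next leader.
        self.entries.append(Entry(value, next_leader=proposer))
        self.round_trips += cost
        return cost


log = Log()
# Writes sticky to replica "r1": only the first one pays for two rounds.
sticky = [log.append("r1", f"w{i}") for i in range(4)]     # [2, 1, 1, 1]
# Writes bouncing between replicas never hit the fast path.
bounce = [log.append(r, "w") for r in ["r2", "r1", "r2"]]  # [2, 2, 2]
```

The model ignores contention: when two replicas race for the same slot, the loser has to retry, which this sketch does not represent.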
