architecting resilient distributed systems high-scale engineering and failure mode mitigation

Consensus Algorithms in Practice

Chapter 3 of 13
Summary

Raft and Paxos are consensus algorithms for distributed systems. Raft was designed to be easier to understand and implement, while Paxos is more general but harder to turn into a complete system.

Consensus Algorithms in Distributed Systems

Introduction to Consensus

Consensus algorithms are a fundamental component of distributed systems: they enable multiple processes to agree on a single value or on a sequence of state transitions, even when some of the processes fail. Two prominent consensus algorithms are Raft and Paxos. Raft, designed by Diego Ongaro and John Ousterhout at Stanford University, aims to be more understandable than Paxos [1].

Raft Algorithm

Raft decomposes consensus into three subproblems: leader election, log replication, and safety. The leader handles all client requests; a follower that receives a client request redirects the client to the leader. Each log entry consists of a command, the term number in which the leader created it, and an integer index giving its position in the log. The ‘AppendEntries’ RPC carries log entries for replication and, when sent with no entries, serves as a heartbeat [2].
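
The log entry layout and the follower side of ‘AppendEntries’ can be sketched roughly as follows. This is a minimal in-memory illustration, not the etcd/raft API: the names LogEntry, Follower, and append_entries are hypothetical, entries are assumed to arrive contiguously starting at prev_log_index + 1, and persistence, networking, and election timers are left out.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class LogEntry:
    index: int    # position in the log, starting at 1
    term: int     # term of the leader that created the entry
    command: Any  # opaque state machine command


@dataclass
class Follower:
    # Hypothetical follower state, for illustration only.
    current_term: int = 0
    log: List[LogEntry] = field(default_factory=list)
    commit_index: int = 0

    def append_entries(self, term, prev_log_index, prev_log_term,
                       entries, leader_commit):
        """Handle AppendEntries; an empty `entries` list acts as a heartbeat."""
        # Reject requests from a leader with a stale term.
        if term < self.current_term:
            return False
        self.current_term = term

        # Consistency check: the entry just before the new ones must match.
        if prev_log_index > 0:
            if len(self.log) < prev_log_index:
                return False
            if self.log[prev_log_index - 1].term != prev_log_term:
                return False

        for entry in entries:
            pos = entry.index - 1
            if pos < len(self.log):
                if self.log[pos].term != entry.term:
                    # Conflict: drop this entry and everything after it.
                    del self.log[pos:]
                    self.log.append(entry)
            else:
                self.log.append(entry)

        # Advance the commit index, but never past what this call covered.
        if leader_commit > self.commit_index:
            last_new = entries[-1].index if entries else prev_log_index
            self.commit_index = min(leader_commit, last_new)
        return True
```

A heartbeat is then just a call such as follower.append_entries(term, prev_log_index, prev_log_term, [], leader_commit), which still runs the consistency check and can advance the follower's commit index.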

Log Replication and Safety

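A log entry is committed once the leader that created it has replicated it on a majority of cluster nodes; entries from earlier terms become committed indirectly, when a later entry from the leader's current term is committed. Raft also guarantees the Log Matching property: if two entries in different logs have the same index and term, they store the same command, and the logs are identical in all preceding entries. This property lets a follower validate its entire log prefix by checking a single preceding entry, which is crucial for maintaining consistency across the system.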
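
As a rough illustration of the majority rule, the sketch below shows how a leader might advance its commit index. The function names and the match_index bookkeeping (node id mapped to the highest log index known to be stored on that node, leader included) are assumptions made for this example. The current-term check follows the rule in the Raft paper [1] that a leader only commits entries from its own term by counting replicas.

```python
from typing import Dict, List


def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def advance_commit_index(log_terms: List[int], match_index: Dict[str, int],
                         current_term: int, commit_index: int) -> int:
    """Return the leader's new commit index.

    log_terms[i - 1] is the term of the entry at index i.
    match_index maps every node id (leader included) to the highest
    index known to be replicated on that node.
    """
    quorum = majority(len(match_index))
    # Scan from the newest entry down to the current commit index.
    for n in range(len(log_terms), commit_index, -1):
        replicated = sum(1 for m in match_index.values() if m >= n)
        if replicated >= quorum and log_terms[n - 1] == current_term:
            return n
    return commit_index
```

For example, in a three-node cluster where two nodes hold index 3 and that entry was created in the current term, the function returns 3. If the newest majority-replicated entry is from an older term, the commit index stays where it was until an entry from the current term reaches a majority.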

Comparison with Paxos

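Paxos is generally considered harder to implement because basic Paxos only decides a single value and has no built-in concept of a leader-driven log sequence; a replicated log has to be assembled from many Paxos instances (Multi-Paxos), with many details left to the implementer. The following table compares key features of Raft and Paxos: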

| Feature | Raft | Paxos (Basic/Multi) |
| --- | --- | --- |
| Complexity | Low (by design) | High |
| Leader Requirement | Explicit (Strong Leader) | Optional/Implicit |
| State Model | Replicated Log | Single/Multiple Value Agreement |
| Understandability | Optimized for education | Optimized for proof/theory |
| Production Use | etcd, Consul, K8s | Spanner, Chubby |

Implementation and Use Cases

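The etcd Raft implementation is used by Kubernetes, CockroachDB, and TiDB. For cluster membership changes, Raft supports ‘joint consensus’: during the transition, decisions require separate majorities of both the old and the new configuration, which prevents two disjoint majorities (split-brain) from forming. Log compaction in Raft is typically handled via snapshotting: the current state machine state is written to disk together with the index and term of the last entry it covers, and the log entries up to that index are discarded.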
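
Both mechanisms can be sketched in a few lines. The example below is a simplified illustration with hypothetical names (has_majority, joint_quorum, compact); it keeps everything in memory, represents log entries as (index, term, command) tuples, and omits the actual write to disk.

```python
from typing import Any, Dict, List, Set, Tuple

Entry = Tuple[int, int, Any]  # (index, term, command)


def has_majority(acks: Set[str], members: Set[str]) -> bool:
    """True if more than half of `members` acknowledged."""
    return len(acks & members) > len(members) // 2


def joint_quorum(acks: Set[str], old_members: Set[str],
                 new_members: Set[str]) -> bool:
    """During joint consensus, a decision needs a majority of BOTH configs."""
    return has_majority(acks, old_members) and has_majority(acks, new_members)


def compact(log: List[Entry], snapshot_index: int,
            state: Dict[str, Any]) -> Tuple[Dict[str, Any], List[Entry]]:
    """Discard the entries covered by a snapshot taken at snapshot_index.

    Returns (snapshot, remaining_log). A real implementation would write
    the snapshot to stable storage before truncating the log.
    """
    covered = [e for e in log if e[0] <= snapshot_index]
    if not covered:
        raise ValueError("snapshot_index precedes the first log entry")
    last_index, last_term, _ = covered[-1]
    snapshot = {
        "last_included_index": last_index,
        "last_included_term": last_term,
        "state": dict(state),  # copy of the state machine state
    }
    remaining = [e for e in log if e[0] > snapshot_index]
    return snapshot, remaining
```

Keeping the last included index and term in the snapshot matters because the AppendEntries consistency check for the first entry after the snapshot still needs them.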

Conclusion

Raft and Paxos are both viable consensus algorithms for distributed systems, each with its strengths and weaknesses. Raft’s strong-leader design and understandability make it a popular choice for production systems such as etcd and Consul, while Paxos is harder to implement but more flexible, for example because it does not require a single explicit leader. The choice between these algorithms ultimately depends on the specific requirements and constraints of the system being designed.

Sources

[1] Ongaro, D., & Ousterhout, J. (2014). In Search of an Understandable Consensus Algorithm. Proceedings of the 2014 USENIX Annual Technical Conference (USENIX ATC ’14), 305–319.
[2] https://github.com/etcd-io/raft
[3] https://eli.thegreenplace.net/2020/implementing-raft-part-0-introduction/