Consistency Models and State Coordination - architecting resilient distributed systems high-scale engineering and failure mode mitigation • Dev|Journal

Consistency Models in Distributed Systems

Introduction to Consistency Models

Distributed systems are inherently complex, requiring careful consideration of trade-offs between consistency, availability, and latency. At the heart of these systems lie various consistency models, each with its strengths and weaknesses. This section delves into the definitions, implications, and comparisons of key consistency models, including linearizability, sequential consistency, and eventual consistency, to equip system designers with the knowledge necessary to select the most appropriate model based on their specific latency and availability requirements.

Definitions and Implications of Consistency Models

Linearizability is the strongest consistency model for single-object operations, ensuring that a write is visible to all subsequent reads across all nodes at the exact moment it completes. This model makes the distributed system appear as a single machine but comes with a cost: it requires at least $u/4$ time for reads and $u/2$ time for writes, where $u$ is the network delay, as proven by Attiya and Welch [1].
Sequential Consistency ensures that the result of any execution is the same as if the operations of all processors were executed in some sequential order. While weaker than linearizability, it still provides a strong guarantee but with potentially better performance in systems with variable network delays.
Eventual Consistency, on the other hand, is a weak consistency model where all replicas eventually converge to the same value in the absence of new updates. This model permits temporary stale reads and is designed for high availability during partitions but may not be suitable for applications requiring strong consistency.

Comparative Analysis of Consistency Models

The choice of consistency model significantly affects the performance and reliability of a distributed system. Linearizability and Sequential Consistency offer strong guarantees but at the cost of increased latency due to the need for synchronization across nodes. In contrast, Eventual Consistency prioritizes availability and lower latency but may lead to stale reads. The CAP theorem states that a distributed system can only provide two out of three guarantees: Consistency, Availability, and Partition Tolerance, highlighting the inherent trade-offs.

Example Systems and Their Consistency Models

YugabyteDB is classified as a CP (Consistent and Partition-tolerant) system under the CAP theorem, indicating its priority on consistency over availability during partitions.
Cassandra, when configured with ‘ANY’ consistency, provides eventual consistency and high availability, making it suitable for applications that can tolerate stale reads.
CockroachDB uses a variant of Raft to provide serializable transactions and linearizable reads, demonstrating a strong focus on consistency.

Evaluating Consistency Models with Jepsen

Jepsen, a black-box testing framework developed by Kyle Kingsbury, is instrumental in evaluating the safety and correctness of distributed systems under fault injection. By analyzing the performance of various databases under Jepsen tests, system designers can make informed decisions about the appropriateness of different consistency models for their applications. For instance, PostgreSQL 12.3 was found to violate its own isolation promises under specific single-machine scenarios, while MongoDB v3.4.0 failed Jepsen tests due to ‘dirty reads’ even at the ‘majority’ read concern level.

Conclusion

In conclusion, the selection of a consistency model in distributed systems is crucial and depends on the specific requirements of the application, including latency, availability, and the tolerance for stale reads. By understanding the definitions, implications, and trade-offs of different consistency models and leveraging tools like Jepsen for evaluation, system designers can create more robust and performant distributed systems.

Sources

[1] Attiya, H., & Welch, J. L. (1994). Sequential Consistency versus Linearizability. ACM Transactions on Computer Systems (TOCS). [2] https://www.yugabyte.com/blog/jepsen-testing-on-yugabyte-db-database/ [3] https://jepsen.io/analyses