Transactions: Maintaining Invariants
SummaryThis section introduces the ACID properties as the...
This section introduces the ACID properties as the...
This section introduces the ACID properties as the fundamental contract between database systems and applications for maintaining invariants during failures and concurrency. It explains Atomicity as implemented via the Write-Ahead Log (WAL), where abort acts as the critical safety valve to roll back incomplete transactions. The draft covers isolation levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable) and the specific concurrency anomalies each level permits. It presents core ACID mechanisms (WAL, Two-Phase Locking, MVCC, Two-Phase Commit) with their respective trade-offs in amplification (write, throughput, storage, latency). The section includes illustrative Python code for a Two-Phase Commit coordinator, tables mapping isolation levels to allowed anomalies and mechanisms to their trade-offs, and a detailed diagram description of the transaction lifecycle with abort paths. Key concepts established include the immutable trade-off between isolation strength and performance, and how the WAL serves as the single source of truth enabling recovery and rollback.
Transactions: Maintaining Invariants
ACID Properties
The ACID properties—Atomicity, Consistency, Isolation, and Durability—are fundamental guarantees that database transactions must satisfy to ensure reliable processing in the presence of failures and concurrency. Atomicity guarantees that a transaction is treated as a single, indivisible unit of work, either happening completely or not at all. Consistency ensures that a transaction brings the database from one valid state to another, preserving all defined integrity constraints. Isolation guarantees that concurrent transactions do not interfere with each other, with different isolation levels defining the degree of visibility between them. Durability ensures that once a transaction is committed, its changes are permanent and survive subsequent system failures.
Atomicity and the Write-Ahead Log (WAL)
Atomicity is implemented using a Write-Ahead Log (WAL): all intended changes are written to a durable log before they are applied to the main data structures. The abort operation is the fundamental safety mechanism for enforcing atomicity: if a transaction cannot proceed, the system can discard its partial work by rolling back changes listed in its WAL records. The WAL record for a transaction must contain enough information to both redo the changes (for recovery after a crash) and undo them (for rollback).
Diagram: Lifecycle of a Transaction with Abort Safety Valve
The following diagram illustrates the lifecycle of a transaction, including the normal path and the abort safety valve:
Components:
- Transaction Begin (T1)
- WAL Buffer in Memory
- Stable Storage (Disk)
- Main Data Store (B-Tree/LSM-Tree)
- Abort/Rollback Path (Red dashed line)
Flow:
- Write Operation: T1 writes value V1 to key K.
- Arrow from (1) to (2): Log record
[T1, K, old_value, V1]appended to in-memory WAL buffer.
- Arrow from (1) to (2): Log record
- WAL Force (Commit Point): On commit request, WAL buffer is flushed to Stable Storage (3). This is the durability point.
- Apply to Data Store: After log is stable, the change is applied asynchronously to the Main Data Store (4) (e.g., update B-Tree page or flush to SSTable).
Abort Path (Safety Valve):
- If T1 must abort before WAL force: Simply discard the log record from the WAL buffer (2). No durable state changed.
- If T1 must abort after WAL force but before applying to Data Store: An
ABORTrecord is written to the WAL and flushed. During recovery, the system will see the abort and skip applying T1’s changes. - If crash occurs during application: Recovery replays the WAL. For T1, it finds the log record. If no commit record is found, it applies the
old_valuefrom the log record to the Data Store (4), undoing the change.
Isolation Levels
Isolation levels define the degree to which operations in one transaction are isolated from operations in other concurrent transactions. Common levels include Read Uncommitted, Read Committed, Repeatable Read, and Serializable. Each level allows specific concurrency anomalies:
| Isolation Level (ANSI SQL) | Dirty Read | Non-repeatable Read | Phantom Read | Write Skew | Common Implementation Technique |
|---|---|---|---|---|---|
| Read Uncommitted | Allowed | Allowed | Allowed | Allowed | Read locks not acquired. |
| Read Committed | Prevented | Allowed | Allowed | Allowed | Short-term read locks or MVCC snapshot per statement. |
| Repeatable Read | Prevented | Prevented | Allowed | Allowed | Hold read locks on accessed rows (2PL) or MVCC snapshot per transaction. |
| Serializable | Prevented | Prevented | Prevented | Prevented | Strict 2PL, Serializable Snapshot Isolation (SSI), or predicate locking. |
Mechanisms and Trade-offs
Each ACID guarantee is enforced by specific internal mechanisms, each of which incurs a measurable cost under failure and normal operation. The choice of mechanism is not arbitrary—it reflects an immutable tradeoff between safety, performance, and complexity.
| Mechanism | Provides (ACID) | Key Internal Action on Failure/Abort | Amplification Trade-off |
|---|---|---|---|
| Write-Ahead Log (WAL) | Atomicity, Durability | Roll forward (redo) committed txns; Roll back (undo) uncommitted txns. | Write Amplification: Every change written twice (log + data). Latency: Must flush log for durability. |
| Two-Phase Locking (2PL) | Isolation (Serializable) | Release all locks held by the aborted transaction. | Throughput Amplification: Lock contention increases latency, may cause deadlocks requiring aborts. |
| Multi-Version Concurrency Control (MVCC) | Isolation (Snapshot-based levels) | Discard versions created by the aborted transaction; no locks to release. | Storage Amplification: Must retain old versions until no active snapshot needs them. |
| Two-Phase Commit (2PC) | Atomicity (Distributed) | Send ABORT decision to all prepared participants; they undo using their local WAL. | Latency Amplification: Two network rounds + blocking risk on coordinator failure. |
Example: Two-Phase Commit Coordinator
The following Python code illustrates a Two-Phase Commit coordinator:
from dataclasses import dataclass
from typing import List, Protocol
# Define interface for participants
class Participant(Protocol):
def prepare(self, transaction_id: str) -> bool: ...
def commit(self, transaction_id: str) -> None: ...
def abort(self, transaction_id: str) -> None: ...
@dataclass
class WALRecord:
transaction_id: str
key: str
old_value: str
new_value: str
# Placeholder exception for network or participant failure
class ParticipantFailure(Exception):
pass
def _write_decision_to_wal(decision: str, transaction_id: str) -> None:
# Simulate writing coordinator's decision to durable log
pass
def _send_abort_to_all(participants: List[Participant], tid: str) -> None:
"""Sends abort decision to all participants who prepared."""
_write_decision_to_wal('abort', tid)
for p in participants:
try:
p.abort(tid)
except ParticipantFailure:
pass # Participant will abort on recovery via WAL.
def two_phase_commit_coordinator(participants: List[Participant]) -> bool:
"""
Modern Python 3.11+ illustrative example of a Two-Phase Commit coordinator.
Assumes each participant has methods: prepare(), commit(), abort().
"""
transaction_id = generate_unique_id()
prepared_participants = []
# --- PHASE 1: PREPARE ---
for participant in participants:
try:
# Ask participant if it can commit.
# Participant must write a 'prepare' record to its durable WAL.
can_commit = participant.prepare(transaction_id)
if not can_commit:
# Participant voted NO. Must abort globally.
_send_abort_to_all(prepared_participants, transaction_id)
return False
prepared_participants.append(participant)
except ParticipantFailure:
# Participant crashed or unreachable. Must abort globally.
_send_abort_to_all(prepared_participants, transaction_id)
return False
# --- PHASE 2: COMMIT (or ABORT) ---
# Coordinator's decision point. Writes its own 'commit' decision to durable log.
_write_decision_to_wal('commit', transaction_id)
commit_ok = True
for participant in prepared_participants:
try:
participant.commit(transaction_id)
except ParticipantFailure:
# Participant may have crashed after preparing.
# It must recover and consult the coordinator's decision.
# The commit is considered potentially successful but retry is needed.
commit_ok = False # Mark for retry
# Coordinator acknowledges success to client, even if some commits need retry.
# The protocol guarantees eventual durability.
return True
The abort safety valve—whether through log truncation, version discarding, or distributed abort messages—is not an edge case but a core invariant enforcement mechanism. It ensures that partial or failed transactions never corrupt the system’s consistency, making atomicity and isolation not aspirations but enforceable guarantees. Invariants are maintained not by preventing failure, but by designing every component to recover from it deterministically.