the negotiated now engineering the illusion of time

Chasing the Truth: NTP and Drift

Chapter 10 of 14
Summary

This section introduces the Network Time Protocol (NTP), the fundamental system for synchronizing computer clocks over networks. It explains the hierarchical stratum model (Stratum 0: atomic/GPS sources, Stratum 1: primary servers, Stratum 2: secondary servers) and the propagation of error through each level. The core challenges of clock synchronization are detailed: clock drift (oscillator imperfections causing gradual divergence), clock skew (instantaneous difference between clocks), and network asymmetry (differing request/reply latencies). The text contrasts two correction methods: slewing (preferred gradual frequency adjustment) versus stepping (disruptive abrupt time jumps). It presents Marzullo's algorithm as NTP's fault-tolerant mechanism for selecting a consensus time from multiple servers by finding the intersection of their confidence intervals and rejecting outliers. The explanation is supported by concrete analogies (heartbeats for oscillators, rubber bands for network latency) and visual diagrams of the stratum hierarchy and interval intersection. Key entities introduced include David L. Mills (NTP inventor) and RFC 5905 (NTPv4 specification).

The Network Time Protocol (NTP) is not merely a tool for setting clocks—it is a philosophical negotiation between the illusion of shared time and the chaotic reality of distributed physics. At its core, NTP orchestrates timing: the fragile, negotiated coordination of events across machines, distinct from time itself, which remains an ungraspable dimension we only ever sample imperfectly. This distinction is crucial. Time does not care if your server logs are out of order; timing does. And in distributed systems, timing is the scaffolding upon which causality—the assurance that cause precedes effect—and simultaneity—the contested claim that two events occurred at the same moment—are artificially, temporarily, constructed.

The NTP Stratum Hierarchy

At the heart of NTP is a hierarchical structure of time sources, known as strata—a pyramid of trust built on diminishing fidelity. Stratum 0 devices are high-precision time sources, such as atomic clocks or GPS receivers, which serve as the primary reference points for the entire network. These are as close as we get to a temporal oracle, though even they are bound by relativity and signal delay. Stratum 1 servers are directly connected to Stratum 0 devices and provide time to the network. Each subsequent stratum introduces not just latency, but epistemic distance: a growing uncertainty about what “now” truly means.

Diagram: NTP Stratum Hierarchy

                    [Stratum 0]
                    (Atomic Clock, GPS)
                         |
                         v
                    [Stratum 1]
          (Primary Time Server, syncs to Stratum 0)
                 /               \
                /                 \
               v                   v
        [Stratum 2]           [Stratum 2]
(Secondary Server, syncs    (Secondary Server, syncs
 to Stratum 1)               to Stratum 1)
        |                           |
        v                           v
    [Clients]                   [Clients]

This hierarchy is less a ladder to truth than a controlled descent into approximation. Yet it enables a shared fiction robust enough to prevent financial transactions from reversing, logs from lying, and security certificates from expiring prematurely.
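
The way stratum and uncertainty grow with each hop can be sketched in a few lines. This is a simplified illustration, not NTP's actual bookkeeping: the `Hop` type, the function names, and the sample delays are hypothetical, and the error bound keeps only half the round-trip delay plus a per-hop dispersion term, loosely modeled on NTP's root delay and root dispersion fields. The limits of 15 (highest usable stratum) and 16 (unsynchronized) follow NTPv4.

```python
from dataclasses import dataclass

MAX_STRATUM = 15     # highest usable stratum in NTPv4
UNSYNCHRONIZED = 16  # value advertised by a host with no valid source


@dataclass
class Hop:
    """One link in a synchronization chain (hypothetical type)."""
    delay: float       # round-trip delay to the upstream server, seconds
    dispersion: float  # estimated error added on this hop, seconds


def downstream_stratum(upstream: int) -> int:
    """Stratum a host advertises after synchronizing to `upstream`."""
    if upstream < 0:
        raise ValueError("stratum cannot be negative")
    if upstream >= MAX_STRATUM:
        return UNSYNCHRONIZED
    return upstream + 1


def error_bound(hops: list[Hop]) -> float:
    """Simplified worst-case error after a chain of hops.

    Sums half of each round-trip delay plus each hop's dispersion.
    """
    return sum(h.delay / 2 + h.dispersion for h in hops)


# Illustrative chain: Stratum 0 -> Stratum 1 -> Stratum 2
chain = [Hop(delay=0.002, dispersion=0.0005),
         Hop(delay=0.020, dispersion=0.002)]

stratum = 0
for _ in chain:
    stratum = downstream_stratum(stratum)

print(f"stratum {stratum}, error bound {error_bound(chain) * 1000:.1f} ms")
```

Under these assumed numbers, the second hop contributes far more uncertainty than the first, which is the "epistemic distance" the hierarchy trades for reach.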

Clock Drift and Skew

One of the primary challenges in maintaining accurate time synchronization is the inescapable reality of clock drift. Clock drift refers to the gradual divergence of a clock’s time from a reference time, due to imperfections in the clock’s oscillator—tiny quartz crystals that age, heat up, and falter like mortal hearts. This drift causes a clock to run faster or slower than the reference, accumulating error like dust on a mirror. Clock skew, in contrast, is the instantaneous difference in time between two clocks at a given moment—a snapshot of their misalignment. Drift is the disease; skew is the symptom.
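
The arithmetic behind drift and skew is simple enough to sketch. Drift is usually quoted in parts per million (ppm); the figures of 20 ppm and -15 ppm below are illustrative assumptions, not measurements, and the function names are hypothetical.

```python
def accumulated_error(drift_ppm: float, elapsed_s: float) -> float:
    """Seconds gained (positive) or lost (negative) after elapsed_s."""
    return drift_ppm * 1e-6 * elapsed_s


def skew(drift_a_ppm: float, drift_b_ppm: float,
         elapsed_s: float, initial_skew_s: float = 0.0) -> float:
    """Difference between clocks A and B after elapsed_s of reference time."""
    return initial_skew_s + accumulated_error(drift_a_ppm - drift_b_ppm,
                                              elapsed_s)


DAY = 86_400  # seconds

# A clock running 20 ppm fast gains about 1.7 seconds per day.
print(f"{accumulated_error(20, DAY):.3f} s")

# Two clocks drifting in opposite directions end the day about 3 seconds apart.
print(f"{skew(20, -15, DAY):.3f} s")
```

Drift is the rate; skew is what that rate has accumulated to by the moment you look.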

Thought Experiment: The Heartbeat and the Rubber Band (A Physical Analogy)

Imagine two servers, Alice and Bob, trying to synchronize their clocks over a network. Each server’s clock is like a heartbeat, with Alice’s heart beating slightly faster than Bob’s. Over time, Alice will think more time has passed than Bob does. When Alice sends a time request to Bob, she attaches a timestamp to a message and sends it down a stretchy rubber band (the network). The message takes time to travel (latency), and Bob replies immediately, sending his timestamp back on another, possibly different, rubber band (network asymmetry). Alice can measure only the round trip, so she assumes each direction took half of it. If one band is stretched longer than the other, that assumption is wrong, and her estimate of Bob’s clock is off by half the difference between the two trips. The rubber bands stretch and relax unpredictably, just like network paths under load, making simultaneity a negotiation, not a fact.
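
The exchange between Alice and Bob corresponds to NTP's four-timestamp calculation: offset = ((t2 - t1) + (t3 - t4)) / 2 and delay = (t4 - t1) - (t3 - t2). The sketch below applies those formulas to a simulated exchange. The `simulate_exchange` helper and its latencies are hypothetical, there only to show how asymmetry leaks into the offset estimate.

```python
def offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """NTP clock offset and round-trip delay from four timestamps.

    t1: client send time      (client clock)
    t2: server receive time   (server clock)
    t3: server send time      (server clock)
    t4: client receive time   (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset: float, forward: float, backward: float,
                      processing: float = 0.0, t1: float = 100.0):
    """Hypothetical helper: build the four timestamps for one exchange.

    true_offset is how far the server clock is ahead of the client clock;
    forward and backward are the one-way latencies in seconds.
    """
    t2 = t1 + forward + true_offset
    t3 = t2 + processing
    t4 = t1 + forward + processing + backward
    return t1, t2, t3, t4


# Symmetric paths: the estimate recovers the true offset of 250 ms.
sym_offset, sym_delay = offset_and_delay(*simulate_exchange(0.250, 0.020, 0.020))

# Asymmetric paths (30 ms out, 10 ms back): same round trip, but the
# estimate is wrong by half the difference, (30 - 10) / 2 = 10 ms.
asym_offset, asym_delay = offset_and_delay(*simulate_exchange(0.250, 0.030, 0.010))

print(f"symmetric:  offset {sym_offset:.3f} s, delay {sym_delay:.3f} s")
print(f"asymmetric: offset {asym_offset:.3f} s, delay {asym_delay:.3f} s")
```

Both exchanges report the same round-trip delay, so nothing Alice can measure tells her which estimate to trust.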

Slew vs. Step

When a clock discovers it is out of sync, it faces a metaphysical choice: slewing or stepping. Slewing adjusts the clock’s frequency gradually, stretching or compressing each second to realign with truth over time. It preserves monotonicity, the illusion that time flows forward smoothly, protecting applications that panic at backward jumps. Stepping, by contrast, is a violent correction: the clock is yanked to the correct time in an instant. It embraces disruptive truth over comforting continuity. The trade-off is stark: slewing maintains the appearance of stable time at the cost of prolonged inaccuracy; stepping restores accuracy at the price of timestamps that can jump or even run backward, so that an effect may appear to precede its cause. NTP slews small offsets and reserves stepping for large ones (the reference ntpd steps only when the offset exceeds 128 ms by default), not because slewing is truer, but because systems prefer a consistent lie to a jarring truth.
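
The choice between the two corrections can be sketched as a threshold test. The constants below are assumptions taken from the commonly documented defaults of the reference ntpd (a 128 ms step threshold and a maximum slew rate of 500 ppm); other implementations and configurations differ, and `plan_correction` is a hypothetical name.

```python
STEP_THRESHOLD_S = 0.128  # assumed: ntpd's default step threshold
MAX_SLEW_PPM = 500.0      # assumed: maximum slew rate in parts per million


def plan_correction(offset_s: float):
    """Return (method, seconds_needed) for correcting a clock offset.

    Offsets within the threshold are slewed at the maximum rate;
    larger offsets are stepped, which takes effect immediately.
    """
    if abs(offset_s) > STEP_THRESHOLD_S:
        return ("step", 0.0)
    duration = abs(offset_s) / (MAX_SLEW_PPM * 1e-6)
    return ("slew", duration)


for offset in (0.010, 0.100, 2.0):
    method, duration = plan_correction(offset)
    print(f"offset {offset:6.3f} s -> {method}, about {duration:.0f} s to converge")
```

At 500 ppm, removing even 100 ms takes a few minutes, which is the "prolonged inaccuracy" that slewing accepts in exchange for continuity.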

Marzullo’s Algorithm

To select a time from a set of possibly faulty sources, NTP employs a refinement of Marzullo’s algorithm (its intersection algorithm), a logic of consensus through interval arithmetic. Each server’s time estimate is treated as a range: estimate ± error bound. The algorithm seeks the smallest interval in which the largest number of ranges overlap. The chosen time is the midpoint of that most densely supported intersection, which effectively silences outliers.

Diagram: Marzullo’s Algorithm Intersection

Imagine three time servers (A, B, C) report their time with error bounds:

  • Server A: 12:00:00.000 ± 0.050 sec (Interval: 11:59:59.950 to 12:00:00.050)
  • Server B: 12:00:05.000 ± 0.100 sec (Interval: 12:00:04.900 to 12:00:05.100) [Faulty/Outlier]
  • Server C: 12:00:00.040 ± 0.030 sec (Interval: 12:00:00.010 to 12:00:00.070)

Plot these as intervals on a number line:

Time Line (not to scale):
A: |----------|
C:         |----------|
B:                          ...      |----------|
   A [11:59:59.950-12:00:00.050]  C [12:00:00.010-12:00:00.070]  B [12:00:04.900-12:00:05.100]

A and C overlap from 12:00:00.010 to 12:00:00.050. This span has a count of 2 (A and C), the highest of any span. B’s interval is disjoint from both, so it is ignored as an outlier, however small its claimed error. The agreed-upon time is the midpoint of the intersection: 12:00:00.030. Marzullo’s algorithm doesn’t find truth; it finds the most defensible compromise.
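
The interval search can be written as a single sweep over sorted endpoints. The following is a minimal sketch of the classic form of Marzullo's algorithm, not NTP's production selection code. The intervals are illustrative offsets in seconds from 12:00:00, chosen so that two sources overlap and one does not.

```python
def marzullo(intervals):
    """Return (count, lo, hi): the span covered by the most intervals.

    intervals is a list of (lo, hi) pairs. If several spans tie for the
    highest count, the earliest one is returned.
    """
    edges = []
    for lo, hi in intervals:
        edges.append((lo, -1))  # -1 marks the start of an interval
        edges.append((hi, +1))  # +1 marks the end of an interval
    # At equal offsets, starts (-1) sort before ends (+1), so intervals
    # that merely touch still count as overlapping.
    edges.sort()

    best = count = 0
    best_lo = best_hi = None
    for i, (offset, kind) in enumerate(edges):
        count -= kind  # a start raises the count, an end lowers it
        if count > best:
            best = count
            best_lo = offset
            best_hi = edges[i + 1][0]  # the next edge closes this span
    return best, best_lo, best_hi


sources = [
    (-0.050, 0.050),  # A: 0.000 +/- 0.050
    (4.900, 5.100),   # B: 5.000 +/- 0.100 (outlier)
    (0.010, 0.070),   # C: 0.040 +/- 0.030
]

count, lo, hi = marzullo(sources)
midpoint = (lo + hi) / 2
print(f"{count} sources agree on [{lo:.3f}, {hi:.3f}], midpoint {midpoint:.3f}")
```

B never raises the overlap count above 1, so the sweep never selects it. Its narrow error bound buys it nothing once it disagrees with the majority.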

Conclusion

NTP is not a clockmaker’s tool but a diplomat of time. It does not deliver precision, stability, and simplicity all at once—no system can. Instead, it negotiates among them: sacrificing perfect accuracy for monotonicity, rejecting absolute truth for consensus, and replacing simultaneity with a collectively agreed-upon approximation. In doing so, NTP reveals a deeper truth: in distributed systems, time is not found, it is forged—a fragile, functional fiction, held together by rubber bands, heartbeats, and the quiet agreement to pretend we’re in sync.