The Simplest Database: A File, an Append, and a Linear Scan

Every database you have ever used is a file on disk with a strategy for writing to it and a strategy for reading from it. PostgreSQL, Redis, Kafka, RocksDB. Files on disk. The strategies differ. The foundation does not.

This chapter builds the simplest possible database: a single file, an append for every write, and a linear scan for every read. It is a terrible database. It is also the starting point that makes every subsequent storage engine in this book understandable, because every one of them is an optimization of this exact design, and every one of them preserves the append-only write pattern in some form.

The logistics platform needs a package tracking store. Every time a package changes status (scanned at warehouse, loaded on truck, out for delivery, delivered), the system records the event. The simplest version of this store is a file.

The Append-Only Log

Open a file. Append a line. Close the file. That is a write.

// Concept: append-only write to a flat file
// No buffering, no batching, no index. One write per event.
void recordEvent(Path dbFile, String packageId, String status, Instant timestamp)
        throws IOException {
    String line = packageId + "," + status + "," + timestamp + "\n";
    Files.writeString(dbFile, line, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
}

This is not a toy simplification. This is the core write path of every write-ahead log in every database covered in this book. PostgreSQL’s WAL appends records to a segment file. Kafka appends messages to a partition log. RocksDB appends entries to its WAL before writing to the memtable. The mechanics around the append differ. The append itself does not.

Why append? Two reasons.

Sequential writes to disk are fast. An NVMe SSD can sustain 3,000 MB/s sequential write throughput. Random writes to the same drive drop to a fraction of that, because each write requires the controller to locate a new block, update the flash translation layer, and potentially trigger garbage collection. On spinning disks, the difference is even more extreme: sequential writes avoid seek time entirely.

Appending is crash-safe when combined with fsync. If the process crashes mid-write, the partial line is at the end of the file. The data before it is intact. An in-place update, by contrast, can corrupt existing data if the process crashes after writing half the record. This is why the append-only pattern survived from the earliest Unix systems into every modern storage engine.

The Linear Scan

Reading is where this design pays its cost. To find the current status of package PKG-40291, scan the entire file from the beginning, keeping track of the last entry for that package ID.

// Concept: linear scan read from a flat file
// O(n) in the number of total records, not the number of packages
String getLatestStatus(Path dbFile, String packageId) throws IOException {
    String latestStatus = null;
    try (var reader = Files.newBufferedReader(dbFile)) {
        String line;
        while ((line = reader.readLine()) != null) {
            String[] parts = line.split(",", 3);
            if (parts[0].equals(packageId)) {
                latestStatus = parts[1];
            }
        }
    }
    return latestStatus;
}

This scan reads every line in the file. If the file contains 10 million events across 500,000 packages, finding the status of one package reads all 10 million lines. The scan does not stop at the first match because the file is append-only: the latest status is the last entry for that package, not the first.

Where This Breaks

The logistics platform processes 2,000 package events per minute during peak hours. After one day, the file contains approximately 2.8 million lines. After one week, 20 million.

A linear scan of 20 million lines takes measurable time. On a machine with the file cached in the page cache, a sequential read of 20 million short lines takes roughly 3-5 seconds. When the file exceeds available memory and the OS must read from disk, the scan takes 15-30 seconds.

The warehouse dashboard queries package status on every page load. At 3 seconds per query, the dashboard is unusable. At 15 seconds, the operations team switches to phone calls.

The write path is fine. Appending one line to a file takes microseconds. The read path is what kills this design, and the read path is what every subsequent chapter in this book optimizes.

The append-only log structure showing sequential writes and full-file scan reads

The diagram shows the fundamental asymmetry of the append-only log. Writes land at the tail of the file in constant time. Reads must traverse from the head through every record to find the target. This asymmetry is the central tension of storage engine design, and every indexing strategy in Chapter 2 exists to resolve it.

The Pattern That Persists

This append-only file is not a dead end. It is the pattern that every storage system in this book builds on:

PostgreSQL’s WAL is an append-only log of changes, flushed to disk before the actual table pages are updated. If the server crashes, the WAL is replayed to recover committed transactions.

Kafka’s partition is an append-only log of messages. Consumers read by offset, scanning forward from their last known position. The log is the database.

RocksDB writes every mutation to a WAL before inserting into the in-memory memtable. When the memtable flushes to disk as an SSTable, the WAL segment can be discarded.

The next chapter introduces indexing: the price you pay to read fast. Every index is a secondary data structure that trades write overhead and storage space for faster reads. The append-only log remains the write path. The index is what you build alongside it.