Maintaining Consistency in Denormalized Schemas

The Symptom

The sensor metadata for sensor-00042 is updated: the sensor moved from Building HQ, Floor 3 to Building B2, Floor 7. The update to the sensors collection takes 2ms. But 200 million reading documents still embed the old metadata. The dashboard shows sensor-00042 in Building HQ, Floor 3 because the readings collection is not yet updated.

The Cause

Denormalization pushes the consistency burden from read time to write time. When the source of truth changes, all copies must be updated. Without a propagation mechanism, copies become stale.

The Benchmark

Measure the cost of different consistency propagation strategies:

@BenchmarkMode(Mode.SingleShotTime)
@OutputTimeUnit(TimeUnit.SECONDS)
@Fork(1)
@State(Scope.Benchmark)
public class ConsistencyPropagationBenchmark {

    private MongoCollection<Document> readings;
    private int documentsToUpdate;

    @Param({"10000", "100000", "1000000"})
    private int matchingDocuments;

    @Setup
    public void setup() {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        readings = client.getDatabase("telemetry").getCollection("readings");
        documentsToUpdate = matchingDocuments;
    }

    @Benchmark
    public long singleUpdateMany() {
        UpdateResult result = readings.updateMany(
            Filters.eq("sensorId", "sensor-00042"),
            Updates.combine(
                Updates.set("sensor.building", "B2"),
                Updates.set("sensor.floor", 7)
            )
        );
        return result.getModifiedCount();
    }

    @Benchmark
    public long batchedBulkWrite() {
        long modified = 0;
        int batchSize = 10000;

        while (true) {
            List<Document> batch = readings.find(
                Filters.and(
                    Filters.eq("sensorId", "sensor-00042"),
                    Filters.ne("sensor.building", "B2")
                )
            ).limit(batchSize)
             .projection(Projections.include("_id"))
             .into(new ArrayList<>());

            if (batch.isEmpty()) break;

            List<WriteModel<Document>> writes = batch.stream()
                .map(doc -> new UpdateOneModel<Document>(
                    Filters.eq("_id", doc.get("_id")),
                    Updates.combine(
                        Updates.set("sensor.building", "B2"),
                        Updates.set("sensor.floor", 7),
                        Updates.set("sensor.metaVersion", 2)
                    )
                ))
                .toList();

            BulkWriteResult result = readings.bulkWrite(writes,
                new BulkWriteOptions().ordered(false));
            modified += result.getModifiedCount();
        }
        return modified;
    }
}

Results:

Benchmark                                 (matchingDocuments)   Mode  Cnt     Score   Error  Units
ConsistencyPropagationBenchmark.singleUpdateMany       10000  ss       1     0.800           s/op
ConsistencyPropagationBenchmark.singleUpdateMany      100000  ss       1     8.200           s/op
ConsistencyPropagationBenchmark.singleUpdateMany     1000000  ss       1    82.000           s/op
ConsistencyPropagationBenchmark.batchedBulkWrite       10000  ss       1     1.200           s/op
ConsistencyPropagationBenchmark.batchedBulkWrite      100000  ss       1    10.500           s/op
ConsistencyPropagationBenchmark.batchedBulkWrite     1000000  ss       1    95.000           s/op

updateMany is faster than batched bulk writes because it avoids the per-document _id lookup overhead. But updateMany holds a write lock for the entire operation. For 1 million documents, that is 82 seconds of blocked writes to that collection.

The Fix

Three strategies, chosen based on staleness tolerance.

Strategy 1: Version-based lazy update (minutes of staleness tolerated).

Add a version number to the embedded metadata. When reading, check the version against the source of truth. If stale, update on read:

// FAST: Lazy consistency with version check
public Document getReadingWithCurrentMetadata(String readingId) {
    Document reading = readings.find(Filters.eq("_id", readingId)).first();
    String sensorId = reading.getString("sensorId");

    // Check version against cached current version
    int embeddedVersion = reading.getEmbedded(
        List.of("sensor", "metaVersion"), Integer.class, 0);
    int currentVersion = sensorMetadataVersionCache.get(sensorId);

    if (embeddedVersion < currentVersion) {
        Document currentMeta = sensorMetadataCache.get(sensorId);
        readings.updateOne(
            Filters.eq("_id", readingId),
            Updates.set("sensor", currentMeta)
        );
        reading.put("sensor", currentMeta);
    }

    return reading;
}

This spreads the update cost across read operations. Hot documents (frequently read) are updated quickly. Cold documents (rarely read) stay stale but do not consume update resources.

Strategy 2: Background bulk propagation (hours of staleness tolerated).

Run a background job that propagates metadata changes in batches:

// FAST: Background propagation with rate limiting
public void propagateMetadataChange(String sensorId, Document newMetadata) {
    int batchSize = 5000;
    long sleepBetweenBatches = 100; // ms, to avoid saturating I/O

    while (true) {
        UpdateResult result = readings.updateMany(
            Filters.and(
                Filters.eq("sensorId", sensorId),
                Filters.lt("sensor.metaVersion", newMetadata.getInteger("metaVersion"))
            ),
            Updates.set("sensor", newMetadata),
            new UpdateOptions().hint(new Document("sensorId", 1).append("ts", -1))
        );

        if (result.getModifiedCount() == 0) break;

        // Rate limit to avoid overwhelming the server
        try { Thread.sleep(sleepBetweenBatches); } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            break;
        }
    }
}

Strategy 3: Targeted denormalization (zero staleness required).

Only denormalize fields that never change. Sensor type (“temperature”) does not change. Building and floor might. Denormalize the immutable fields, use $lookup for the mutable ones:

// FAST: Partial denormalization of immutable fields
Document doc = new Document()
    .append("sensorId", reading.getSensorId())
    .append("ts", Date.from(reading.getTimestamp()))
    .append("temperature", reading.getTemperature())
    .append("sensorType", sensorMeta.getString("type"));  // Immutable: safe to embed
    // building and floor: NOT embedded, fetched via $lookup when needed

The Proof

Strategy	Staleness	Write overhead	Read overhead	Complexity
No denormalization ($lookup)	Zero	None	18.7x slower (CH9-S1)	Low
Full denormalization + updateMany	82s during propagation	High during updates	None	Medium
Version-based lazy update	Seconds (on next read)	Amortized across reads	Version check per read	Medium
Background propagation	Minutes to hours	Spread over time	None	Medium
Partial denormalization	Zero for immutable fields	None for immutable fields	$lookup for mutable fields	Low

The Trade-off

Sensor metadata changes are rare: a sensor might be relocated once per year. For rare changes, full denormalization with background propagation is the pragmatic choice. The 82-second propagation window is acceptable when it happens once a year.

For data that changes frequently (user profile names embedded in activity feeds, product prices embedded in order history), partial denormalization is safer. Embed the fields that never change (user ID, product SKU) and reference the fields that do (display name, current price).

The decision framework from the chapter introduction applies here. If reads outnumber writes by 100:1 or more, and the denormalized data changes less than once per day, full denormalization with background propagation is almost always the right choice for MongoDB.