$lookup Performance: Measuring the Join Tax
$lookup Performance: Measuring the Join Tax
The Symptom
The dashboard endpoint that shows sensor readings with metadata has a p95 latency of 280ms. The MongoDB explain output shows the $lookup stage consuming 65% of the total execution time. The sensor metadata collection has 10,000 documents with an index on _id. The lookup should be fast. Why is it not?
The Cause
MongoDB $lookup behavior depends on the result set size. For each document in the pipeline, the $lookup stage performs a lookup against the foreign collection. With 500 documents in the pipeline, that is 500 lookups. MongoDB 5.0 introduced hash join optimization for equality-based lookups, which builds a hash table from the foreign collection in memory. But the hash table is rebuilt for each aggregation execution, not cached across queries.
// explain output for the $lookup aggregation
{
"stages": [
{
"$cursor": {
"executionStats": {
"nReturned": 500,
"executionTimeMillisEstimate": 15
}
}
},
{
"$lookup": {
"executionStats": {
"nReturned": 500,
"executionTimeMillisEstimate": 180
}
}
},
{
"$unwind": {
"executionStats": {
"executionTimeMillisEstimate": 5
}
}
}
]
}
The $cursor stage (finding the readings) takes 15ms. The $lookup stage takes 180ms. 92% of the query time is the join.
The Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 2, time = 10)
@Measurement(iterations = 3, time = 15)
@Fork(1)
@State(Scope.Benchmark)
public class LookupVsDenormalizedBenchmark {
private MongoCollection<Document> normalizedReadings;
private MongoCollection<Document> denormalizedReadings;
@Param({"10", "50", "100", "500", "1000"})
private int resultSize;
@Setup
public void setup() {
MongoClient client = MongoClients.create("mongodb://localhost:27017");
var db = client.getDatabase("telemetry");
normalizedReadings = db.getCollection("readings_normalized");
denormalizedReadings = db.getCollection("readings_denormalized");
}
@Benchmark
public List<Document> withLookup() {
return normalizedReadings.aggregate(List.of(
Aggregates.match(Filters.eq("sensorId", "sensor-00042")),
Aggregates.sort(Sorts.descending("ts")),
Aggregates.limit(resultSize),
Aggregates.lookup("sensors", "sensorId", "_id", "sensor"),
Aggregates.unwind("$sensor"),
Aggregates.project(Projections.fields(
Projections.include("sensorId", "ts", "temperature"),
Projections.computed("building", "$sensor.building"),
Projections.computed("floor", "$sensor.floor")
))
)).into(new ArrayList<>());
}
@Benchmark
public List<Document> denormalized() {
return denormalizedReadings.find(
Filters.eq("sensorId", "sensor-00042")
).sort(Sorts.descending("ts"))
.limit(resultSize)
.projection(Projections.include("sensorId", "ts", "temperature",
"sensor.building", "sensor.floor"))
.into(new ArrayList<>());
}
}
Results:
Benchmark (resultSize) Mode Cnt Score Error Units
LookupVsDenormalizedBenchmark.withLookup 10 avgt 3 12.000 ± 2.000 ms/op
LookupVsDenormalizedBenchmark.withLookup 50 avgt 3 35.000 ± 4.000 ms/op
LookupVsDenormalizedBenchmark.withLookup 100 avgt 3 68.000 ± 8.000 ms/op
LookupVsDenormalizedBenchmark.withLookup 500 avgt 3 280.000 ± 20.000 ms/op
LookupVsDenormalizedBenchmark.withLookup 1000 avgt 3 520.000 ± 35.000 ms/op
LookupVsDenormalizedBenchmark.denormalized 10 avgt 3 2.000 ± 0.300 ms/op
LookupVsDenormalizedBenchmark.denormalized 50 avgt 3 4.000 ± 0.500 ms/op
LookupVsDenormalizedBenchmark.denormalized 100 avgt 3 6.000 ± 0.800 ms/op
LookupVsDenormalizedBenchmark.denormalized 500 avgt 3 15.000 ± 2.000 ms/op
LookupVsDenormalizedBenchmark.denormalized 1000 avgt 3 28.000 ± 3.000 ms/op
At 500 results, $lookup is 18.7x slower. The ratio increases with result size because each additional document adds a lookup operation. The denormalized approach scales linearly with I/O.
The Fix
Denormalize the sensor metadata into reading documents at write time:
// FAST: Embed sensor metadata at write time
public void ingestReading(SensorReading reading) {
// Cache sensor metadata (refreshed every 5 minutes)
Document sensorMeta = sensorMetadataCache.get(reading.getSensorId());
Document doc = new Document()
.append("sensorId", reading.getSensorId())
.append("ts", Date.from(reading.getTimestamp()))
.append("temperature", reading.getTemperature())
.append("humidity", reading.getHumidity())
.append("pressure", reading.getPressure())
.append("sensor", new Document()
.append("building", sensorMeta.getString("building"))
.append("floor", sensorMeta.getInteger("floor"))
.append("zone", sensorMeta.getString("zone"))
.append("type", sensorMeta.getString("type"))
);
collection.insertOne(doc);
}
The Proof
After denormalization:
| Metric | $lookup | Denormalized |
|---|---|---|
| Dashboard p50 (500 docs) | 145ms | 10ms |
| Dashboard p95 (500 docs) | 280ms | 18ms |
| Storage per reading | 340 bytes | 420 bytes |
| Storage overhead | Baseline | +23.5% |
| Sensor metadata update cost | Zero (single source) | Bulk update all readings |
The Trade-off
The 23.5% storage increase from embedding 80 bytes of sensor metadata in every reading adds up. For 200 million readings per month, that is 16 GB of additional storage. For the telemetry platform with a 2 TB storage budget, 16 GB is negligible.
The real cost is consistency. If sensor-00042 moves from Floor 3 to Floor 5, historical readings still show Floor 3. This may be correct (the reading was taken when the sensor was on Floor 3) or incorrect (the dashboard should show current location). If historical accuracy matters, denormalization is the correct choice. If current-state accuracy matters, use $lookup or maintain a version field in the embedded metadata and update asynchronously.