Designing Covering Indexes for Common Query Patterns
Designing Covering Indexes for Common Query Patterns
The Symptom
The sensor list API returns sensor IDs and their latest reading timestamps. This endpoint is called 5,000 times per minute by the dashboard’s auto-refresh. Each call queries the readings collection to find the most recent timestamp per sensor. The p95 latency is 35ms, and the WiredTiger cache hit rate drops to 82% during peak dashboard hours because the FETCH stage loads full 340-byte documents into cache when the response only needs two fields (12 bytes).
The Cause
The query uses the existing {sensorId: 1, ts: -1} index but projects sensorId and ts without excluding _id:
// SLOW: Not covered because _id is included by default
List<Document> results = collection.find(Filters.eq("sensorId", sensorId))
.sort(Sorts.descending("ts"))
.projection(Projections.include("sensorId", "ts"))
.limit(1)
.into(new ArrayList<>());
MongoDB includes _id in projections by default. Since _id is not part of the {sensorId: 1, ts: -1} index, the query requires a FETCH.
The Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 3, time = 5)
@Measurement(iterations = 5, time = 10)
@Fork(1)
@State(Scope.Benchmark)
public class CoveredQueryBenchmark {
private MongoCollection<Document> collection;
private List<String> sensorIds;
@Setup
public void setup() {
MongoClient client = MongoClients.create("mongodb://localhost:27017");
collection = client.getDatabase("telemetry").getCollection("readings");
sensorIds = IntStream.range(0, 100)
.mapToObj(i -> String.format("sensor-%05d", i))
.toList();
}
@Benchmark
public List<Document> notCovered_withId() {
String sensorId = sensorIds.get(ThreadLocalRandom.current().nextInt(sensorIds.size()));
return collection.find(Filters.eq("sensorId", sensorId))
.sort(Sorts.descending("ts"))
.projection(Projections.include("sensorId", "ts"))
.limit(1)
.into(new ArrayList<>());
}
@Benchmark
public List<Document> covered_excludeId() {
String sensorId = sensorIds.get(ThreadLocalRandom.current().nextInt(sensorIds.size()));
return collection.find(Filters.eq("sensorId", sensorId))
.sort(Sorts.descending("ts"))
.projection(Projections.fields(
Projections.include("sensorId", "ts"),
Projections.excludeId()
))
.limit(1)
.into(new ArrayList<>());
}
@Benchmark
public List<Document> covered_withTempField() {
// Uses wider covering index {sensorId: 1, ts: -1, temperature: 1}
String sensorId = sensorIds.get(ThreadLocalRandom.current().nextInt(sensorIds.size()));
return collection.find(Filters.eq("sensorId", sensorId))
.sort(Sorts.descending("ts"))
.projection(Projections.fields(
Projections.include("sensorId", "ts", "temperature"),
Projections.excludeId()
))
.limit(1)
.into(new ArrayList<>());
}
}
Results:
Benchmark Mode Cnt Score Error Units
CoveredQueryBenchmark.notCovered_withId avgt 5 320.000 ± 25.000 us/op
CoveredQueryBenchmark.covered_excludeId avgt 5 120.000 ± 10.000 us/op
CoveredQueryBenchmark.covered_withTempField avgt 5 135.000 ± 12.000 us/op
The covered query is 2.7x faster. Adding temperature to the covered projection adds only 15us because the ESR index {sensorId: 1, ts: -1, temperature: 1} from CH10 already includes it.
The Fix
For the top 5 queries, ensure coverage:
Query 1: Latest timestamp per sensor (5,000 calls/min)
// FAST: Covered by {sensorId: 1, ts: -1}
collection.find(Filters.eq("sensorId", sensorId))
.sort(Sorts.descending("ts"))
.projection(Projections.fields(
Projections.include("sensorId", "ts"),
Projections.excludeId()
))
.limit(1);
Query 2: Sensor reading count per hour (500 calls/min)
// FAST: Covered by {sensorId: 1, ts: -1}
collection.countDocuments(Filters.and(
Filters.eq("sensorId", sensorId),
Filters.gte("ts", Date.from(hourStart)),
Filters.lt("ts", Date.from(hourEnd))
));
// countDocuments() is automatically covered when filter fields match an index
Query 3: Alert check, temperature threshold (200 calls/min)
// FAST: Covered by {sensorId: 1, ts: -1, temperature: 1}
collection.find(Filters.and(
Filters.eq("sensorId", sensorId),
Filters.gt("temperature", threshold)
)).sort(Sorts.descending("ts"))
.projection(Projections.fields(
Projections.include("sensorId", "ts", "temperature"),
Projections.excludeId()
))
.limit(10);
The Proof
| Query | Before (with FETCH) | After (covered) | Improvement |
|---|---|---|---|
| Latest timestamp | 320us | 120us | 2.7x |
| Count per hour | 1,200us | 450us | 2.7x |
| Alert check | 850us | 280us | 3.0x |
| WiredTiger cache hit rate | 82% | 94% | +12 points |
Cache hit rate improved because covered queries do not load documents into cache, leaving more room for the documents that actually need to be fetched.
The Trade-off
Covering indexes are wider (more fields), which means more storage and more memory consumption. The {sensorId: 1, ts: -1, temperature: 1} index is 33% larger than {sensorId: 1, ts: -1}. For 200 million documents, that is approximately 2.5 GB of additional index storage. This index must fit in the WiredTiger cache to be effective. If the covering index itself exceeds available cache, the covered query will trigger cache evictions for the index pages, negating the benefit.
The rule: create covering indexes only for high-frequency queries (thousands of calls per minute) where the covered fields are small. Do not create a covering index that includes String fields with high cardinality (long values) or multiple double fields. The index size will explode.