Designing Covering Indexes for Common Query Patterns

The Symptom

The sensor list API returns sensor IDs and their latest reading timestamps. This endpoint is called 5,000 times per minute by the dashboard’s auto-refresh. Each call queries the readings collection to find the most recent timestamp per sensor. The p95 latency is 35ms, and the WiredTiger cache hit rate drops to 82% during peak dashboard hours because the FETCH stage loads full 340-byte documents into cache when the response only needs two fields (12 bytes).

The Cause

The query uses the existing {sensorId: 1, ts: -1} index but projects sensorId and ts without excluding _id:

// SLOW: Not covered because _id is included by default
List<Document> results = collection.find(Filters.eq("sensorId", sensorId))
    .sort(Sorts.descending("ts"))
    .projection(Projections.include("sensorId", "ts"))
    .limit(1)
    .into(new ArrayList<>());

MongoDB includes _id in projections by default. Since _id is not part of the {sensorId: 1, ts: -1} index, the query requires a FETCH.

The Benchmark

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 3, time = 5)
@Measurement(iterations = 5, time = 10)
@Fork(1)
@State(Scope.Benchmark)
public class CoveredQueryBenchmark {

    private MongoCollection<Document> collection;
    private List<String> sensorIds;

    @Setup
    public void setup() {
        MongoClient client = MongoClients.create("mongodb://localhost:27017");
        collection = client.getDatabase("telemetry").getCollection("readings");
        sensorIds = IntStream.range(0, 100)
            .mapToObj(i -> String.format("sensor-%05d", i))
            .toList();
    }

    @Benchmark
    public List<Document> notCovered_withId() {
        String sensorId = sensorIds.get(ThreadLocalRandom.current().nextInt(sensorIds.size()));
        return collection.find(Filters.eq("sensorId", sensorId))
            .sort(Sorts.descending("ts"))
            .projection(Projections.include("sensorId", "ts"))
            .limit(1)
            .into(new ArrayList<>());
    }

    @Benchmark
    public List<Document> covered_excludeId() {
        String sensorId = sensorIds.get(ThreadLocalRandom.current().nextInt(sensorIds.size()));
        return collection.find(Filters.eq("sensorId", sensorId))
            .sort(Sorts.descending("ts"))
            .projection(Projections.fields(
                Projections.include("sensorId", "ts"),
                Projections.excludeId()
            ))
            .limit(1)
            .into(new ArrayList<>());
    }

    @Benchmark
    public List<Document> covered_withTempField() {
        // Uses wider covering index {sensorId: 1, ts: -1, temperature: 1}
        String sensorId = sensorIds.get(ThreadLocalRandom.current().nextInt(sensorIds.size()));
        return collection.find(Filters.eq("sensorId", sensorId))
            .sort(Sorts.descending("ts"))
            .projection(Projections.fields(
                Projections.include("sensorId", "ts", "temperature"),
                Projections.excludeId()
            ))
            .limit(1)
            .into(new ArrayList<>());
    }
}

Results:

Benchmark                                       Mode  Cnt    Score    Error  Units
CoveredQueryBenchmark.notCovered_withId         avgt    5  320.000 ± 25.000  us/op
CoveredQueryBenchmark.covered_excludeId         avgt    5  120.000 ± 10.000  us/op
CoveredQueryBenchmark.covered_withTempField     avgt    5  135.000 ± 12.000  us/op

The covered query is 2.7x faster. Adding temperature to the covered projection adds only 15us because the ESR index {sensorId: 1, ts: -1, temperature: 1} from CH10 already includes it.

The Fix

For the top 5 queries, ensure coverage:

Query 1: Latest timestamp per sensor (5,000 calls/min)

// FAST: Covered by {sensorId: 1, ts: -1}
collection.find(Filters.eq("sensorId", sensorId))
    .sort(Sorts.descending("ts"))
    .projection(Projections.fields(
        Projections.include("sensorId", "ts"),
        Projections.excludeId()
    ))
    .limit(1);

Query 2: Sensor reading count per hour (500 calls/min)

// FAST: Covered by {sensorId: 1, ts: -1}
collection.countDocuments(Filters.and(
    Filters.eq("sensorId", sensorId),
    Filters.gte("ts", Date.from(hourStart)),
    Filters.lt("ts", Date.from(hourEnd))
));
// countDocuments() is automatically covered when filter fields match an index

Query 3: Alert check, temperature threshold (200 calls/min)

// FAST: Covered by {sensorId: 1, ts: -1, temperature: 1}
collection.find(Filters.and(
    Filters.eq("sensorId", sensorId),
    Filters.gt("temperature", threshold)
)).sort(Sorts.descending("ts"))
  .projection(Projections.fields(
      Projections.include("sensorId", "ts", "temperature"),
      Projections.excludeId()
  ))
  .limit(10);

The Proof

Query	Before (with FETCH)	After (covered)	Improvement
Latest timestamp	320us	120us	2.7x
Count per hour	1,200us	450us	2.7x
Alert check	850us	280us	3.0x
WiredTiger cache hit rate	82%	94%	+12 points

Cache hit rate improved because covered queries do not load documents into cache, leaving more room for the documents that actually need to be fetched.

The Trade-off

Covering indexes are wider (more fields), which means more storage and more memory consumption. The {sensorId: 1, ts: -1, temperature: 1} index is 33% larger than {sensorId: 1, ts: -1}. For 200 million documents, that is approximately 2.5 GB of additional index storage. This index must fit in the WiredTiger cache to be effective. If the covering index itself exceeds available cache, the covered query will trigger cache evictions for the index pages, negating the benefit.

The rule: create covering indexes only for high-frequency queries (thousands of calls per minute) where the covered fields are small. Do not create a covering index that includes String fields with high cardinality (long values) or multiple double fields. The index size will explode.