BSON Allocation Profiling and Custom Codecs for Hot Paths
The Symptom
The telemetry ingestion service exhibits 150ms GC pauses every 30 seconds under sustained 10,000 reads/sec load. The pauses correlate exactly with young generation collections. Async-profiler flame graphs show 62% of allocations originating from org.bson.codecs.DocumentCodec.decode().
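To check the correlation with young generation collections from inside the JVM, the standard management beans expose a running collection count and accumulated collection time per collector. The sketch below is a minimal illustration, not part of the service: GcSnapshot is a hypothetical helper name, and the collector names it prints depend on the GC in use (under G1 the young collector typically appears as "G1 Young Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: snapshots the JVM's per-collector GC counters.
final class GcSnapshot {

    // One line per collector: name, collections so far, accumulated collection time in ms.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

Logging this snapshot on a fixed interval next to request latency makes it easy to see whether each latency spike lines up with an increment of the young collector's count.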
The Cause
The default DocumentCodec creates a Document (backed by LinkedHashMap) for every result. Each entry in the map requires a Map.Entry node, a String key, and a value stored as an Object. Numeric fields are boxed: double primitives become Double objects. Dates become java.util.Date objects wrapping the epoch milliseconds. None of this allocation is necessary if you control the deserialization.
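The boxing described above can be reproduced without the driver. The following is a stdlib-only sketch that mimics the shape of a decoded Document with a plain LinkedHashMap and contrasts it with a record holding primitives. BoxingSketch, Reading, and asMap are hypothetical names for illustration, and only two fields are modeled.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: map-of-Object storage versus primitive record fields.
final class BoxingSketch {

    // Typed target: the long and the double live inline in a single object.
    record Reading(long timestampMillis, double temperature) {}

    // Mimics a decoded Document: every value is stored behind an Object reference.
    static Map<String, Object> asMap(long timestampMillis, double temperature) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("ts", new Date(timestampMillis)); // a Date object plus a map entry
        doc.put("temp", temperature);             // autoboxed to a Double plus a map entry
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = asMap(1_700_000_000_000L, 21.5);
        System.out.println("temp is boxed: " + (doc.get("temp") instanceof Double));
        System.out.println("ts is a Date:  " + (doc.get("ts") instanceof Date));

        Reading reading = new Reading(1_700_000_000_000L, 21.5);
        System.out.println("record field:  " + reading.temperature());
    }
}
```

The map version allocates the map itself, its internal table, one entry per field, and one wrapper object per numeric or date value. The record version allocates one object.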
The Benchmark
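import com.mongodb.MongoClientSettings;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.Filters;
import org.bson.Document;
import org.bson.codecs.configuration.CodecRegistries;
import org.bson.codecs.configuration.CodecRegistry;
import org.bson.conversions.Bson;
import org.openjdk.jmh.annotations.*;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 3, time = 5)
@Measurement(iterations = 5, time = 10)
@Fork(value = 1, jvmArgs = {"-Xmx2g", "-XX:+UseG1GC"})
@State(Scope.Benchmark)
public class BsonDeserializationBenchmark {
    private MongoClient client;
    private MongoCollection<Document> documentCollection;
    private MongoCollection<TelemetryReading> codecCollection;
    private Bson filter;

    @Setup
    public void setup() {
        client = MongoClients.create("mongodb://localhost:27017");
        MongoDatabase db = client.getDatabase("telemetry");
        // Baseline: the same collection decoded into generic Document instances
        documentCollection = db.getCollection("readings");
        // Custom codec goes first so it takes precedence over the defaults
        CodecRegistry codecRegistry = CodecRegistries.fromRegistries(
            CodecRegistries.fromCodecs(new TelemetryReadingCodec()),
            MongoClientSettings.getDefaultCodecRegistry()
        );
        codecCollection = db.getCollection("readings", TelemetryReading.class)
            .withCodecRegistry(codecRegistry);
        filter = Filters.eq("sensorId", "sensor-00042");
    }

    @TearDown
    public void tearDown() {
        client.close(); // release the driver's connections and threads
    }

    @Benchmark
    public List<Document> defaultDocumentCodec() {
        return documentCollection.find(filter).limit(100).into(new ArrayList<>());
    }

    @Benchmark
    public List<TelemetryReading> customCodec() {
        return codecCollection.find(filter).limit(100).into(new ArrayList<>());
    }
}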
The custom codec avoids the LinkedHashMap entirely:
import org.bson.BsonReader;
import org.bson.BsonType;
import org.bson.BsonWriter;
import org.bson.codecs.Codec;
import org.bson.codecs.DecoderContext;
import org.bson.codecs.EncoderContext;
import java.time.Instant;

// FAST: Custom Codec that deserializes directly to a record
public class TelemetryReadingCodec implements Codec<TelemetryReading> {
    @Override
    public TelemetryReading decode(BsonReader reader, DecoderContext context) {
        reader.readStartDocument();
        // Fields absent from the document keep these defaults
        String sensorId = null;
        long timestamp = 0;
        double temperature = 0;
        double humidity = 0;
        double pressure = 0;
        while (reader.readBsonType() != BsonType.END_OF_DOCUMENT) {
            String fieldName = reader.readName();
            switch (fieldName) {
                case "sensorId" -> sensorId = reader.readString();
                case "ts" -> timestamp = reader.readDateTime(); // epoch millis as a primitive long
                case "temp" -> temperature = reader.readDouble();
                case "humidity" -> humidity = reader.readDouble();
                case "pressure" -> pressure = reader.readDouble();
                default -> reader.skipValue(); // unknown fields, including _id
            }
        }
        reader.readEndDocument();
        return new TelemetryReading(
            sensorId,
            Instant.ofEpochMilli(timestamp),
            temperature,
            humidity,
            pressure
        );
    }

    @Override
    public void encode(BsonWriter writer, TelemetryReading value, EncoderContext context) {
        writer.writeStartDocument();
        writer.writeString("sensorId", value.sensorId());
        writer.writeDateTime("ts", value.timestamp().toEpochMilli());
        writer.writeDouble("temp", value.temperature());
        writer.writeDouble("humidity", value.humidity());
        writer.writeDouble("pressure", value.pressure());
        writer.writeEndDocument();
    }

    @Override
    public Class<TelemetryReading> getEncoderClass() {
        return TelemetryReading.class;
    }
}
// The target type: a Java record with primitive-friendly types
public record TelemetryReading(
    String sensorId,
    Instant timestamp,
    double temperature,
    double humidity,
    double pressure
) {}
JMH results:
Benchmark                                          Mode  Cnt    Score    Error  Units
BsonDeserializationBenchmark.defaultDocumentCodec  avgt    5  427.000 ± 18.000  us/op
BsonDeserializationBenchmark.customCodec           avgt    5  112.000 ±  5.000  us/op
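Average-time mode reports latency only, so allocation has to be measured separately. JMH's GC profiler (-prof gc) is one option. Another, sketched below, reads the per-thread allocation counter that HotSpot exposes through com.sun.management.ThreadMXBean. This is a rough probe under the assumption of a HotSpot-based JVM with thread allocation accounting enabled (the default). AllocationProbe and bytesAllocatedBy are hypothetical names, and the article does not say which tool produced its allocation figures.

```java
import java.lang.management.ManagementFactory;
import java.util.function.Supplier;

// Hypothetical probe: bytes allocated by the current thread while running a task.
final class AllocationProbe {

    // HotSpot-specific extension of the standard ThreadMXBean.
    private static final com.sun.management.ThreadMXBean THREADS =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    static long bytesAllocatedBy(Supplier<?> work) {
        long threadId = Thread.currentThread().getId();
        long before = THREADS.getThreadAllocatedBytes(threadId);
        Object result = work.get();
        long after = THREADS.getThreadAllocatedBytes(threadId);
        // Use the result so the work is not treated as dead code.
        if (result == null) {
            throw new IllegalStateException("work returned null");
        }
        return after - before;
    }

    public static void main(String[] args) {
        long bytes = bytesAllocatedBy(() -> new double[1_000]);
        System.out.println("allocated roughly " + bytes + " bytes");
    }
}
```

Wrapping a decode of a fixed batch in bytesAllocatedBy and dividing by the batch size gives an approximate bytes-per-document figure. Run it after warm-up, since JIT compilation can change what gets allocated.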
The Fix
Register the custom codec on the hot path collection:
// FAST: Register custom codec for the telemetry collection
CodecRegistry registry = CodecRegistries.fromRegistries(
    CodecRegistries.fromCodecs(new TelemetryReadingCodec()),
    MongoClientSettings.getDefaultCodecRegistry()
);
MongoCollection<TelemetryReading> readings = database
    .getCollection("readings", TelemetryReading.class)
    .withCodecRegistry(registry);

// Reads now deserialize directly to TelemetryReading records
List<TelemetryReading> results = readings.find(
    Filters.eq("sensorId", "sensor-00042")
).limit(100).into(new ArrayList<>());
The Proof
| Metric | Document codec | Custom codec |
|---|---|---|
| Deserialization time (100 docs) | 427 μs | 112 μs |
| Heap allocation per doc | 736 bytes | 184 bytes |
| Allocation rate (10K docs/sec) | 7.2 MB/sec | 1.8 MB/sec |
| GC pause frequency | Every 30s | Every 120s |
| GC pause duration (p99) | 150ms | 35ms |
The custom codec is 3.8x faster (112 μs vs. 427 μs per 100 documents) and allocates a quarter of the memory per document (184 vs. 736 bytes). p99 GC pauses dropped from 150ms every 30 seconds to 35ms every 2 minutes.
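The headline ratios follow directly from the table. A small sketch of the arithmetic, with ProofArithmetic as a hypothetical helper name:

```java
// Hypothetical helper: re-derives the ratios quoted from the results table.
final class ProofArithmetic {

    // How many times larger the baseline figure is than the improved one.
    static double ratio(double baseline, double improved) {
        return baseline / improved;
    }

    // Allocation rate implied by a per-document size and a document rate (decimal MB).
    static double megabytesPerSecond(long bytesPerDoc, long docsPerSecond) {
        return bytesPerDoc * docsPerSecond / 1_000_000.0;
    }

    public static void main(String[] args) {
        System.out.println("speedup:           " + ratio(427, 112));
        System.out.println("allocation ratio:  " + ratio(736, 184));
        System.out.println("pause ratio (p99): " + ratio(150, 35));
        System.out.println("implied MB/s, Document codec: " + megabytesPerSecond(736, 10_000));
        System.out.println("implied MB/s, custom codec:   " + megabytesPerSecond(184, 10_000));
    }
}
```

The implied rates are products of two table rows. The table's own allocation rates are reported measurements, so they need not match these products exactly.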
The Trade-off
Custom codecs require manual maintenance. When the document schema changes (a new field is added, a field type changes), the codec must be updated. The default -> reader.skipValue() case in the switch statement provides forward compatibility for unknown fields, but removed or renamed fields require a codec update: until then, the codec shown above silently leaves the affected value at its initial default (null or 0) instead of failing. This is engineering effort that the default DocumentCodec handles automatically.

Use custom codecs only on hot paths where the allocation reduction is measurable. For low-throughput endpoints, the default codec is sufficient.
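One way to make a renamed or removed field fail loudly instead of decoding to a default is to validate in the record's compact constructor. The sketch below uses a hypothetical variant named CheckedTelemetryReading. It is an assumption about how the record could be hardened, not something the measured codec does.

```java
import java.time.Instant;
import java.util.Objects;

// Hypothetical hardened variant of TelemetryReading: rejects a missing sensorId.
record CheckedTelemetryReading(
        String sensorId,
        Instant timestamp,
        double temperature,
        double humidity,
        double pressure) {

    // Compact constructor: runs before the fields are assigned.
    CheckedTelemetryReading {
        // The codec leaves sensorId null when the field is absent from the document.
        Objects.requireNonNull(sensorId, "sensorId missing: was the field renamed or removed?");
        Objects.requireNonNull(timestamp, "timestamp");
    }
}

final class CheckedReadingDemo {
    public static void main(String[] args) {
        try {
            new CheckedTelemetryReading(null, Instant.EPOCH, 0, 0, 0);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The null check covers only reference fields. A missing double is indistinguishable from a real 0, so detecting it would need a sentinel such as Double.NaN or a presence flag in the decode loop. The check is a single comparison per document and allocates nothing on the success path.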