Thread Pool Anatomy and Queue Sizing
The Symptom
The operations team sees rejected_execution_exception in the application logs during peak import hours. They increase the write thread pool queue size from 200 to 10,000, and the rejections stop. Two days later, a node runs out of heap memory during a large import. The queue had filled with 10,000 bulk requests of 1,000 documents each, roughly 10 million pending documents, which together consumed 8 GB of heap.
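The heap figure follows from simple arithmetic: queued requests times documents per request times bytes per document. The sketch below reproduces it. The 800-byte average document size is an assumption back-derived from the 8 GB figure, and `QueueHeapEstimate` is a hypothetical helper, not part of OpenSearch.

```java
public class QueueHeapEstimate {
    /**
     * Rough lower bound on the heap pinned by a full queue.
     * Ignores per-request object overhead, so real usage is higher.
     */
    public static long estimateBytes(long queuedRequests, long docsPerRequest, long avgDocBytes) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(Math.multiplyExact(queuedRequests, docsPerRequest), avgDocBytes);
    }

    public static void main(String[] args) {
        long avgDocBytes = 800; // assumed average document size
        System.out.println("queue=200:    " + estimateBytes(200, 1_000, avgDocBytes) + " bytes");
        System.out.println("queue=10,000: " + estimateBytes(10_000, 1_000, avgDocBytes) + " bytes");
    }
}
```

Under the same assumption, the default queue of 200 pins about 160 MB when full, fifty times less than the enlarged queue.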
The Internals
OpenSearch uses dedicated thread pools for different operation types:
| Thread Pool | Default Size | Default Queue | Purpose |
|---|---|---|---|
| write | #CPUs | 200 | Index, delete, update, bulk |
| search | 3/2 * #CPUs + 1 | 1,000 | Search queries |
| get | #CPUs | 1,000 | Get by ID |
| management | 5 | unlimited | Cluster management |
| refresh | #CPUs / 2 | unlimited | Segment refresh |
| flush | #CPUs / 2 | unlimited | Translog flush |
| force_merge | 1 | unlimited | Force merge operations |
When a thread pool is fully occupied and its queue is full, new requests are rejected immediately with rejected_execution_exception. This is a deliberate back-pressure mechanism. The rejection tells the client to slow down.
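The same reject-when-full behavior can be observed with the JDK's own `ThreadPoolExecutor`. This is only an analogy: OpenSearch uses its own executor classes, and `BoundedPoolDemo` is a hypothetical name. The sketch blocks every worker on a latch so that the queue cannot drain, then counts how many submissions are refused.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolDemo {
    /** Submits blocking tasks to a fixed pool with a bounded queue; returns how many were rejected. */
    public static int countRejections(int threads, int queueCapacity, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            threads, threads, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueCapacity),
            new ThreadPoolExecutor.AbortPolicy()); // throw instead of blocking the caller
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> awaitQuietly(release));
            } catch (RejectedExecutionException e) {
                rejected++; // all threads busy and the queue is full
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // 2 tasks run, 3 wait in the queue, the remaining 5 are rejected
        System.out.println(countRejections(2, 3, 10)); // prints 5
    }
}
```

The pool accepts exactly threads plus queue capacity tasks while nothing completes. A larger queue only raises the number of tasks held in memory before the first rejection.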
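The circuit breaker system provides a second layer of protection. When incoming data would push estimated heap usage past a threshold, the circuit breaker trips and rejects the request before the data is allocated. The parent circuit breaker defaults to 95% of heap when real-memory tracking (indices.breaker.total.use_real_memory) is enabled, which is the default; otherwise it defaults to 70%.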
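The check-before-allocate idea can be modeled in a few lines. The class below is a deliberately simplified sketch, not OpenSearch's breaker implementation; `SimpleBreaker` and its method names are hypothetical.

```java
public class SimpleBreaker {
    private final long limitBytes;
    private long estimatedBytes;
    private long tripped;

    public SimpleBreaker(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Reserves bytes, or throws without reserving anything if the limit would be exceeded. */
    public synchronized void reserve(long bytes) {
        long wouldUse = estimatedBytes + bytes;
        if (wouldUse > limitBytes) {
            tripped++; // counted the same way the stats API reports trips
            throw new IllegalStateException(
                "would use " + wouldUse + " bytes, limit is " + limitBytes);
        }
        estimatedBytes = wouldUse;
    }

    /** Returns bytes to the budget once the request has completed. */
    public synchronized void release(long bytes) {
        estimatedBytes = Math.max(0, estimatedBytes - bytes);
    }

    public synchronized long estimatedBytes() {
        return estimatedBytes;
    }

    public synchronized long tripped() {
        return tripped;
    }
}
```

The key property is that a refused reservation leaves the estimate unchanged, so the request fails before any memory is committed to it.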
The Implementation
Thread Pool Diagnostic
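    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.opensearch.client.opensearch.OpenSearchClient;

    public class ThreadPoolDiagnostic {
        // Node stats do not report the configured queue capacity (queue_size).
        // It is available from the nodes info API or _cat/thread_pool?h=name,queue_size.
        private static final long UNKNOWN_CAPACITY = -1;

        private final OpenSearchClient client;

        public ThreadPoolDiagnostic(OpenSearchClient client) {
            this.client = client;
        }

        public record PoolHealth(
            String poolName,
            long active,         // threads currently executing tasks
            long size,           // threads currently in the pool
            long queue,          // tasks waiting for a thread
            long queueCapacity,  // UNKNOWN_CAPACITY when not available
            long rejected,       // cumulative rejections since node start
            String status        // GREEN, YELLOW, RED
        ) {}

        public List<PoolHealth> diagnose() throws IOException {
            // Request only the thread_pool section of the node stats
            var stats = client.nodes().stats(ns -> ns
                .metric("thread_pool"));
            List<PoolHealth> results = new ArrayList<>();
            for (var node : stats.nodes().values()) {
                for (var entry : node.threadPool().entrySet()) {
                    String name = entry.getKey();
                    var pool = entry.getValue();
                    String status;
                    if (pool.rejected() > 0 && pool.queue() > 0
                            && pool.queue() >= pool.active()) {
                        // Rejections have occurred and a backlog is present now.
                        // Requiring queue > 0 keeps an idle pool with old rejections out of RED.
                        status = "RED";
                    } else if (pool.queue() > pool.active() * 2) {
                        status = "YELLOW"; // Backlog exceeds twice the running tasks
                    } else {
                        status = "GREEN";
                    }
                    results.add(new PoolHealth(
                        name,
                        pool.active(),
                        pool.threads(),   // the stats response calls the pool size "threads"
                        pool.queue(),
                        UNKNOWN_CAPACITY, // not present in node stats
                        pool.rejected(),
                        status
                    ));
                }
            }
            return results;
        }
    }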
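The rejected count in node stats is cumulative since the node started, so a non-zero value alone does not show that rejections are happening now. One way to detect ongoing rejections is to compare two samples. The helper below is a minimal sketch of that idea; `RejectionDelta` is a hypothetical name, and treating a counter that moved backwards as a node restart is an assumption.

```java
public class RejectionDelta {
    /** Rejections per second between two samples of a cumulative counter. */
    public static double perSecond(long previous, long current, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        // A counter that went backwards is assumed to mean the node restarted,
        // so everything in the current sample happened since the restart.
        long delta = current >= previous ? current - previous : current;
        return delta * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // 60 new rejections over a 60-second sampling interval
        System.out.println(perSecond(100, 160, 60_000)); // prints 1.0
    }
}
```

Alerting on this rate instead of the raw counter avoids permanent alarms on a node that rejected a few requests days ago.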
Circuit Breaker Monitoring
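    // These members belong in a class that holds an OpenSearchClient field named client,
    // for example ThreadPoolDiagnostic above.
    public record CircuitBreakerStatus(
        String name,
        long limitBytes,            // configured limit for this breaker
        long estimatedBytes,        // memory the breaker currently accounts for
        double utilizationPercent,  // estimatedBytes as a percentage of limitBytes
        long tripped                // number of times the breaker has rejected a request
    ) {}

    public List<CircuitBreakerStatus> getCircuitBreakerStatus() throws IOException {
        // Request only the breaker section of the node stats
        var stats = client.nodes().stats(ns -> ns.metric("breaker"));
        List<CircuitBreakerStatus> results = new ArrayList<>();
        for (var node : stats.nodes().values()) {
            for (var entry : node.breakers().entrySet()) {
                var breaker = entry.getValue();
                // Guard against division by zero for breakers without a limit
                double utilization = breaker.limitSizeInBytes() > 0
                    ? (double) breaker.estimatedSizeInBytes() /
                        breaker.limitSizeInBytes() * 100
                    : 0;
                results.add(new CircuitBreakerStatus(
                    entry.getKey(),
                    breaker.limitSizeInBytes(),
                    breaker.estimatedSizeInBytes(),
                    utilization,
                    breaker.tripped()
                ));
            }
        }
        return results;
    }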
The Measurement
Impact of queue sizing on rejection behavior and heap usage:
| Write Queue Size | Rejection Rate (at 500 docs/s) | Peak Heap Usage | Risk |
|---|---|---|---|
| 200 (default) | 2% during spikes | 65% | Low |
| 1,000 | 0% during spikes | 78% | Medium |
| 10,000 | 0% | 92%+ | High (OOM risk) |
Increasing the queue size from 200 to 1,000 eliminates rejections during spikes at the cost of a moderate rise in peak heap usage, from 65% to 78%. Increasing to 10,000 also shows no rejections, but peak heap usage climbs above 92%, dangerously close to the 95% parent circuit breaker threshold, and the node risks instability or an out-of-memory crash.
The Decision Rule
Never increase a thread pool queue beyond 2x the default without understanding the root cause of rejections. Queue increases defer back-pressure signals, trading immediate rejections for deferred out-of-memory crashes.
Write pool rejections during bulk import indicate the cluster cannot sustain the write rate. The fix is client-side throttling (reduce batch size or concurrency), not server-side queue expansion.
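Client-side throttling can be as simple as retrying rejected requests with exponential backoff, so the client slows down when the cluster signals overload. The helper below is a minimal sketch under stated assumptions: `BackoffRetry` is a hypothetical name, the caller supplies a predicate that recognizes a rejection (for example an HTTP 429 response wrapped in the client's exception type), and jitter is omitted for brevity.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class BackoffRetry {
    /** Delay before retry number attempt (0-based): base * 2^attempt, capped at capMillis. */
    public static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis << Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(delay, capMillis);
    }

    /** Runs the request, retrying only failures that the predicate classifies as rejections. */
    public static <T> T call(Supplier<T> request, Predicate<RuntimeException> isRejection,
                             int maxRetries, long baseMillis, long capMillis) {
        for (int attempt = 0; ; attempt++) {
            try {
                return request.get();
            } catch (RuntimeException e) {
                if (!isRejection.test(e) || attempt >= maxRetries) {
                    throw e; // not a rejection, or retries exhausted
                }
                try {
                    Thread.sleep(delayMillis(attempt, baseMillis, capMillis));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw e; // stop retrying when the thread is interrupted
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = call(() -> {
            if (calls[0]++ < 2) {
                throw new RejectedExecutionException("queue full"); // simulated rejection
            }
            return "indexed";
        }, e -> e instanceof RejectedExecutionException, 5, 100, 5_000);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

In production, adding random jitter to each delay keeps many clients from retrying in lockstep. Reducing bulk batch size or the number of concurrent bulk requests addresses the same problem at the source.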
Search pool rejections during normal traffic indicate insufficient search capacity. The fix is adding data nodes or replicas, not increasing the search queue. A longer queue means higher tail latency, not higher throughput.
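The latency cost of a longer queue can be estimated directly: a request that enters at the back of a full queue waits for everything ahead of it to complete. The sketch below works through that arithmetic. The 500 completions per second is an assumed figure for illustration, and `QueueWaitEstimate` is a hypothetical helper.

```java
public class QueueWaitEstimate {
    /** Seconds the last request in a full queue waits, given a steady completion rate. */
    public static double worstCaseWaitSeconds(int queueSize, double completionsPerSecond) {
        if (completionsPerSecond <= 0) {
            throw new IllegalArgumentException("completionsPerSecond must be positive");
        }
        return queueSize / completionsPerSecond;
    }

    public static void main(String[] args) {
        double rate = 500.0; // assumed: the pool completes 500 searches per second
        System.out.println(worstCaseWaitSeconds(1_000, rate));  // prints 2.0
        System.out.println(worstCaseWaitSeconds(10_000, rate)); // prints 20.0
    }
}
```

The completion rate is the same in both cases. Only the waiting time changes, which is why enlarging the queue raises tail latency without adding throughput.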
Monitor circuit breaker trip counts alongside thread pool rejections. If both are increasing, the cluster is fundamentally undersized for the workload. No configuration change resolves this; only additional hardware does.