Why Queues Don’t Fix Overload: The Physics of Backpressure and Load Shedding

These articles are AI-generated summaries. Please check the original sources for full details.

Why Queues Don’t Fix Overload (And What To Do Instead)

Engineer Peter Mbanugo argues that software systems obey physical limits: a larger buffer only delays the flood. Little’s Law relates queue length to arrival rate and time spent in the system; once arrivals persistently exceed processing capacity, no steady state exists and the backlog grows without bound until the system runs out of memory or crashes.

Why This Matters

Engineering teams often reach for Kafka or RabbitMQ as a band-aid for traffic spikes, but a queue absorbs only short-term variance, not sustained load. When the arrival rate stays above processing capacity, the system enters a latency death spiral: CPU cycles are spent on requests that users have already abandoned, and the backlog keeps growing until the system fails outright.
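
To make the difference between absorbing variance and absorbing sustained load concrete, here is a short worked example. The numbers are illustrative assumptions, not figures from the original article: requests arrive at $\lambda = 1200$ per second, the service completes $\mu = 1000$ per second, and the buffer holds $B$ messages.

$$
\frac{dL}{dt} = \lambda - \mu = 200 \ \text{msg/s}, \qquad t_{\text{full}} = \frac{B}{\lambda - \mu}, \qquad W_{\text{last}} \approx \frac{B}{\mu}
$$

With $B = 10{,}000$ the buffer fills in $10{,}000 / 200 = 50$ seconds, and the request at the back of the full buffer waits about $10{,}000 / 1000 = 10$ seconds. Growing the buffer tenfold to $B = 100{,}000$ delays overflow to 500 seconds, but the last request now waits about 100 seconds. Under these assumptions the larger buffer buys time while making queued work staler, which is the pattern the paragraph above describes.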

Key Insights

  • Little’s Law ($L = \lambda W$) states that the average number of items in a stable system equals the average arrival rate multiplied by the average time each item spends in the system (waiting plus processing).
  • Fred Hebert famously stated that ‘Queues don’t fix overload,’ as they cannot solve the problem of a sustained faucet flowing faster than a drain.
  • The Tina framework, built in Odin, implements a thread-per-core architecture where all resources like mailboxes are strictly bounded and pre-allocated at boot.
  • The ‘Latency Death Spiral’ occurs when increased response times lead to queuing, causing further delays and eventual total system saturation.
  • Tina Isolates use O(1) fast rejection to immediately signal backpressure when mailboxes reach their default 256-message limit.
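
The bounded-mailbox idea from the list above can be sketched in plain Odin. This is a minimal illustration under stated assumptions, not Tina's actual implementation: `Mailbox`, `Send_Result`, `mailbox_send`, and `mailbox_receive` are hypothetical names, messages are plain integers, and the 256-slot capacity simply mirrors the default mentioned above.

```odin
package main

import "core:fmt"

// Hypothetical capacity, mirroring the 256-message default mentioned above.
MAILBOX_CAPACITY :: 256

Send_Result :: enum {
	Ok,
	Mailbox_Full,
}

// Fixed-size ring buffer: the storage lives inside the struct, so nothing
// is allocated once the mailbox itself exists.
Mailbox :: struct {
	slots: [MAILBOX_CAPACITY]int,
	head:  int, // index of the oldest message
	count: int, // number of messages currently queued
}

// O(1): a single comparison decides between enqueue and rejection.
mailbox_send :: proc(m: ^Mailbox, msg: int) -> Send_Result {
	if m.count == MAILBOX_CAPACITY {
		return .Mailbox_Full
	}
	tail := (m.head + m.count) % MAILBOX_CAPACITY
	m.slots[tail] = msg
	m.count += 1
	return .Ok
}

// O(1) dequeue; ok is false when the mailbox is empty.
mailbox_receive :: proc(m: ^Mailbox) -> (msg: int, ok: bool) {
	if m.count == 0 {
		return 0, false
	}
	msg = m.slots[m.head]
	m.head = (m.head + 1) % MAILBOX_CAPACITY
	m.count -= 1
	return msg, true
}

main :: proc() {
	m: Mailbox
	rejected := 0
	// Offer 300 messages with no consumer running: 256 fit, 44 are rejected.
	for i in 0..<300 {
		if mailbox_send(&m, i) == .Mailbox_Full {
			rejected += 1
		}
	}
	fmt.println("queued:", m.count, "rejected:", rejected)
}
```

Because the sender gets `.Mailbox_Full` back immediately, the decision about what to do with excess work (drop, retry later, or fail upstream) stays with the caller instead of being hidden in a growing buffer.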

Working Examples

Handling synchronous message results in Tina to force explicit load shedding decisions.

result := tina.ctx_send(ctx, destination_handle, TAG_DATA, &payload)
#partial switch result {
case .ok:
    // Message successfully enqueued.
    return tina.Effect_Receive{}
case .mailbox_full:
    // The destination is overwhelmed. We must shed load.
    tina.ctx_log(ctx, .WARN, TAG_OVERLOAD, "Destination overloaded, dropping request.")
    // We explicitly drop the work and wait for the next message.
    return tina.Effect_Receive{}
case .pool_exhausted:
    // The Shard's memory pool is fully saturated. Let it crash.
    return tina.Effect_Crash{reason = .system_saturated}
}

The .call pattern in Tina, where a mandatory timeout bounds how long the caller can wait for a reply.

// Send a request and park the Isolate until a reply arrives.
return tina.Effect_Call{
    to = billing_handle,
    message = transform_request_to_message(request),
    timeout = 5000, // Mandatory timeout in milliseconds
}

Practical Applications

  • Use Case: High-throughput telemetry systems where dropping data via ‘fire-and-forget’ is preferable to crashing the entire service. Pitfall: Using unbounded mailboxes that consume all system memory during a network partition.
  • Use Case: Billing services utilizing the .call pattern with mandatory timeouts to maintain SLAs. Pitfall: Retrying requests without exponential backoff or feedback loops, which increases arrival rates and worsens congestion.
  • Use Case: Thread-per-core architectures like Tina that use zero-allocation state machines to ensure deterministic performance under load. Pitfall: Dynamic memory allocation (malloc) during high-traffic operations, which introduces unpredictable allocator latency (or garbage collection pauses in managed runtimes).
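
The retry pitfall in the list above can be illustrated with a capped exponential backoff schedule. This is a minimal sketch in plain Odin, not part of Tina: `backoff_ms` and its parameters are hypothetical, and the 100 ms base and 2000 ms cap are arbitrary example values. Real clients usually add random jitter as well, which is left out here to keep the example deterministic.

```odin
package main

import "core:fmt"

// Hypothetical helper: delay before retry number `attempt` (0-based),
// doubling from base_ms on each attempt and never exceeding cap_ms.
backoff_ms :: proc(attempt: int, base_ms: int, cap_ms: int) -> int {
	delay := base_ms
	for _ in 0..<attempt {
		delay *= 2
		// Stop doubling once the cap is reached, which also avoids overflow.
		if delay >= cap_ms {
			return cap_ms
		}
	}
	return min(delay, cap_ms)
}

main :: proc() {
	// Delays for six attempts: 100, 200, 400, 800, 1600, 2000 ms.
	for attempt in 0..<6 {
		fmt.println("attempt", attempt, "wait ms:", backoff_ms(attempt, 100, 2000))
	}
}
```

Spacing retries out this way lowers the extra arrival rate that retries add during an overload, instead of multiplying it.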
