Database Rate Limiting: The Missing Piece After a Circuit Breaker — Keep Your DB Alive Under Load

These articles are AI-generated summaries. Please check the original sources for full details.

The Database Becomes the New Bottleneck

Daksh Gargas describes a common outage scenario: a Redis cluster fails while serving 100K requests/sec. Without a database rate limiter, all 100K requests/sec fall through to a database that was designed to handle only 1K QPS.

Why This Matters

In production systems, circuit breakers prevent wasted calls to a failing dependency like Redis, but they do not protect the fallback database. If 100K requests/sec are redirected to a database built for 1K QPS, the database receives roughly 100 times its design load and can crash, taking the entire system down even though the original failure was isolated.

Key Insights

  • A circuit breaker only decides whether to try the primary dependency (e.g., Redis); it does not rate-limit traffic to the fallback (e.g., database).
  • A DB rate limiter, typically a token bucket, sits inside the application right before the database call and caps allowed QPS (e.g., 500 QPS per app server).
  • Distributed rate limiting is essential when running multiple app servers, because per-instance limits add up (e.g., 10 servers × 500 QPS = 5,000 QPS total). The aggregate, not the per-server figure, is what must stay within database capacity.
  • Local caches can serve stale data as a second fallback if the database rate limiter rejects a request, reducing load further.

Working Examples

Example of an application-level DB rate limiter: after the Redis circuit breaker opens, the code checks a local cache and then a token-bucket rate limiter before hitting the database.

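// GetUser reads from Redis while it is healthy, then degrades to the local
// cache and finally to the rate-limited database.
func GetUser(id string) (User, error) {
    // Breaker closed: Redis is healthy, so use it.
    if !redisCircuitBreaker.IsOpen() {
        return redis.Get(id)
    }
    // Breaker open: prefer a possibly stale local copy over a database call.
    if localCache.Has(id) {
        return localCache.Get(id), nil
    }
    // Shed load with a 503 once the database budget is spent.
    if !dbRateLimiter.Allow() {
        return User{}, Error503()
    }
    return db.GetUser(id)
}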

Practical Applications

  • Use case: Any service with a primary cache (e.g., Redis) and a fallback database should apply a per-instance rate limiter (e.g., 500 QPS) to protect the database during cache outages.
  • Pitfall: Relying solely on an API gateway for rate limiting fails because the gateway cannot know which endpoints hit the database or whether Redis is down.
  • Use case: Larger systems with 10+ app servers use distributed rate limiters or adaptive concurrency limits to ensure total fallback traffic stays within database capacity (e.g., 5,000 QPS).
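
The arithmetic behind per-instance limits can be made explicit. The helper below is a hypothetical sketch (the function name and the `headroomPct` parameter are assumptions, not from the original article): it divides a database-wide QPS budget evenly across app servers, optionally reserving a percentage for other traffic. A static split like this is the simplest option; it does not adapt when servers are added or removed, which is where a truly distributed limiter comes in.

```go
package main

import "fmt"

// perInstanceLimit splits a database-wide QPS budget evenly across instances.
// headroomPct (0-100) is a hypothetical knob that reserves part of the
// capacity for traffic that does not go through the limiter.
func perInstanceLimit(dbCapacityQPS, instances, headroomPct int) int {
	if instances <= 0 {
		return 0
	}
	usable := dbCapacityQPS * (100 - headroomPct) / 100
	return usable / instances
}

func main() {
	// A 5,000 QPS budget over 10 servers gives the 500 QPS per-server example.
	fmt.Println(perInstanceLimit(5000, 10, 0)) // 500

	// A 1K QPS database with 20% headroom leaves far less per server.
	fmt.Println(perInstanceLimit(1000, 10, 20)) // 80
}
```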
