Pattern Composition
Individual resilience patterns are well-defined. Combining them incorrectly creates emergent behaviors that can be worse than having no resilience at all. A retry wrapped around a circuit breaker keeps retrying while the breaker is open unless the breaker's rejection is excluded from the retryable exceptions. A bulkhead inside a time limiter can reject a call that still has time budget remaining. A rate limiter inside a retry charges every attempt against the rate limit, reducing the throughput left for first attempts.
The order matters. There is one correct order for the common case, and deviations from it require explicit justification.
The Correct Order
The diagram shows the five resilience patterns as nested layers, outermost to innermost. Retry is outermost because it must retry the entire decorated chain: every attempt passes through the layers beneath it, and if the circuit breaker rejects a call, the retry sees that rejection and its policy decides whether to fail fast or to try again after backoff, when the breaker may have moved to half-open. Circuit Breaker is second because it records the outcome of each attempt, including timeouts and bulkhead rejections. Rate Limiter is third because only calls that pass the circuit breaker should consume rate limiter tokens. Bulkhead is fourth because only calls that pass the rate limiter should compete for bulkhead permits. Time Limiter is innermost (closest to the actual call) because it enforces the timeout on the actual HTTP request.
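The bottom warning shows what happens with wrong ordering: if the Circuit Breaker is outside Retry, the breaker records only the final outcome of each retried operation. Failed attempts that a later retry recovers from never reach the sliding window, so the breaker can stay closed while a large share of individual attempts are failing.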
Why This Order
Retry outside Circuit Breaker: When the circuit breaker is open, it rejects calls by throwing CallNotPermittedException, and the retry intercepts this exception. If the retry policy excludes CallNotPermittedException from its retryable exceptions (and it should), the retry does not attempt again. The breaker counts the rejection as a not-permitted call rather than as a failure in its sliding window, and the caller gets an immediate rejection.
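If retry is inside the circuit breaker: the breaker sees one outcome per logical operation, the result of the final attempt. Three attempts that all fail record one failure instead of three, and two failures followed by a success record a single success. The failure rate in the sliding window then understates how unhealthy the dependency is, and the breaker opens later than the configured threshold suggests. This could be desirable in some cases, but the default should be that the breaker records every attempt, which is what retry on the outside gives you. Size sliding-window-size and minimum-number-of-calls in attempts, not logical operations.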
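How many outcomes the breaker records under each ordering can be checked with a small library-free model. The sketch below is not Resilience4j code: `CountingBreaker` and `retry` are hypothetical stand-ins that only mimic the wrapping behaviour, and it assumes an always-failing call with three attempts.

```java
import java.util.function.Supplier;

/** Library-free model: counts the failures a breaker records under each ordering. */
final class RecordingOrderDemo {

    /** Hypothetical stand-in for a circuit breaker that only counts recorded failures. */
    static final class CountingBreaker {
        int recordedFailures = 0;

        <T> Supplier<T> decorate(Supplier<T> inner) {
            return () -> {
                try {
                    return inner.get();
                } catch (RuntimeException e) {
                    recordedFailures++; // one recording per call that passes through
                    throw e;
                }
            };
        }
    }

    /** Hypothetical stand-in for a retry: calls the inner supplier up to maxAttempts times. */
    static <T> Supplier<T> retry(int maxAttempts, Supplier<T> inner) {
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return inner.get();
                } catch (RuntimeException e) {
                    last = e; // remember the failure and try again
                }
            }
            throw last;
        };
    }

    /** Runs one logical operation against an always-failing call. */
    static int failuresRecorded(boolean retryOutermost) {
        CountingBreaker breaker = new CountingBreaker();
        Supplier<String> failingCall = () -> {
            throw new IllegalStateException("dependency down");
        };
        Supplier<String> chain = retryOutermost
                ? retry(3, breaker.decorate(failingCall))   // Retry -> Breaker -> call
                : breaker.decorate(retry(3, failingCall));  // Breaker -> Retry -> call
        try {
            chain.get();
        } catch (RuntimeException expected) {
            // the operation fails either way; only the bookkeeping differs
        }
        return breaker.recordedFailures;
    }

    public static void main(String[] args) {
        System.out.println("retry outermost:   " + failuresRecorded(true));  // 3
        System.out.println("breaker outermost: " + failuresRecorded(false)); // 1
    }
}
```

With retry outermost, each of the three attempts passes through the breaker and is recorded. With the breaker outermost, it sees only the final exception of the whole operation.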
Circuit Breaker outside Rate Limiter: When the circuit breaker is open, calls are rejected without consuming a rate limiter token. If the circuit breaker were inside the rate limiter, rejected calls would still consume tokens, wasting rate limit capacity on calls that were never going to succeed.
Rate Limiter outside Bulkhead: The rate limiter controls the rate of calls. The bulkhead controls concurrency. Calls that exceed the configured rate should never reach the bulkhead. If the bulkhead were outside the rate limiter, a call waiting for a rate limiter permit would hold a bulkhead permit while doing no work, and calls that the rate limiter ends up rejecting would still have occupied a concurrency slot, which also makes bulkhead metrics harder to interpret.
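The token accounting behind the circuit breaker and rate limiter ordering can be sketched without the library. `TokenBucket` and `openBreaker` below are hypothetical stand-ins, assuming a bucket of 10 tokens with no refill and a breaker that is permanently open.

```java
import java.util.function.Supplier;

/** Library-free model: tokens spent on calls that an open breaker rejects anyway. */
final class TokenWasteDemo {

    /** Hypothetical rate limiter: a fixed pool of tokens, no refill. */
    static final class TokenBucket {
        int tokens;

        TokenBucket(int tokens) {
            this.tokens = tokens;
        }

        <T> Supplier<T> decorate(Supplier<T> inner) {
            return () -> {
                if (tokens == 0) {
                    throw new IllegalStateException("rate limited");
                }
                tokens--; // the token is spent before the inner layer runs
                return inner.get();
            };
        }
    }

    /** Hypothetical breaker that is permanently open and rejects every call. */
    static <T> Supplier<T> openBreaker(Supplier<T> inner) {
        return () -> {
            throw new IllegalStateException("circuit open");
        };
    }

    /** Sends `calls` requests through the chain and reports the tokens left. */
    static int tokensLeftAfter(int calls, boolean breakerOutside) {
        TokenBucket bucket = new TokenBucket(10);
        Supplier<String> call = () -> "ok";
        Supplier<String> chain = breakerOutside
                ? openBreaker(bucket.decorate(call))   // Breaker -> RateLimiter -> call
                : bucket.decorate(openBreaker(call));  // RateLimiter -> Breaker -> call
        for (int i = 0; i < calls; i++) {
            try {
                chain.get();
            } catch (IllegalStateException rejected) {
                // every call is rejected; only the token count differs
            }
        }
        return bucket.tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokensLeftAfter(5, true));  // 10: no tokens spent
        System.out.println(tokensLeftAfter(5, false)); // 5: one token wasted per rejected call
    }
}
```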
The Programmatic Composition
When several Resilience4j annotations sit on the same method, each one is applied as a Spring AOP aspect, and the nesting is set by a per-pattern aspect-order property. Spring AOP gives lower order values higher precedence: the aspect with the lowest number runs first on the way in and returns last on the way out, which makes it the outermost decorator. Numbering the circuit breaker ahead of the retry therefore produces the wrong nesting:
# WRONG - circuit breaker has the lowest order, so it ends up outside retry
resilience4j:
  circuitbreaker:
    circuit-breaker-aspect-order: 1
  retry:
    retry-aspect-order: 2
  ratelimiter:
    rate-limiter-aspect-order: 3
  bulkhead:
    bulkhead-aspect-order: 4
  timelimiter:
    time-limiter-aspect-order: 5
To match the layering described above, retry gets the lowest number:
# PRODUCTION - application.yml: correct aspect ordering
# Lower number = higher priority = outermost
# This determines the nesting when multiple annotations are on the same method
resilience4j:
  retry:
    retry-aspect-order: 1 # Outermost: retries everything
  circuitbreaker:
    circuit-breaker-aspect-order: 2
  ratelimiter:
    rate-limiter-aspect-order: 3
  bulkhead:
    bulkhead-aspect-order: 4
  timelimiter:
    time-limiter-aspect-order: 5 # Innermost: closest to the call
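To see why the lowest order value ends up outermost, it can help to trace the nesting by hand. The following is a plain-Java sketch, not Spring AOP itself: the `layer` helper and the way the chain is built are assumptions chosen to mimic "lowest order wraps everything else".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Library-free model of how order values translate into decorator nesting. */
final class AspectOrderDemo {

    /** Wraps `inner` in a layer that logs when it is entered and exited. */
    static Supplier<String> layer(String name, List<String> trace, Supplier<String> inner) {
        return () -> {
            trace.add("enter " + name);
            try {
                return inner.get();
            } finally {
                trace.add("exit " + name);
            }
        };
    }

    /** Nests the layers so that the lowest order value is outermost, then runs one call. */
    static List<String> trace(Map<String, Integer> orders) {
        List<String> trace = new ArrayList<>();
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(orders.entrySet());
        // Highest order first: it is wrapped first and therefore ends up innermost.
        sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        Supplier<String> chain = () -> {
            trace.add("call");
            return "ok";
        };
        for (Map.Entry<String, Integer> entry : sorted) {
            chain = layer(entry.getKey(), trace, chain);
        }
        chain.get();
        return trace;
    }

    public static void main(String[] args) {
        Map<String, Integer> orders = Map.of(
                "retry", 1, "circuitBreaker", 2, "rateLimiter", 3,
                "bulkhead", 4, "timeLimiter", 5);
        trace(orders).forEach(System.out::println);
    }
}
```

The trace starts with `enter retry` and ends with `exit retry`, with `timeLimiter` entered last before the call: first in, last out.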
Alternatively, compose programmatically for full control:
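// PRODUCTION - Programmatic composition
import io.github.resilience4j.bulkhead.Bulkhead;
import io.github.resilience4j.bulkhead.BulkheadRegistry;
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.github.resilience4j.decorators.Decorators;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryRegistry;
import io.github.resilience4j.timelimiter.TimeLimiterRegistry;
import java.util.function.Supplier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ResilienceConfig {

    @Bean
    public Supplier<FraudScore> resilientFraudCall(
            CircuitBreakerRegistry cbRegistry,
            RetryRegistry retryRegistry,
            BulkheadRegistry bulkheadRegistry,
            RateLimiterRegistry rlRegistry,
            TimeLimiterRegistry tlRegistry,
            FraudDetectionClient client) {
        CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");
        Retry retry = retryRegistry.retry("fraudDetection");
        Bulkhead bulkhead = bulkheadRegistry.bulkhead("fraudDetection");

        // Compose: Retry -> CircuitBreaker -> Bulkhead -> actual call
        // (RateLimiter and TimeLimiter applied separately)
        // Each with...() call wraps everything declared before it, so the first
        // decorator listed is innermost and the last is outermost:
        // Bulkhead wraps the call, CircuitBreaker wraps that, Retry wraps everything.
        return Decorators.ofSupplier(() -> client.score(new PaymentRequest()))
                .withBulkhead(bulkhead)
                .withCircuitBreaker(cb)
                .withRetry(retry)
                .decorate();
    }
}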
Complete Configuration Per Dependency
# PRODUCTION - Full resilience configuration for all dependencies
# Each pattern key appears once (YAML does not allow duplicate keys);
# the dependencies are separate instances under it.
resilience4j:
  circuitbreaker:
    instances:
      # --- Fraud Detection ---
      fraudDetection:
        sliding-window-size: 100
        failure-rate-threshold: 50
        slow-call-rate-threshold: 80
        slow-call-duration-threshold: 500ms
        minimum-number-of-calls: 20
        wait-duration-in-open-state: 60s
        permitted-number-of-calls-in-half-open-state: 5
        automatic-transition-from-open-to-half-open-enabled: true
      # --- Payment Gateway ---
      paymentGateway:
        sliding-window-size: 50
        failure-rate-threshold: 60
        # Higher threshold: payment gateway errors are more common
        # (network issues with external provider)
        minimum-number-of-calls: 10
        wait-duration-in-open-state: 30s
        permitted-number-of-calls-in-half-open-state: 3
  retry:
    instances:
      # --- Fraud Detection ---
      fraudDetection:
        max-attempts: 1 # No retries for fraud - circuit breaker handles it
        # Fraud detection calls are idempotent (scoring is read-only),
        # but retrying a slow service delays the payment further.
        # The circuit breaker + fallback is the correct strategy.
      # --- Payment Gateway ---
      paymentGateway:
        max-attempts: 3
        wait-duration: 200ms
        enable-exponential-backoff: true
        exponential-backoff-multiplier: 2.0
        enable-randomized-wait: true
        retry-exceptions:
          - java.io.IOException
          - java.util.concurrent.TimeoutException
  bulkhead:
    instances:
      # --- Fraud Detection ---
      fraudDetection:
        max-concurrent-calls: 20
        max-wait-duration: 100ms
      # --- Payment Gateway ---
      paymentGateway:
        max-concurrent-calls: 120
        max-wait-duration: 200ms
  timelimiter:
    instances:
      # --- Fraud Detection ---
      fraudDetection:
        timeout-duration: 2s
        cancel-running-future: true
      # --- Payment Gateway ---
      paymentGateway:
        timeout-duration: 8s
        cancel-running-future: true
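The retry and time limiter settings interact: every attempt can use its full timeout, and the backoff waits come on top. As a rough worked example using the paymentGateway values above, the sketch below computes the wait schedule and a worst-case latency bound. It assumes the wait before the n-th retry is the initial wait times the multiplier to the power n-1, and it deliberately ignores the jitter added by enable-randomized-wait and any time spent waiting for a bulkhead permit. `BackoffBudget` and its methods are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.List;

/** Back-of-the-envelope latency budget for retry with exponential backoff. */
final class BackoffBudget {

    /** Waits between attempts: initial * multiplier^(n-1) before the n-th retry, no jitter. */
    static List<Long> waitsMillis(int maxAttempts, long initialWaitMillis, double multiplier) {
        List<Long> waits = new ArrayList<>();
        double wait = initialWaitMillis;
        // maxAttempts includes the first call, so there are maxAttempts - 1 waits.
        for (int retry = 1; retry < maxAttempts; retry++) {
            waits.add(Math.round(wait));
            wait *= multiplier;
        }
        return waits;
    }

    /** Upper bound on caller latency if every attempt runs into its per-attempt timeout. */
    static long worstCaseMillis(int maxAttempts, long initialWaitMillis, double multiplier,
                                long perAttemptTimeoutMillis) {
        long total = (long) maxAttempts * perAttemptTimeoutMillis;
        for (long wait : waitsMillis(maxAttempts, initialWaitMillis, multiplier)) {
            total += wait;
        }
        return total;
    }

    public static void main(String[] args) {
        // paymentGateway: max-attempts 3, wait-duration 200ms, multiplier 2.0, timeout 8s
        System.out.println(waitsMillis(3, 200, 2.0));             // [200, 400]
        System.out.println(worstCaseMillis(3, 200, 2.0, 8_000));  // 24600
        // fraudDetection: max-attempts 1, timeout 2s
        System.out.println(worstCaseMillis(1, 0, 1.0, 2_000));    // 2000
    }
}
```

Under these assumptions a payment gateway call can hold the caller for roughly 24.6 seconds before failing, which is worth comparing against any upstream request timeout.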
Testing the Composed Stack
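// PRODUCTION - Test verifying the full decorator chain
import static org.assertj.core.api.Assertions.assertThat;

import io.github.resilience4j.bulkhead.Bulkhead;
import io.github.resilience4j.bulkhead.BulkheadRegistry;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@Testcontainers
class ComposedResilienceTest {

    @Container
    static GenericContainer<?> wireMock = new GenericContainer<>(
            DockerImageName.parse("wiremock/wiremock:latest"))
            .withExposedPorts(8080);

    @Autowired
    private FraudDetectionService fraudService;

    @Autowired
    private CircuitBreakerRegistry cbRegistry;

    @Autowired
    private BulkheadRegistry bulkheadRegistry;

    @Test
    void fullChain_circuitBreakerOpens_bulkheadNotConsumed() {
        // Cause circuit breaker to open
        forceCircuitBreakerOpen();

        // Verify bulkhead permits are NOT consumed when circuit breaker is open
        Bulkhead bulkhead = bulkheadRegistry.bulkhead("fraudDetection");
        int permitsBefore = bulkhead.getMetrics().getAvailableConcurrentCalls();
        try {
            // samplePayment() is a test fixture; its definition is omitted here
            fraudService.checkFraud(samplePayment());
        } catch (Exception ignored) {
            // A rejection or a fallback result are both acceptable here
        }
        int permitsAfter = bulkhead.getMetrics().getAvailableConcurrentCalls();

        // Permits should be unchanged: CB rejected before reaching bulkhead
        assertThat(permitsAfter).isEqualTo(permitsBefore);
    }

    // One way to open the breaker: force the state transition directly
    // instead of driving failing calls through WireMock.
    private void forceCircuitBreakerOpen() {
        cbRegistry.circuitBreaker("fraudDetection").transitionToOpenState();
    }
}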
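This test checks the decorator ordering. When the circuit breaker is open, the call is rejected before reaching the bulkhead, so no bulkhead permit is consumed. If the ordering were wrong (bulkhead outside circuit breaker), each rejected call would acquire a permit and release it immediately, adding unnecessary overhead and, under high concurrency, potentially causing false bulkhead-full rejections. Because that acquire-and-release completes inside the synchronous call, comparing available permits before and after the call cannot detect it on its own. Counting the bulkhead's call-permitted events during the call is a stricter check.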
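One caveat about asserting on available permits can be illustrated with a library-free model: a permit that is acquired and released within a synchronous call leaves the available count unchanged, so only a counter of granted permits distinguishes the two orderings. `CountingBulkhead` and `openBreaker` below are hypothetical stand-ins, not Resilience4j classes, and the pool of 20 permits is an assumed value.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Library-free model: available permits versus permits ever granted. */
final class PermitCheckDemo {

    /** Hypothetical bulkhead: a semaphore plus a counter of permits ever granted. */
    static final class CountingBulkhead {
        final Semaphore permits = new Semaphore(20);
        final AtomicInteger permitted = new AtomicInteger();

        <T> Supplier<T> decorate(Supplier<T> inner) {
            return () -> {
                if (!permits.tryAcquire()) {
                    throw new IllegalStateException("bulkhead full");
                }
                permitted.incrementAndGet();
                try {
                    return inner.get();
                } finally {
                    permits.release(); // released before the caller regains control
                }
            };
        }
    }

    /** Hypothetical breaker that is permanently open and rejects every call. */
    static <T> Supplier<T> openBreaker(Supplier<T> inner) {
        return () -> {
            throw new IllegalStateException("circuit open");
        };
    }

    /** Returns {available permits after the call, permits granted during the call}. */
    static int[] observe(boolean breakerOutside) {
        CountingBulkhead bulkhead = new CountingBulkhead();
        Supplier<String> call = () -> "ok";
        Supplier<String> chain = breakerOutside
                ? openBreaker(bulkhead.decorate(call))   // Breaker -> Bulkhead -> call
                : bulkhead.decorate(openBreaker(call));  // Bulkhead -> Breaker -> call
        try {
            chain.get();
        } catch (IllegalStateException rejected) {
            // rejected in both orderings
        }
        return new int[] {bulkhead.permits.availablePermits(), bulkhead.permitted.get()};
    }

    public static void main(String[] args) {
        int[] correct = observe(true);
        int[] wrong = observe(false);
        System.out.println(correct[0] + " available, " + correct[1] + " granted"); // 20 available, 0 granted
        System.out.println(wrong[0] + " available, " + wrong[1] + " granted");     // 20 available, 1 granted
    }
}
```

Both orderings leave 20 permits available after the call. Only the granted counter shows that the wrong ordering touched the bulkhead.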