Resilience in Reactive Pipelines
Resilience in Reactive Pipelines
Imperative resilience patterns wrap a synchronous call with decorators that catch exceptions, count failures, and impose timeouts. Reactive pipelines do not have synchronous calls. A Mono<FraudScore> is a description of a computation that has not executed yet. Decorating a Mono means transforming the reactive chain, not wrapping a method call. This changes how every pattern works.
The Execution Model Difference
In imperative code, the circuit breaker wraps a method call:
// Imperative: Thread blocks during execution
FraudScore score = circuitBreaker.executeSupplier(() ->
fraudClient.score(payment));
// Thread is consumed for the entire duration of the HTTP call
In reactive code, the circuit breaker transforms the pipeline:
// Reactive: Thread is released during I/O wait
Mono<FraudScore> score = fraudClient.scoreReactive(payment)
.transformDeferred(CircuitBreakerOperator.of(circuitBreaker));
// No thread is blocked. The circuit breaker checks state when the
// Mono is subscribed, and records the outcome when it completes.
The transformDeferred call does not execute anything. It returns a new Mono that, when subscribed, checks the circuit breaker state, forwards the subscription to the inner Mono if the breaker is closed, and records the outcome (success or failure) when the inner Mono completes.
This has three consequences:
Thread pool bulkheads are irrelevant. Reactive code does not consume a thread per request. A thread pool bulkhead limits concurrent threads, but reactive code uses event loop threads that handle thousands of concurrent requests. Semaphore bulkheads still work: they limit concurrent subscriptions regardless of the threading model.
Timeouts are operator-based. There is no Thread.sleep() or Future.get(timeout). The timeout is a Reactor operator (Mono.timeout()) that races the source Mono against a timer. If the timer fires first, the source is cancelled and a TimeoutException propagates through the reactive chain.
Retry is a resubscription. Imperative retry calls the method again. Reactive retry resubscribes to the Mono, which re-executes the entire reactive chain from the point of resubscription. If the chain includes upstream operators (map, filter, flatMap), those execute again.
Resilience4J Reactive Operators
Resilience4J provides reactive operators for all patterns:
<!-- PRODUCTION - Maven dependency -->
<dependency>
<groupId>io.github.resilience4j</groupId>
<artifactId>resilience4j-reactor</artifactId>
</dependency>
The operators are applied using transformDeferred:
// PRODUCTION - Full reactive resilience chain
@Service
public class ReactiveFraudService {
private final WebClient fraudWebClient;
private final CircuitBreaker circuitBreaker;
private final Retry retry;
private final Bulkhead bulkhead;
private final TimeLimiter timeLimiter;
public Mono<FraudScore> scoreFraud(PaymentRequest payment) {
return fraudWebClient.post()
.uri("/fraud/score")
.bodyValue(payment)
.retrieve()
.bodyToMono(FraudScore.class)
// Order matters: innermost to outermost
// (applied bottom-up, so first transformDeferred is innermost)
.transformDeferred(TimeLimiterOperator.of(timeLimiter))
.transformDeferred(BulkheadOperator.of(bulkhead))
.transformDeferred(CircuitBreakerOperator.of(circuitBreaker))
.transformDeferred(RetryOperator.of(retry));
}
}
Wait. The ordering appears reversed compared to the imperative version. In imperative composition, the outermost decorator is applied last (Retry wraps CircuitBreaker wraps Bulkhead). In reactive composition with transformDeferred, each call wraps the previous pipeline. The last transformDeferred (Retry) wraps everything above it, making it the outermost operator.
The execution order when subscribed:
- Retry (outermost): subscribes to the inner chain
- CircuitBreaker: checks state, forwards subscription if closed
- Bulkhead: acquires permit, forwards subscription
- TimeLimiter: starts timeout timer
- WebClient: sends HTTP request
On failure, the error propagates upward:
- TimeLimiter: if timeout, cancels the WebClient call
- Bulkhead: releases permit
- CircuitBreaker: records failure
- Retry: catches error, resubscribes (if retryable)
Reactive Circuit Breaker Specifics
// PRODUCTION - Circuit breaker with reactive fallback
public Mono<FraudScore> scoreFraudWithFallback(PaymentRequest payment) {
return fraudWebClient.post()
.uri("/fraud/score")
.bodyValue(payment)
.retrieve()
.bodyToMono(FraudScore.class)
.transformDeferred(CircuitBreakerOperator.of(circuitBreaker))
.onErrorResume(CallNotPermittedException.class, e ->
Mono.just(FraudScore.defaultPermit(payment)))
.onErrorResume(TimeoutException.class, e ->
Mono.just(FraudScore.defaultPermit(payment)));
}
The onErrorResume operator is the reactive equivalent of a catch block with fallback. CallNotPermittedException signals that the circuit breaker is open. TimeoutException signals that the call exceeded the time budget.
In reactive code, the fallback Mono.just(...) does not block a thread. It emits the fallback value on the same event loop thread that received the error signal.
Reactive Retry with Backoff
Reactor provides built-in retry operators that are more expressive than Resilience4J’s retry for reactive chains:
// PRODUCTION - Reactor retry with exponential backoff
public Mono<FraudScore> scoreFraudWithRetry(PaymentRequest payment) {
return fraudWebClient.post()
.uri("/fraud/score")
.bodyValue(payment)
.retrieve()
.bodyToMono(FraudScore.class)
.timeout(Duration.ofMillis(500))
.retryWhen(Retry.backoff(3, Duration.ofMillis(100))
.maxBackoff(Duration.ofSeconds(1))
.jitter(0.5)
.filter(throwable ->
throwable instanceof IOException ||
throwable instanceof TimeoutException)
.doBeforeRetry(signal ->
log.warn("Retry #{} for fraud scoring: {}",
signal.totalRetries(),
signal.failure().getMessage())))
.transformDeferred(CircuitBreakerOperator.of(circuitBreaker));
}
Retry.backoff creates a retry specification with exponential backoff and jitter. The filter predicate controls which exceptions trigger a retry. The doBeforeRetry callback logs each retry attempt. Reactor’s retry resubscribes to the entire chain from fraudWebClient.post(), meaning each retry creates a new HTTP request.
Reactive Timeout Semantics
// The difference between Mono.timeout() and TimeLimiterOperator
// Mono.timeout: cancels the upstream subscription
fraudWebClient.post()
.uri("/fraud/score")
.bodyValue(payment)
.retrieve()
.bodyToMono(FraudScore.class)
.timeout(Duration.ofMillis(500));
// When the timeout fires, Reactor sends a cancel signal upstream.
// WebClient receives the cancel signal and aborts the HTTP request.
// This is genuine cancellation: the connection is released.
// TimeLimiterOperator: wraps with a deadline
fraudWebClient.post()
.uri("/fraud/score")
.bodyValue(payment)
.retrieve()
.bodyToMono(FraudScore.class)
.transformDeferred(TimeLimiterOperator.of(timeLimiter));
// The TimeLimiter records the timeout in its metrics.
// The cancellation behavior depends on cancelRunningFuture config.
For reactive code, Mono.timeout() is the idiomatic choice. Use TimeLimiterOperator when you need the timeout to be recorded in Resilience4J metrics alongside other pattern metrics.
Testing Reactive Resilience
// PRODUCTION - StepVerifier test for circuit breaker behavior
@SpringBootTest
class ReactiveFraudServiceTest {
@Autowired
private ReactiveFraudService fraudService;
@MockBean
private WebClient fraudWebClient;
@Autowired
private CircuitBreakerRegistry cbRegistry;
@Test
void circuitBreakerOpens_returnsFallback() {
// Force circuit breaker open
CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");
cb.transitionToOpenState();
StepVerifier.create(
fraudService.scoreFraudWithFallback(samplePayment()))
.assertNext(score -> {
assertThat(score.isDefault()).isTrue();
assertThat(score.decision()).isEqualTo(Decision.PERMIT);
})
.verifyComplete();
}
@Test
void timeout_cancelsUpstream() {
// Simulate slow dependency
WebClient.RequestBodyUriSpec requestSpec = mock();
// ... configure mock to delay response by 5 seconds
StepVerifier.create(
fraudService.scoreFraudWithFallback(samplePayment()))
.assertNext(score ->
assertThat(score.isDefault()).isTrue())
.verifyComplete();
// Verify the circuit breaker recorded the timeout as a failure
CircuitBreaker cb = cbRegistry.circuitBreaker("fraudDetection");
assertThat(cb.getMetrics().getNumberOfFailedCalls())
.isGreaterThan(0);
}
}
StepVerifier is Reactor’s test harness. It subscribes to the Mono, asserts the emitted values, and verifies the completion signal. No thread blocking, no Thread.sleep(). The test is deterministic because the reactive chain executes synchronously when the source data is immediately available.