resilience patterns in production

The Financial Transaction Platform Architecture

Chapter 3 of 40

Every code example in this book lives within a single system. This section defines that system precisely so that subsequent chapters can reference services, latency targets, and failure modes without reintroduction.

Service Definitions

// Payment Service - The orchestrator
// Spring Boot 3.2+ with RestClient (synchronous HTTP client, introduced in Spring Framework 6.1)
@RestController
@RequestMapping("/api/payments")
public class PaymentController {

    private final FraudDetectionClient fraudClient;
    private final BalanceClient balanceClient;
    private final PaymentGatewayClient paymentGateway;
    private final NotificationClient notificationClient;
    private final AuditClient auditClient;

    // Constructor injection omitted for brevity

    @PostMapping
    public ResponseEntity<PaymentResult> initiatePayment(
            @Valid @RequestBody PaymentRequest request) {
        // This is the call chain every chapter in this book protects
        FraudScore score = fraudClient.score(request);        // External dependency
        Balance balance = balanceClient.reserve(request);     // Database-backed
        PaymentConfirmation conf = paymentGateway.charge(request); // Payment processor
        notificationClient.notify(request.userId(), conf);    // Non-critical
        auditClient.log(request, conf);                       // Regulatory
        return ResponseEntity.ok(new PaymentResult(conf, score, balance));
    }
}

Latency Targets

Each service has a normal latency profile and a degraded latency profile. These numbers are referenced throughout the book when setting timeout values, circuit breaker thresholds, and bulkhead sizes.

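| Service         | Normal p50 | Normal p99 | Degraded p99 | Failure Mode                 |
|-----------------|------------|------------|--------------|------------------------------|
| Fraud Detection | 40ms       | 120ms      | 5,000ms      | External API slowdown        |
| Balance Service | 15ms       | 80ms       | 2,000ms      | Database lock contention     |
| Payment Gateway | 200ms      | 800ms      | 15,000ms     | Third-party processor delays |
| Notification    | 100ms      | 500ms      | 30,000ms     | SMTP/SMS provider issues     |
| Audit Log       | 10ms       | 50ms       | 500ms        | Disk I/O saturation          |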

The “Degraded p99” column is the number that matters for resilience configuration. Your timeouts must sit above the normal p99, so that healthy calls are not cut off, and well below the degraded p99, so that a slow dependency releases its thread quickly. Your circuit breaker must open before these latencies consume your thread pool.
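One way to turn the table into timeout values is sketched below. It is an illustration, not the configuration the book arrives at: the class name `TimeoutBudget`, the `LatencyProfile` record, and the headroom factor of 2.0 are all assumptions introduced here. Each timeout is the normal p99 multiplied by the headroom factor, capped just below the degraded p99.

```java
import java.time.Duration;
import java.util.List;

// Sketch only: TimeoutBudget, LatencyProfile and the headroom factor are
// illustrative assumptions, not part of the platform's code base.
final class TimeoutBudget {

    // One row of the latency table (p50 is omitted because it is not used here).
    record LatencyProfile(String service, Duration normalP99, Duration degradedP99) {}

    static final List<LatencyProfile> PROFILES = List.of(
            new LatencyProfile("fraud-detection", Duration.ofMillis(120), Duration.ofMillis(5_000)),
            new LatencyProfile("balance-service", Duration.ofMillis(80), Duration.ofMillis(2_000)),
            new LatencyProfile("payment-gateway", Duration.ofMillis(800), Duration.ofMillis(15_000)),
            new LatencyProfile("notification", Duration.ofMillis(500), Duration.ofMillis(30_000)),
            new LatencyProfile("audit-log", Duration.ofMillis(50), Duration.ofMillis(500)));

    // Timeout = normal p99 * headroom, capped strictly below the degraded p99.
    static Duration timeoutFor(LatencyProfile profile, double headroom) {
        long candidate = Math.round(profile.normalP99().toMillis() * headroom);
        long ceiling = profile.degradedP99().toMillis() - 1;
        return Duration.ofMillis(Math.min(candidate, ceiling));
    }

    public static void main(String[] args) {
        for (LatencyProfile profile : PROFILES) {
            System.out.printf("%-16s timeout=%dms%n",
                    profile.service(), timeoutFor(profile, 2.0).toMillis());
        }
    }
}
```

With a headroom of 2.0, every derived timeout lands between the normal and degraded p99 of its service. The right factor depends on how much tail latency you are willing to cut off, so treat 2.0 as a placeholder.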

Failure Mode Catalogue

Each failure mode in this list maps to a chapter:

  1. External scoring API becomes slow (Chapters 2, 4, 6): Fraud detection response times rise by roughly two orders of magnitude, from a 40ms p50 to a 5,000ms degraded p99. Threads pile up until the payment service thread pool is exhausted.

  2. External scoring API becomes unreachable (Chapters 3, 4): TCP connections time out or are refused. The payment service must decide what to do without a fraud score.

  3. Balance database enters lock contention (Chapters 2, 8): Queries that normally take 15ms take up to 2 seconds. Every payment blocks waiting for its balance reservation.

  4. Notification provider rate-limits the service (Chapter 7): The SMS provider returns HTTP 429. Sending more requests makes it worse.

  5. Audit log disk fills up (Chapters 6, 17): Writes fail. The service must decide whether to block the payment or buffer and continue.

  6. Payment gateway returns intermittent errors (Chapter 5): 10% of requests fail with HTTP 500, but retrying immediately succeeds 95% of the time.

  7. Multiple dependencies degrade simultaneously (Chapters 9, 17): Fraud detection is slow and notification is down. The payment service must handle compound failures.

  8. The payment service itself receives a traffic spike (Chapters 7, 19): A flash sale sends 10x normal traffic. The service must shed load without crashing.
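The numbers in failure mode 6 can be worked through directly. If 10% of first attempts fail and an immediate retry succeeds 95% of the time, the share of requests that still fail after one retry is 0.10 × 0.05 = 0.005, or 0.5%. The sketch below generalises this to several retries. It assumes that each retry succeeds independently with the same probability, which is a simplification; the class and method names are hypothetical.

```java
// Sketch only: RetryMath is a hypothetical helper, and the model assumes that
// every retry succeeds independently with the same probability.
final class RetryMath {

    // Fraction of requests that still fail after the given number of retries.
    static double residualFailureRate(double firstAttemptFailure,
                                      double retrySuccess,
                                      int retries) {
        return firstAttemptFailure * Math.pow(1.0 - retrySuccess, retries);
    }

    public static void main(String[] args) {
        for (int retries = 0; retries <= 3; retries++) {
            double rate = residualFailureRate(0.10, 0.95, retries);
            System.out.printf("retries=%d residual failure rate=%.5f%%%n",
                    retries, rate * 100.0);
        }
    }
}
```

Under this model a single retry removes most of the failures, and each further retry adds load for a much smaller gain.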

Base Project Structure

payment-platform/
  payment-service/
    src/main/java/com/txn/payment/
      PaymentApplication.java
      controller/PaymentController.java
      client/FraudDetectionClient.java
      client/BalanceClient.java
      client/PaymentGatewayClient.java
      client/NotificationClient.java
      client/AuditClient.java
      config/ResilienceConfig.java
      config/RestClientConfig.java
      fallback/FraudFallback.java
      fallback/NotificationFallback.java
    src/main/resources/
      application.yml
    src/test/java/com/txn/payment/
      resilience/CircuitBreakerTest.java
      resilience/BulkheadTest.java
      resilience/RetryTest.java

The client package contains an HTTP client for each downstream service. Each client is progressively wrapped with resilience patterns as the book progresses. The fallback package contains the degraded responses. The config package contains the Resilience4j configuration that ties it all together.
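To make the relationship between the client and fallback packages concrete, here is a minimal plain-Java sketch of a client wrapped by a fallback. Everything in it is an assumption made for illustration: the shapes of `PaymentRequest` and `FraudScore`, the `degradedScore` method, and the `FallbackFraudClient` wrapper do not come from the book's code. The placeholder score value is arbitrary and does not imply a fail-open or fail-closed policy.

```java
// Sketch only: all types and signatures here are hypothetical stand-ins for
// the classes in the client and fallback packages.
final class FallbackSketch {

    record PaymentRequest(String userId, long amountCents) {}

    // The degraded flag lets callers tell a real score from a fallback score.
    record FraudScore(double value, boolean degraded) {}

    interface FraudDetectionClient {
        FraudScore score(PaymentRequest request);
    }

    // Produces the degraded response used when the real client fails.
    static final class FraudFallback {
        FraudScore degradedScore(PaymentRequest request) {
            return new FraudScore(0.5, true); // placeholder value, not a policy
        }
    }

    // Decorator: delegates to the real client and falls back on any runtime failure.
    static final class FallbackFraudClient implements FraudDetectionClient {
        private final FraudDetectionClient delegate;
        private final FraudFallback fallback;

        FallbackFraudClient(FraudDetectionClient delegate, FraudFallback fallback) {
            this.delegate = delegate;
            this.fallback = fallback;
        }

        @Override
        public FraudScore score(PaymentRequest request) {
            try {
                return delegate.score(request);
            } catch (RuntimeException e) {
                return fallback.degradedScore(request);
            }
        }
    }

    public static void main(String[] args) {
        FraudDetectionClient unreachable = request -> {
            throw new IllegalStateException("scoring API unreachable");
        };
        FraudDetectionClient client =
                new FallbackFraudClient(unreachable, new FraudFallback());
        System.out.println(client.score(new PaymentRequest("user-1", 4_200)));
    }
}
```

Because the wrapper implements the same interface as the client it wraps, the controller does not need to change when a pattern is added or removed.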

# application.yml - base configuration before resilience patterns
spring:
  application:
    name: payment-service

server:
  port: 8080
  tomcat:
    threads:
      max: 200
      min-spare: 10
    accept-count: 10 # Small queue, fast refusal
    connection-timeout: 5s # Wait up to 5s for the request line after accepting a connection

management:
  endpoints:
    web:
      exposure:
        include: health, metrics, prometheus
  metrics:
    tags:
      application: payment-service

This is the starting point. By the end of the book, the application.yml will include circuit breaker, retry, bulkhead, rate limiter, and time limiter configurations for every downstream dependency. Each configuration will have been derived from the failure modes and latency profiles defined in this section.
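The 200-thread limit in this base configuration puts a ceiling on throughput that follows from Little's law: concurrency = throughput × latency, so maximum throughput = threads / latency. The sketch below applies it to the latency table under simplifying assumptions: each request holds one Tomcat thread for the whole sequential call chain, the payment service's own processing time is ignored, and p50 values are summed as a rough stand-in for typical end-to-end latency. `ThreadCapacity` is a hypothetical helper.

```java
// Sketch only: ThreadCapacity is a hypothetical helper. It assumes one Tomcat
// thread per in-flight request and a strictly sequential call chain.
final class ThreadCapacity {

    static final int MAX_THREADS = 200; // server.tomcat.threads.max

    // Little's law rearranged: throughput ceiling = threads / latency.
    static double maxThroughputPerSecond(int threads, long latencyMillis) {
        return threads * 1000.0 / latencyMillis;
    }

    public static void main(String[] args) {
        // Sum of normal p50 values: fraud + balance + gateway + notification + audit.
        long normalMillis = 40 + 15 + 200 + 100 + 10;
        // Same chain with only fraud detection at its degraded p99.
        long fraudDegradedMillis = 5_000 + 15 + 200 + 100 + 10;

        System.out.printf("normal:         %.1f req/s%n",
                maxThroughputPerSecond(MAX_THREADS, normalMillis));
        System.out.printf("fraud degraded: %.1f req/s%n",
                maxThroughputPerSecond(MAX_THREADS, fraudDegradedMillis));
    }
}
```

Under these assumptions, a single slow dependency cuts the throughput ceiling by more than a factor of ten, which is why unprotected calls exhaust the thread pool so quickly.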