Testing Strategy in the Pipeline

Three categories of tests belong in a CI pipeline. Unit tests verify individual functions and classes without external dependencies. Integration tests verify that the service works correctly with its real dependencies (database, cache, message queue). Contract tests verify that the API a service exposes matches what its consumers expect.

End-to-end tests do not belong in the CI pipeline. They belong in a post-deployment verification step in the staging environment. They are slow and flaky, and they test the integration of the entire platform, not a single service’s correctness. Running them before pushing an image adds 10 to 30 minutes to every run and blocks the pipeline on failures caused by other services.

Test pyramid with pipeline stage placement

The diagram shows the test pyramid overlaid with pipeline stages. At the base, unit tests run in their own job immediately after the build (fast, many, high coverage). In the middle, integration tests and contract tests run in parallel jobs after the build (moderate speed, focused coverage). At the top, performance tests (Locust) run as a post-deployment gate in staging. End-to-end tests sit above the pyramid, outside the CI pipeline, with a label: “Post-deploy verification only.”

The Failure

The checkout service pipeline runs all tests sequentially in a single job: unit tests (90 seconds), integration tests (4 minutes), and contract tests (2 minutes). Total: 7.5 minutes of testing, plus 45 seconds of Docker Compose startup for the integration tests, or about 8 minutes 15 seconds end to end.

When a unit test fails, the developer waits 90 seconds to see the failure. When an integration test fails, they wait 90 seconds (unit tests) + 45 seconds (compose startup) + some integration test time before seeing the failure. When a contract test fails, they wait for everything before it.

The tests have no dependency on each other. Unit test results do not affect integration test execution. Running them in parallel cuts the feedback loop from roughly 8 minutes to under 5 minutes: the longest parallel branch is the integration job, at 45 seconds of Compose startup plus 4 minutes of tests.
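
As a rough check of the feedback-loop arithmetic, the following uses only the durations quoted for the checkout pipeline (90 s unit, 45 s Compose startup, 4 min integration, 2 min contract). It assumes runner startup and image pull time are negligible, which real parallel jobs will not quite achieve.

```latex
\begin{align*}
T_{\text{sequential}} &= 90\,\text{s} + 45\,\text{s} + 240\,\text{s} + 120\,\text{s}
                       = 495\,\text{s} = 8.25\ \text{min} \\
T_{\text{parallel}}   &= \max\bigl(90\,\text{s},\; 45\,\text{s} + 240\,\text{s},\; 120\,\text{s}\bigr)
                       = 285\,\text{s} = 4.75\ \text{min}
\end{align*}
```

The parallel figure is set entirely by the integration branch, so further speedups have to come from that suite or its startup time.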

The Mechanism

Pipeline Stage Placement

Test Type	Pipeline Stage	Dependencies	Duration	Gate
Unit	Parallel job after build	None	30s-2min	Blocks all downstream
Integration	Parallel job after build	Docker Compose (DB, Redis, dependencies)	3-6min	Blocks promotion
Contract	Parallel job after build	Pact broker or mock server	1-3min	Blocks promotion
Performance	Post-deploy job in staging	Running staging environment	3-10min	Blocks prod promotion
End-to-end	Post-deploy verification	Running staging environment	10-30min	Does not block

Unit, integration, and contract tests run in parallel after the build job. Each gets its own runner. The promotion job depends on all three. Any failure blocks promotion.

Performance tests run after deployment to staging (covered in CH17). They test the service in a realistic environment with realistic infrastructure. They gate promotion from staging to production.

End-to-end tests run after deployment to staging as a verification step. They do not block the pipeline. They alert the team if the platform-wide integration is broken. If they are flaky, the team investigates. They never become a gate because their failure rate makes them unreliable as a promotion signal.
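
A minimal sketch of how such a non-blocking verification step might look as a separate GitHub Actions workflow. The workflow name `deploy-staging`, the script `run-e2e.sh`, the variable `STAGING_URL`, and the secret `ALERT_WEBHOOK_URL` are all hypothetical placeholders, not part of the pipeline described here.

```yaml
# Sketch only: runs after a (hypothetically named) staging deploy workflow.
# Nothing in the CI workflow lists this job in `needs`, so it cannot block promotion.
name: e2e-verification
on:
  workflow_run:
    workflows: [deploy-staging]   # assumed name of the staging deploy workflow
    types: [completed]

jobs:
  e2e:
    # Only verify deployments that actually succeeded
    if: github.event.workflow_run.conclusion == 'success'
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - name: Run end-to-end suite against staging
        run: ./run-e2e.sh          # hypothetical script
        env:
          BASE_URL: ${{ vars.STAGING_URL }}   # hypothetical repository variable

      - name: Alert the team on failure
        if: failure()
        run: |
          curl -fsS -X POST -H 'Content-Type: application/json' \
            -d '{"text":"End-to-end verification failed on staging"}' \
            "$ALERT_WEBHOOK_URL"
        env:
          ALERT_WEBHOOK_URL: ${{ secrets.ALERT_WEBHOOK_URL }}   # hypothetical secret
```

Keeping this in its own workflow, rather than as a job in the CI workflow, makes it structurally impossible for a flaky end-to-end run to become a promotion gate by accident.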

The Implementation

# HARDENED: Parallel test execution with matrix strategy
name: ci
on: [push, pull_request]

env:
  IMAGE: ghcr.io/acme/checkout-service

# Least-privilege token: read the repo, push the image to ghcr.io
permissions:
  contents: read
  packages: write

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ env.IMAGE }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  test:
    runs-on: ubuntu-latest
    needs: [build]
    strategy:
      fail-fast: false
      matrix:
        include:
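          # timeout values are in minutes (they feed timeout-minutes below)
          # and include the time to pull the image
          - suite: unit
            compose: false
            timeout: 5
          - suite: integration
            compose: true
            timeout: 10
          - suite: contract
            compose: false
            timeout: 5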
    steps:
      - uses: actions/checkout@v4

      # Log in so the runner can pull the image if the package is private
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Start dependencies
        if: matrix.compose
        # --wait blocks until every service is running (or healthy, where a
        # healthcheck is defined), so no extra sleep is needed
        run: docker compose -f docker-compose.test.yml up -d --wait

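      - name: Run ${{ matrix.suite }} tests
        timeout-minutes: ${{ matrix.timeout }}
        run: |
          # Mount a host directory so the results outlive the container and the
          # upload step can find them. /app/test-results is an assumed path;
          # match it to wherever run-tests.sh writes its reports.
          mkdir -p test-results
          docker run --rm \
            ${{ matrix.compose && '--network=host' || '' }} \
            -v "$PWD/test-results:/app/test-results" \
            -e TEST_SUITE=${{ matrix.suite }} \
            ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
            ./run-tests.sh ${{ matrix.suite }}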

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results-${{ matrix.suite }}
          path: test-results/
          retention-days: 14

      - name: Cleanup
        if: matrix.compose && always()
        run: docker compose -f docker-compose.test.yml down -v

  scan:
    runs-on: ubuntu-latest
    needs: [build]
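    steps:
      # Pin to a release tag (or a commit SHA) rather than a moving branch
      - uses: aquasecurity/trivy-action@0.28.0
        with:
          image-ref: ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }}
          exit-code: '1'
          severity: CRITICAL,HIGH
        env:
          # Registry credentials, needed if the ghcr.io package is private
          TRIVY_USERNAME: ${{ github.actor }}
          TRIVY_PASSWORD: ${{ secrets.GITHUB_TOKEN }}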

  promote:
    runs-on: ubuntu-latest
    needs: [test, scan]
    if: github.ref == 'refs/heads/main'
    steps:
      - run: echo "All gates passed"

Contract Testing with Pact

The checkout service consumes APIs from three services: catalog (price lookup), inventory (reservation), and payments (charge). If any of these services changes its API in a backward-incompatible way, the checkout service breaks.

Contract tests verify that the API consumer’s expectations match what the provider actually returns. The consumer writes tests that define the expected request/response pairs. These expectations are published to a Pact broker. The provider runs verification tests against those expectations.

# checkout-service: Consumer-side contract test in CI
# Publishes pacts to the broker after running consumer tests
- name: Run contract tests
  run: |
    # Keep the generated pact files on the runner so the publish step can
    # read them. /app/pacts is an assumed output path inside the image.
    mkdir -p pacts
    docker run --rm \
      -v "$PWD/pacts:/app/pacts" \
      -e PACT_BROKER_URL=${{ secrets.PACT_BROKER_URL }} \
      -e PACT_BROKER_TOKEN=${{ secrets.PACT_BROKER_TOKEN }} \
      ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
      ./run-tests.sh contract

- name: Publish pacts
  if: github.ref == 'refs/heads/main'
  run: |
    docker run --rm \
      -v "$PWD/pacts:/app/pacts" \
      -e PACT_BROKER_URL=${{ secrets.PACT_BROKER_URL }} \
      -e PACT_BROKER_TOKEN=${{ secrets.PACT_BROKER_TOKEN }} \
      ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
      ./publish-pacts.sh --consumer-version=${{ github.sha }}

# catalog-service: Provider-side verification triggered by webhook
# When checkout publishes new pacts, the catalog service verifies them
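on:
  repository_dispatch:
    types: [pact-verification]

env:
  # Assumed image name, mirroring the checkout service's naming
  IMAGE: ghcr.io/acme/catalog-service

jobs:
  verify-pacts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Verify pacts
        run: |
          docker compose -f docker-compose.test.yml up -d --wait
          # Verify the image tagged with this commit (assumes the build tags
          # images by SHA) so the reported provider version matches what ran
          docker run --rm --network=host \
            -e PACT_BROKER_URL=${{ secrets.PACT_BROKER_URL }} \
            -e PACT_BROKER_TOKEN=${{ secrets.PACT_BROKER_TOKEN }} \
            ${{ env.IMAGE }}:${{ github.sha }} \
            ./verify-pacts.sh --provider-version=${{ github.sha }}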

The Gate

The promote job depends on all matrix variants of test and the scan job. The fail-fast: false setting ensures all test suites run to completion even if one fails. The developer sees all failures at once.
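
Because promote only runs on main, it is skipped on pull requests, and GitHub reports a skipped job as passing for required status checks. One way to get a single check that reflects the real outcome is an aggregate job, sketched below. The job name `gate` is hypothetical; the `test` and `scan` job names are the ones from the workflow.

```yaml
jobs:
  # ... build, test, and scan as defined in the workflow ...

  gate:                        # hypothetical job name
    runs-on: ubuntu-latest
    needs: [test, scan]
    if: always()               # run even when an upstream job failed or was skipped
    steps:
      - name: Require success from every upstream job
        run: |
          # needs.test.result is "success" only if every matrix variant passed
          test "${{ needs.test.result }}" = "success"
          test "${{ needs.scan.result }}" = "success"
```

Marking `gate` as the required check in branch protection would then block a merge whenever any suite or the scan fails, without listing each matrix variant individually.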

Contract test failures are particularly important gates because they catch backward-incompatible API changes before deployment. A broken contract between checkout and inventory means the checkout flow will fail in production. The contract test catches this at build time, not at deploy time.

The Recovery

Unit test failure: Fix the code. The test is the spec.

Integration test failure: Could be a code bug, a test environment issue, or a flaky test. Check if the test passes locally. If it fails consistently, fix the code. If it passes locally but fails in CI, the test environment differs (Docker Compose resource limits, network timing, startup ordering).
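
Startup-ordering problems in CI often trace back to missing health checks, because `docker compose up --wait` can only wait on what each service reports. A minimal sketch of what docker-compose.test.yml might contain, assuming PostgreSQL and Redis as the database and cache (the text names only "DB, Redis"); the images, versions, and credentials are illustrative.

```yaml
# docker-compose.test.yml (sketch; service choices and versions are assumptions)
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: test      # throwaway credential for the test run only
    ports:
      - "5432:5432"                # reachable from a container using --network=host
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 2s
      timeout: 2s
      retries: 15

  redis:
    image: redis:7
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 2s
      timeout: 2s
      retries: 15
```

With health checks in place, `--wait` returns only once both services accept connections, which removes one common source of passes-locally, fails-in-CI behavior.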

Contract test failure (consumer side): The consumer’s expectations, or the client code exercising them, no longer match the actual API. Update the consumer test and client to match the API. Or, if the consumer needs a new API feature, coordinate with the provider team.

Contract test failure (provider side): The provider made a backward-incompatible change. Revert the change or use the expand-contract pattern (CH9) to support both the old and new API versions.

Flaky test: Quarantine it. Move the test to a separate non-blocking job. Track it. Fix it within a defined time window (one sprint). If it cannot be fixed, delete it and replace it with a more reliable test.
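
A sketch of a quarantine job along these lines, assuming `run-tests.sh` accepts a hypothetical `quarantine` suite name that selects the quarantined tests. The job name is also hypothetical.

```yaml
jobs:
  # ... build, test, scan, and promote as defined in the workflow ...

  quarantine:                  # hypothetical job name
    runs-on: ubuntu-latest
    needs: [build]
    continue-on-error: true    # a failure here does not fail the workflow run
    steps:
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Run quarantined tests
        run: |
          # "quarantine" is an assumed suite name, not one defined in the text
          docker run --rm \
            ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
            ./run-tests.sh quarantine
```

The promote job does not list `quarantine` in its `needs`, so these tests keep running and reporting on every commit without affecting promotion.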