Testing Strategy in the Pipeline: Unit, Integration, Contract, and Where Each Lives
Three categories of tests belong in a CI pipeline. Unit tests verify individual functions and classes without external dependencies. Integration tests verify that the service works correctly with its real dependencies (database, cache, message queue). Contract tests verify that the API a service exposes matches what its consumers expect.
End-to-end tests do not belong in the CI pipeline. They belong in a post-deployment verification step in the staging environment. They are slow and flaky, and they test the integration of the entire platform, not a single service’s correctness. Running end-to-end tests before pushing an image wastes 15 minutes and blocks the pipeline on failures caused by other services.
The diagram shows the test pyramid overlaid with pipeline stages. At the base, unit tests run in their own job as soon as the build finishes (fast, many, high coverage). In the middle, integration tests and contract tests run alongside them in parallel jobs (moderate speed, focused coverage). At the top, performance tests (Locust) run as a post-deployment gate in staging. End-to-end tests sit above the pyramid, outside the CI pipeline, with a label: “Post-deploy verification only.”
The Failure
The checkout service pipeline runs all tests sequentially in a single job: unit tests (90 seconds), integration tests (4 minutes), and contract tests (2 minutes). Total: 7.5 minutes of testing, plus 45 seconds of Docker Compose startup for the integration tests, or a little over 8 minutes end to end.
When a unit test fails, the developer waits 90 seconds to see the failure. When an integration test fails, they wait 90 seconds (unit tests) + 45 seconds (compose startup) + some integration test time before seeing the failure. When a contract test fails, they wait for everything before it.
The tests have no dependency on each other. Unit test results do not affect integration test execution. Running them in parallel cuts the feedback loop from about 8 minutes to under 5 minutes (the longest parallel branch: 45 seconds of Compose startup plus 4 minutes of integration tests).
The Mechanism
Pipeline Stage Placement
| Test Type | Pipeline Stage | Dependencies | Duration | Gate |
|---|---|---|---|---|
| Unit | Parallel job after build | None | 30s-2min | Blocks promotion |
| Integration | Parallel job after build | Docker Compose (DB, Redis, dependencies) | 3-6min | Blocks promotion |
| Contract | Parallel job after build | Pact broker or mock server | 1-3min | Blocks promotion |
| Performance | Post-deploy job in staging | Running staging environment | 3-10min | Blocks prod promotion |
| End-to-end | Post-deploy verification | Running staging environment | 10-30min | Does not block |
Unit, integration, and contract tests run in parallel after the build job. Each gets its own runner. The promotion job depends on all three. Any failure blocks promotion.
Performance tests run after deployment to staging (covered in CH17). They test the service in a realistic environment with realistic infrastructure. They gate promotion from staging to production.
End-to-end tests run after deployment to staging as a verification step. They do not block the pipeline. They alert the team if the platform-wide integration is broken. If they are flaky, the team investigates. They never become a gate because their failure rate makes them unreliable as a promotion signal.
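The non-blocking behavior described above can be sketched as a job that runs after the staging deployment and reports failures without failing the workflow. This is an illustration under assumptions, not the platform's actual configuration: the deploy-staging job, run-e2e.sh, and notify-team.sh are hypothetical names that do not appear elsewhere in this chapter.

```yaml
# Hypothetical fragment of a staging deployment workflow.
# Job names and scripts are assumptions made for illustration.
jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    steps:
      - run: echo "deploy to staging here"

  e2e-verify:
    runs-on: ubuntu-latest
    needs: [deploy-staging]
    # A failed end-to-end run is reported but does not fail the workflow run.
    continue-on-error: true
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - name: Run end-to-end suite against staging
        run: ./run-e2e.sh            # hypothetical script
      - name: Alert the team on failure
        if: failure()
        run: ./notify-team.sh "End-to-end verification failed in staging"  # hypothetical script
```

No other job lists e2e-verify in its needs, so a red end-to-end run produces an alert and nothing else.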
The Implementation
# HARDENED: Parallel test execution with matrix strategy
name: ci
on: [push, pull_request]

env:
  IMAGE: ghcr.io/acme/checkout-service

# GITHUB_TOKEN needs package write access to push to ghcr.io
permissions:
  contents: read
  packages: write

jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4
      # The gha cache backend is not supported by the default docker driver
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ env.IMAGE }}:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  test:
    runs-on: ubuntu-latest
    needs: [build]
    strategy:
      fail-fast: false
      matrix:
        include:
          # timeout is in minutes; it feeds timeout-minutes below
          - suite: unit
            compose: false
            timeout: 2
          - suite: integration
            compose: true
            timeout: 10
          - suite: contract
            compose: false
            timeout: 5
    steps:
      - uses: actions/checkout@v4
      # Log in so the runner can pull the image if the package is private
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Start dependencies
        if: matrix.compose
        # --wait blocks until services are running (or healthy, if they define a healthcheck)
        run: docker compose -f docker-compose.test.yml up -d --wait
      - name: Run ${{ matrix.suite }} tests
        timeout-minutes: ${{ matrix.timeout }}
        run: |
          mkdir -p test-results
          # Assumption: run-tests.sh writes its reports to /app/test-results in the container
          docker run --rm \
            ${{ matrix.compose && '--network=host' || '' }} \
            -v "${{ github.workspace }}/test-results:/app/test-results" \
            -e TEST_SUITE=${{ matrix.suite }} \
            ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
            ./run-tests.sh ${{ matrix.suite }}
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results-${{ matrix.suite }}
          path: test-results/
          retention-days: 14
      - name: Cleanup
        if: matrix.compose && always()
        run: docker compose -f docker-compose.test.yml down -v

  scan:
    runs-on: ubuntu-latest
    needs: [build]
    steps:
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      # @master is a mutable ref; pin to a release tag or commit SHA in production
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }}
          exit-code: 1
          severity: CRITICAL,HIGH

  promote:
    runs-on: ubuntu-latest
    needs: [test, scan]
    if: github.ref == 'refs/heads/main'
    steps:
      - run: echo "All gates passed"
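The Start dependencies step relies on docker compose up --wait, which waits for a healthy state only for services that declare a health check. A minimal sketch of what docker-compose.test.yml might contain follows. It assumes PostgreSQL and Redis as the database and cache, which this chapter does not specify; the service names, image tags, and password are placeholders.

```yaml
# docker-compose.test.yml (sketch; images, names, and credentials are assumptions)
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: test
    ports:
      - "5432:5432"        # published so tests using --network=host can reach it
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 2s
      timeout: 3s
      retries: 15

  redis:
    image: redis:7
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 2s
      timeout: 3s
      retries: 15
```

With health checks in place, the workflow does not need a fixed sleep after startup: the step returns once both services report healthy, or fails if they never do.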
Contract Testing with Pact
The checkout service consumes APIs from three services: catalog (price lookup), inventory (reservation), and payments (charge). If any of these services changes their API in a backward-incompatible way, the checkout service breaks.
Contract tests verify that the API consumer’s expectations match what the provider actually returns. The consumer writes tests that define the expected request/response pairs. These expectations are published to a Pact broker. The provider runs verification tests against those expectations.
# checkout-service: Consumer-side contract test in CI
# Publishes pacts to the broker after running consumer tests
# Assumption: both scripts read and write pact files under /app/pacts, which is
# mounted from the runner so the files outlive the first --rm container
- name: Run contract tests
  run: |
    mkdir -p pacts
    docker run --rm \
      -v "${{ github.workspace }}/pacts:/app/pacts" \
      -e PACT_BROKER_URL=${{ secrets.PACT_BROKER_URL }} \
      -e PACT_BROKER_TOKEN=${{ secrets.PACT_BROKER_TOKEN }} \
      ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
      ./run-tests.sh contract
- name: Publish pacts
  if: github.ref == 'refs/heads/main'
  run: |
    docker run --rm \
      -v "${{ github.workspace }}/pacts:/app/pacts" \
      -e PACT_BROKER_URL=${{ secrets.PACT_BROKER_URL }} \
      -e PACT_BROKER_TOKEN=${{ secrets.PACT_BROKER_TOKEN }} \
      ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
      ./publish-pacts.sh --consumer-version=${{ github.sha }}
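# catalog-service: Provider-side verification triggered by webhook
# When checkout publishes new pacts, the catalog service verifies them
on:
  repository_dispatch:
    types: [pact-verification]

env:
  IMAGE: ghcr.io/acme/catalog-service  # assumed name, mirroring the checkout image

jobs:
  verify-pacts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Start dependencies
        run: docker compose -f docker-compose.test.yml up -d --wait
      - name: Verify pacts
        # Verify the image built from this commit so the reported provider version
        # matches the code under test (assumes images are tagged by commit SHA)
        run: |
          docker run --rm --network=host \
            -e PACT_BROKER_URL=${{ secrets.PACT_BROKER_URL }} \
            -e PACT_BROKER_TOKEN=${{ secrets.PACT_BROKER_TOKEN }} \
            ${{ env.IMAGE }}:${{ github.sha }} \
            ./verify-pacts.sh --provider-version=${{ github.sha }}
      - name: Cleanup
        if: always()
        run: docker compose -f docker-compose.test.yml down -v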
The Gate
The promote job depends on all matrix variants of test and the scan job. The fail-fast: false setting ensures all test suites run to completion even if one fails. The developer sees all failures at once.
Contract test failures are particularly important gates because they catch backward-incompatible API changes before deployment. A broken contract between checkout and inventory means the checkout flow will fail in production. The contract test catches this at build time, not at deploy time.
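One common way to turn provider verification results into a promotion gate is the Pact Broker's can-i-deploy check. The job below is a sketch under assumptions rather than part of the pipeline shown earlier: it assumes the pact-broker CLI is installed on the runner, that the consumer is registered in the broker as checkout-service, and that the broker has an environment named production.

```yaml
# Hypothetical extra gate for the checkout-service workflow.
# Assumes the pact-broker CLI is available on the runner.
jobs:
  can-i-deploy:
    runs-on: ubuntu-latest
    needs: [test]
    if: github.ref == 'refs/heads/main'
    env:
      # The CLI reads the broker location and token from these variables
      PACT_BROKER_BASE_URL: ${{ secrets.PACT_BROKER_URL }}
      PACT_BROKER_TOKEN: ${{ secrets.PACT_BROKER_TOKEN }}
    steps:
      - name: Check provider verification results
        run: |
          pact-broker can-i-deploy \
            --pacticipant checkout-service \
            --version "${{ github.sha }}" \
            --to-environment production \
            --retry-while-unknown 12 \
            --retry-interval 10
```

The retry flags give the webhook-triggered provider verification time to report back before the check gives up. Adding can-i-deploy to the promote job's needs list would make promotion wait on it.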
The Recovery
Unit test failure: Fix the code. The test is the spec.
Integration test failure: Could be a code bug, a test environment issue, or a flaky test. Check if the test passes locally. If it fails consistently, fix the code. If it passes locally but fails in CI, the test environment differs (Docker Compose resource limits, network timing, startup ordering).
Contract test failure (consumer side): The consumer’s expectations no longer match the API it calls. If the expectations are wrong, update the consumer test to match the actual API. If the consumer needs a new API feature, coordinate with the provider team.
Contract test failure (provider side): The provider made a backward-incompatible change. Revert the change or use the expand-contract pattern (CH9) to support both the old and new API versions.
Flaky test: Quarantine it. Move the test to a separate non-blocking job. Track it. Fix it within a defined time window (one sprint). If it cannot be fixed, delete it and replace it with a more reliable test.
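The quarantine step can be sketched as one more job in the same workflow. It assumes run-tests.sh accepts a quarantine suite name, which is hypothetical; everything else reuses names from the pipeline above.

```yaml
# Fragment of the ci workflow; sits alongside build, test, scan, and promote.
jobs:
  quarantine:
    runs-on: ubuntu-latest
    needs: [build]
    # Failures are visible in the run but never fail the workflow
    continue-on-error: true
    steps:
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Run quarantined tests
        run: |
          docker run --rm \
            ${{ env.IMAGE }}@${{ needs.build.outputs.image-digest }} \
            ./run-tests.sh quarantine   # hypothetical suite name
```

Because promote does not list quarantine in its needs, a quarantined test can fail without blocking promotion while its results stay visible for tracking.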