Artifact Passing and Cache Strategy

The Failure

The frontend shell pipeline builds a Next.js application in the build job and needs the output in the deploy job. The team passes the build output using actions/upload-artifact and actions/download-artifact. The build output is 180 MB. Uploading takes 45 seconds. Downloading takes 30 seconds. This happens on every pipeline run.

Meanwhile, the same pipeline caches node_modules with a key of node-modules-v1. The cache was created three months ago. Dependencies have changed seven times since then. The cache still hits because the key never changes. The pipeline restores stale dependencies and runs npm install on top of them. Sometimes this works. Sometimes it produces a node_modules directory that differs from a clean npm ci. The team does not know which builds used stale dependencies because the cache key does not encode the dependency state.

The Mechanism

GitHub Actions provides three mechanisms for passing data between jobs:

Job outputs ($GITHUB_OUTPUT). Small string values: image tags, version numbers, SHA digests, URLs. Passed through workflow syntax: ${{ needs.build.outputs.image-tag }}. Outputs are limited to 1 MB per job and 50 MB in total per workflow run. Use this for metadata.

Artifacts (actions/upload-artifact, actions/download-artifact). Files and directories. Uploaded to GitHub’s artifact storage, available to downstream jobs and after the workflow completes. Retained for 90 days by default. Use this for build outputs, test reports, and SBOM files.

Caches (actions/cache). Files and directories. Keyed by a string. Restored at the start of a job if the key matches. Saved at the end of the job if no cache existed for the key. Scoped to the branch (with fallback to the default branch). Evicted after 7 days of inactivity or when the 10 GB per-repository limit is reached. Use this for dependencies and build tool caches that are expensive to recreate.

The distinction: artifacts carry data forward in the current pipeline run. Caches carry data forward across pipeline runs. Outputs carry small values between jobs without storage overhead.

The Implementation

Job Outputs for Metadata

# HARDENED: Image tag passed as job output, not hardcoded in downstream jobs
jobs:
  build:
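    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # lets GITHUB_TOKEN push to ghcr.io
    outputs:
      image-tag: ${{ steps.meta.outputs.version }}
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4

      # push: true below fails without an authenticated registry session
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}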

      - name: Image metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/acme/frontend-shell
          tags: type=sha,prefix=,format=short

      - name: Build and push
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}

  deploy:
    needs: [build]
    runs-on: ubuntu-latest
    steps:
      - name: Use image reference
        run: |
          echo "Deploying ghcr.io/acme/frontend-shell:${{ needs.build.outputs.image-tag }}"
          echo "Digest: ${{ needs.build.outputs.image-digest }}"
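
Not every value comes from an action's own outputs. As a minimal sketch, assuming a hypothetical step id `version` and output name `app-version` (neither appears in the pipeline above), a run step can publish its own value by appending a name=value line to the file that $GITHUB_OUTPUT points to:

```yaml
# Sketch only: the step id, output name, and version source are illustrative.
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      app-version: ${{ steps.version.outputs.app-version }}
    steps:
      - uses: actions/checkout@v4

      - name: Read version from package.json
        id: version
        run: |
          # Append name=value to the file GitHub exposes as $GITHUB_OUTPUT
          echo "app-version=$(node -p "require('./package.json').version")" >> "$GITHUB_OUTPUT"

  deploy:
    needs: [build]
    runs-on: ubuntu-latest
    steps:
      - name: Use the version
        env:
          APP_VERSION: ${{ needs.build.outputs.app-version }}
        run: echo "Deploying version $APP_VERSION"
```

Passing the output through env, rather than expanding it inside the script text, keeps the value out of the shell's parsing path. That matters more when an output could contain user-controlled text.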

Artifacts for Test Reports

# HARDENED: Test reports uploaded as artifacts for debugging and audit
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run tests
        run: npm run test -- --reporter=junit --output-file=test-results.xml
        continue-on-error: true
        id: tests

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results
          path: test-results.xml
          retention-days: 30

      - name: Fail if tests failed
        if: steps.tests.outcome == 'failure'
        run: exit 1

The pattern: run tests with continue-on-error: true on the test step so the artifact upload always runs, then explicitly fail in a subsequent step if tests failed. This ensures test reports are always available for debugging, even on failure.
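
The same two actions can carry a build output from one job to the next within a single run. The following is a sketch under stated assumptions, not the team's actual workflow: the job names, the artifact name next-build, and the .next output path are illustrative.

```yaml
# Sketch only: job names, artifact name, and the .next path are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build

      - name: Upload build output
        uses: actions/upload-artifact@v4
        with:
          name: next-build
          path: .next
          include-hidden-files: true   # .next is a dot-directory, skipped by default
          if-no-files-found: error     # fail loudly if the build produced nothing
          retention-days: 1            # only this run needs it

  deploy:
    needs: [build]
    runs-on: ubuntu-latest
    steps:
      - name: Download build output
        uses: actions/download-artifact@v4
        with:
          name: next-build
          path: .next
```

A short retention-days value suits artifacts that exist only to hand data to a downstream job. Reports kept for audit, like the test results above, warrant a longer one.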

Dependency Caching with Hash-Based Keys

# FRAGILE: Static cache key, never invalidated
- uses: actions/cache@v4
  with:
    path: node_modules
    key: node-modules-v1

# HARDENED: Hash-based key tied to lock file content
- uses: actions/cache@v4
  with:
    path: |
      node_modules
      ~/.npm
    key: deps-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
    restore-keys: |
      deps-${{ runner.os }}-

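The hashFiles('package-lock.json') function computes a SHA-256 hash of the lock file. When dependencies change, the lock file changes, the hash changes, and the exact key no longer matches. The restore-keys fallback then restores the most recently created cache whose key starts with the deps-<os>- prefix, and a new entry is saved under the new key if the job succeeds. Install with npm ci rather than npm install: npm ci deletes any restored node_modules and installs exactly what the lock file specifies, so on a fallback restore the benefit comes from the ~/.npm download cache, not from the stale node_modules directory.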
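
The cache step also reports whether the exact key matched, which lets the install step run only when it is needed. A minimal sketch, assuming a hypothetical step id `deps`:

```yaml
# Sketch only: the step id "deps" is an assumption for illustration.
- uses: actions/cache@v4
  id: deps
  with:
    path: |
      node_modules
      ~/.npm
    key: deps-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
    restore-keys: |
      deps-${{ runner.os }}-

- name: Install dependencies
  # cache-hit is 'true' only for an exact key match, never for a restore-keys match
  if: steps.deps.outputs.cache-hit != 'true'
  run: npm ci
```

Skipping the install on an exact hit trusts that the cached node_modules matches the lock file it was keyed on. If the Node.js version can vary between runs, it probably belongs in the key as well.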

Docker Layer Caching

# HARDENED: GitHub Actions cache backend for Docker layers
- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: ${{ env.IMAGE }}:${{ github.sha }}
    cache-from: type=gha
    cache-to: type=gha,mode=max

The type=gha cache backend stores Docker build layers in the GitHub Actions cache. mode=max exports layers from every build stage, including intermediate ones, rather than only the layers of the final image (the default, mode=min). On subsequent builds, unchanged layers are restored from cache instead of rebuilt. This reduces build time for the checkout service from 3 minutes to 45 seconds when only application code changes (the dependency installation layer is cached).
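
The gha backend is not available with the default docker build driver, so the build step is normally preceded by a step that creates a BuildKit builder. The sketch below shows one way to do that. The scope value is a hypothetical name, useful when one repository builds several images that should not overwrite each other's layer cache.

```yaml
# Sketch only: the scope value is a hypothetical name.
- name: Set up Buildx
  uses: docker/setup-buildx-action@v3

- name: Build and push
  uses: docker/build-push-action@v6
  with:
    context: .
    push: true
    tags: ${{ env.IMAGE }}:${{ github.sha }}
    cache-from: type=gha,scope=frontend-shell
    cache-to: type=gha,mode=max,scope=frontend-shell
```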

The Gate

The cache itself is not a gate. But a poisoned cache can bypass gates. If a cache contains compromised dependencies from a previous build, and the cache key does not change when dependencies change, the build uses compromised code without triggering a dependency scan.

The gate is the cache key design. A key that includes hashFiles('package-lock.json') changes whenever the lock file changes, so an entry built from the old dependency set is never an exact match for the new one. The dependency scan in the downstream job then runs against the dependencies actually resolved for this commit, not a stale cache.
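
To make the downstream scan concrete, here is a minimal sketch that uses npm audit as a stand-in scanner. It is not necessarily the scanner this pipeline uses; the job names and the severity threshold are assumptions.

```yaml
# Sketch only: job names and the audit threshold are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci

  dependency-scan:
    needs: [build]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Audit the lock file
        # Reads package-lock.json directly, so no cached node_modules is involved
        run: npm audit --package-lock-only --audit-level=high
```

Scanning from the lock file rather than from a restored directory keeps the gate independent of whatever the cache happened to contain.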

The Recovery

When a cache is suspected of being poisoned (e.g., after a supply chain incident affecting a dependency that was cached), delete the cache entries for the affected repository:

# List caches for the repository
gh cache list --repo acme/checkout-service

# Delete a specific cache by key
gh cache delete "deps-Linux-abc123def456" --repo acme/checkout-service

# Nuclear option: delete all caches
# (gh cache list returns only the first 30 entries by default, so piping it into delete can miss some)
gh cache delete --all --repo acme/checkout-service

The next pipeline run creates fresh caches from clean dependency resolution. Chapter 4 covers how to detect compromised dependencies before they enter the cache through SBOM generation and vulnerability scanning.
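
During an incident it can help to have the purge available as a manually triggered workflow rather than depending on someone's local gh session. A sketch under stated assumptions: the workflow name is illustrative, and the job relies on the gh CLI preinstalled on GitHub-hosted runners.

```yaml
# Sketch only: the workflow name is an assumption.
name: Purge caches
on:
  workflow_dispatch:

permissions:
  actions: write   # required to delete caches with GITHUB_TOKEN

jobs:
  purge:
    runs-on: ubuntu-latest
    steps:
      - name: Delete all caches in this repository
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
        run: gh cache delete --all
```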

Cache Key Design Rules

Data Type	Cache Key Pattern	Invalidation Trigger
Node.js dependencies	`deps-${{ runner.os }}-${{ hashFiles('package-lock.json') }}`	Any dependency change
Go modules	`go-${{ runner.os }}-${{ hashFiles('go.sum') }}`	Any module change
Docker layers	`type=gha` (managed by BuildKit)	Any layer input change
Gradle build cache	`gradle-${{ runner.os }}-${{ hashFiles('**/*.gradle*', '**/gradle-wrapper.properties') }}`	Any build script change

Every cache key includes the runner OS (caches are not portable across operating systems) and a hash of the file that defines the cached content. Never use a static key. Never use a timestamp. Never use the branch name alone (multiple commits on the same branch should not share a cache if dependencies changed between them).
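
Applied to the Go modules row of the table, the rules might look like the following sketch. The paths are the usual Linux defaults for the Go module and build caches and are an assumption here; other runner operating systems use different locations.

```yaml
# Sketch only: paths assume the Linux default Go cache locations.
- uses: actions/cache@v4
  with:
    path: |
      ~/go/pkg/mod
      ~/.cache/go-build
    key: go-${{ runner.os }}-${{ hashFiles('go.sum') }}
    restore-keys: |
      go-${{ runner.os }}-
```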