CDN Cache Partitioning and Edge Configuration
The Symptom
The e-commerce platform deployed a CDN with default settings. The expected cache hit rate for static assets was 90%+. The actual cache hit rate was 62%. The CDN was caching resources, but 38% of requests, more than a third, missed the cache even though the assets were the same immutable files.
The Cause
Three factors reduce CDN cache hit rates below the theoretical maximum:
- Vary header fragmentation: The origin server sends Vary: Accept-Encoding, which causes the CDN to store separate cached copies for each Accept-Encoding value the client sends. Most browsers send Accept-Encoding: gzip, deflate, br, but some older browsers send Accept-Encoding: gzip, deflate. These are different cache keys, so the CDN stores two copies of the same file and some requests miss because the exact encoding combination was not cached.
- Query string variation: Marketing campaigns append tracking parameters (?utm_source=email&utm_campaign=spring). If the CDN includes query strings in the cache key, each unique combination creates a separate cache entry. One URL with 20 different tracking parameter combinations creates 20 cached copies.
- Geographic distribution: A CDN has dozens of edge locations. Each edge maintains its own cache. A user in Frankfurt hits the Frankfurt edge. A user in Mumbai hits the Mumbai edge. If a resource was cached in Frankfurt but not yet requested from Mumbai, the Mumbai request misses. Low-traffic edge locations have lower cache hit rates because the cache is populated by demand.
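To make the fragmentation concrete, here is a minimal sketch of how raw Accept-Encoding values multiply cache variants, and how collapsing them into a few buckets reduces the count. The helper `normalizeAcceptEncoding` and the sample header values are hypothetical, and normalization is shown only as an illustration of the key-space problem; it ignores q-values such as `br;q=0`. Whether your CDN already does something similar is worth checking in its documentation.

```typescript
// Hypothetical helper: collapse an Accept-Encoding header into one of three buckets.
type EncodingBucket = "br" | "gzip" | "identity";

function normalizeAcceptEncoding(header: string | null): EncodingBucket {
  if (!header) return "identity";
  // Split "gzip, deflate, br;q=0.9" into bare tokens (q-values are dropped, not honored).
  const tokens = header
    .toLowerCase()
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim());
  if (tokens.indexOf("br") !== -1) return "br";
  if (tokens.indexOf("gzip") !== -1) return "gzip";
  return "identity";
}

// Illustrative header values; each distinct raw string is a distinct cache variant.
const observed: string[] = [
  "gzip, deflate, br",
  "gzip, deflate, br, zstd",
  "br, gzip",
  "gzip, deflate",
];

const rawVariants = new Set(observed).size;
const normalizedVariants = new Set(observed.map(normalizeAcceptEncoding)).size;
console.log({ rawVariants, normalizedVariants });
```

With these four sample headers, the raw values form four variants while the normalized buckets form two.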
The Baseline
CDN cache analysis for one day:
| Miss Reason | % of Total Misses |
|---|---|
| First request to edge (cold cache) | 41% |
| Query string variation | 28% |
| Vary header fragmentation | 18% |
| Cache TTL expiration | 13% |
28% of cache misses come from query string variation alone. These are preventable.
The Fix
Configure the CDN to ignore irrelevant query parameters:
# Cloudflare Page Rule or Cache Rule
Cache Key:
  Query String: Ignore All (for static assets)

# Or selectively ignore marketing parameters:
Cache Key:
  Query String: Ignore specified
    - utm_source
    - utm_medium
    - utm_campaign
    - utm_content
    - utm_term
    - fbclid
    - gclid
For CDNs that support cache key customization (Fastly VCL, Cloudflare Workers):
// Cloudflare Worker: normalize the cache key
export default {
  async fetch(request: Request): Promise<Response> {
    // The Cache API only stores GET requests; pass everything else through
    if (request.method !== "GET") {
      return fetch(request);
    }

    const url = new URL(request.url);

    // Strip marketing query parameters from the cache key
    const marketingParams = [
      "utm_source",
      "utm_medium",
      "utm_campaign",
      "utm_content",
      "utm_term",
      "fbclid",
      "gclid",
    ];
    for (const param of marketingParams) {
      url.searchParams.delete(param);
    }

    // Create a cache key from the normalized URL
    const cacheKey = new Request(url.toString(), request);
    const cache = caches.default;

    let response = await cache.match(cacheKey);
    if (!response) {
      // Fetch from the origin using the original, unmodified request
      response = await fetch(request);

      // Only cache full 200 responses (cache.put rejects 206 partial content)
      if (response.status === 200) {
        const headers = new Headers(response.headers);

        // Remove Vary for hashed, immutable assets to prevent fragmentation;
        // the content hash in the filename already identifies the variant
        if (isStaticAsset(url.pathname)) {
          headers.delete("Vary");
        }

        response = new Response(response.body, {
          status: response.status,
          statusText: response.statusText,
          headers,
        });

        // Cache the response with the normalized key
        await cache.put(cacheKey, response.clone());
      }
    }
    return response;
  },
};

// Matches fingerprinted filenames such as app.3f9a1c2b.js (8 hex characters)
function isStaticAsset(pathname: string): boolean {
  return /\.[a-f0-9]{8}\.(js|css|woff2|avif|webp|jpg|png|svg)$/.test(pathname);
}
For Vary header handling, the ideal configuration for hashed assets:
# Static hashed assets: no Vary needed
# The hash in the filename IS the cache key variant
location ~* \.[a-f0-9]{8}\.(js|css|woff2|avif|webp|jpg|png|svg)$ {
    add_header Cache-Control "public, max-age=31536000, immutable";

    # Serve pre-compressed .br / .gz files from disk when the request accepts them.
    # The CDN edge then negotiates the encoding with each client.
    # brotli_static requires the third-party ngx_brotli module.
    brotli_static on;
    gzip_static on;
}
Debugging cache misses using response headers:
// Check cache status headers for a resource
async function checkCacheStatus(url: string): Promise<void> {
  const response = await fetch(url);
  const headers: Record<string, string | null> = {
    "cf-cache-status": response.headers.get("cf-cache-status"),
    "x-cache": response.headers.get("x-cache"),
    age: response.headers.get("age"),
    "cache-control": response.headers.get("cache-control"),
    vary: response.headers.get("vary"),
  };

  console.log(`URL: ${url}`);
  for (const [key, value] of Object.entries(headers)) {
    if (value) {
      console.log(`  ${key}: ${value}`);
    }
  }

  // Interpretation:
  // cf-cache-status: HIT = served from Cloudflare cache
  // cf-cache-status: MISS = fetched from origin
  // cf-cache-status: EXPIRED = cache entry existed but TTL expired
  // cf-cache-status: DYNAMIC = not eligible for caching
  // age: seconds since the response was cached
}
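Checking one URL at a time only goes so far. As a rough sketch, the statuses collected from many responses can be tallied into a hit rate. The names `tallyStatuses` and `cacheableHitRate` are hypothetical, only the four statuses listed above are counted individually, and excluding DYNAMIC and unrecognized values from the denominator is an assumption about what "hit rate" should mean here.

```typescript
// Counts for the four statuses discussed above; anything else lands in "other".
type Tally = { hit: number; miss: number; expired: number; dynamic: number; other: number };

function tallyStatuses(statuses: Array<string | null>): Tally {
  const tally: Tally = { hit: 0, miss: 0, expired: 0, dynamic: 0, other: 0 };
  for (const raw of statuses) {
    switch ((raw ?? "").toUpperCase()) {
      case "HIT":
        tally.hit++;
        break;
      case "MISS":
        tally.miss++;
        break;
      case "EXPIRED":
        tally.expired++;
        break;
      case "DYNAMIC":
        tally.dynamic++;
        break;
      default:
        tally.other++; // missing header or a status not covered here
    }
  }
  return tally;
}

// Hit rate over cacheable responses only: DYNAMIC and "other" are excluded.
function cacheableHitRate(t: Tally): number {
  const cacheable = t.hit + t.miss + t.expired;
  return cacheable === 0 ? 0 : t.hit / cacheable;
}

// Illustrative sample, not real measurements.
const sampleTally = tallyStatuses(["HIT", "hit", "MISS", "EXPIRED", "DYNAMIC", null]);
console.log(sampleTally, cacheableHitRate(sampleTally));
```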
The Proof
After CDN configuration optimization:
| Metric | Before | After | Delta |
|---|---|---|---|
| Overall cache hit rate | 62% | 91% | +29pp |
| JS bundle hit rate | 78% | 97% | +19pp |
| Image hit rate | 58% | 89% | +31pp |
| Origin requests/day | 285,000 | 67,500 | -76% |
| p75 LCP (all users) | 3.2s | 2.6s | -600ms |
The image hit rate improvement (31pp) came primarily from query string normalization. Marketing emails linked to product pages with tracking parameters, and each unique parameter combination was a cache miss for the product images on that page.
The LCP improvement of 600ms comes from higher cache hit rates reducing the average TTFB across all users. When a resource is served from CDN cache, the TTFB is 20-50ms (edge latency). When it misses, the TTFB includes the origin round trip (200-800ms depending on geography).
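The relationship between hit rate, origin load, and TTFB can be sketched with some simple arithmetic. The function names below are hypothetical, and the 35 ms and 500 ms latencies are assumed midpoints of the ranges quoted above, not measurements. The TTFB figure is a mean, so it is not directly comparable to the p75 LCP in the table.

```typescript
// Expected (mean) TTFB for a given hit rate; latencies are illustrative assumptions.
function expectedTtfbMs(hitRate: number, edgeMs: number, originMs: number): number {
  return hitRate * edgeMs + (1 - hitRate) * originMs;
}

// Origin requests implied by a hit rate, assuming every miss reaches the origin.
function originRequests(totalRequests: number, hitRate: number): number {
  return Math.round(totalRequests * (1 - hitRate));
}

// Total daily requests implied by the "before" row: 285,000 misses at a 62% hit rate.
const totalPerDay = Math.round(285000 / (1 - 0.62));

// Origin requests at the "after" hit rate of 91%.
const originAfter = originRequests(totalPerDay, 0.91);

// Assumed midpoints: 35 ms from the edge, 500 ms when the origin is involved.
const ttfbBefore = expectedTtfbMs(0.62, 35, 500);
const ttfbAfter = expectedTtfbMs(0.91, 35, 500);

console.log({ totalPerDay, originAfter, ttfbBefore, ttfbAfter });
```

Under these assumptions the implied total is 750,000 requests per day, and the 91% hit rate yields 67,500 origin requests, which matches the table.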
The Trade-off
Stripping query parameters from the cache key means that legitimate parameters (like pagination, sort order, or filtering) must be handled carefully. If the CDN strips a ?page=2 parameter from the cache key, all pagination requests return the same cached page-1 response.
The rule: strip only known-inert parameters (marketing trackers) and preserve parameters that affect the response content. This requires maintaining a list of parameters to strip, which must be updated when new tracking systems are added.
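A minimal sketch of that rule as a pure function, which makes the strip list easy to unit test before it goes anywhere near a CDN. The name `normalizeForCacheKey`, the `STRIP_PARAMS` list, and the `shop.example` URLs are hypothetical. Sorting the remaining parameters is an extra step not described above; it is only safe if parameter order does not affect the origin's response.

```typescript
// Parameters assumed to be inert trackers; extend as new tracking systems appear.
const STRIP_PARAMS: string[] = [
  "utm_source",
  "utm_medium",
  "utm_campaign",
  "utm_content",
  "utm_term",
  "fbclid",
  "gclid",
];

function normalizeForCacheKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  // Remove only the known-inert parameters; everything else is preserved.
  STRIP_PARAMS.forEach((name) => url.searchParams.delete(name));
  // Sort what remains so ?a=1&b=2 and ?b=2&a=1 map to the same key.
  url.searchParams.sort();
  return url.toString();
}

// Pagination survives, trackers do not.
console.log(
  normalizeForCacheKey("https://shop.example/products?utm_source=email&page=2&utm_campaign=spring"),
);
```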
Removing the Vary: Accept-Encoding header from immutable assets is safe only when the CDN handles content negotiation at the edge. If the origin serves pre-compressed files and the CDN forwards them directly, the CDN must negotiate the encoding with the client. Most modern CDNs do this automatically, but verify with a test: fetch the resource with Accept-Encoding: gzip and Accept-Encoding: br and confirm you get the correct encoding in both cases.
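The verification described above could be scripted roughly as follows. This is a sketch under assumptions: `checkEncoding` and `verifyNegotiation` are hypothetical names, and the script assumes a non-browser runtime whose `fetch` allows setting Accept-Encoding and exposes the Content-Encoding response header (browsers forbid overriding Accept-Encoding). If your runtime behaves differently, the same comparison can be done by hand with any HTTP client.

```typescript
type EncodingCheck = { requested: string; received: string | null; ok: boolean };

// Pure check: does the Content-Encoding that came back match what was requested?
function checkEncoding(requested: string, contentEncoding: string | null): EncodingCheck {
  const received = contentEncoding ? contentEncoding.trim().toLowerCase() : null;
  return { requested, received, ok: received === requested };
}

// Requests the same URL once per encoding and records what the edge returned.
// Assumption: the runtime's fetch honors a caller-supplied Accept-Encoding header.
async function verifyNegotiation(url: string): Promise<EncodingCheck[]> {
  const results: EncodingCheck[] = [];
  for (const encoding of ["gzip", "br"]) {
    const response = await fetch(url, { headers: { "Accept-Encoding": encoding } });
    results.push(checkEncoding(encoding, response.headers.get("content-encoding")));
  }
  return results;
}
```

Any entry with `ok: false` suggests the edge is not negotiating encodings on its own, in which case removing Vary is not safe for that asset.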
Low-traffic edge locations will always have lower cache hit rates. If 2% of traffic comes from a Sydney edge server, the cache at that location may never warm up for less-popular resources. CDN “tiered caching” (where edge servers check a regional cache before going to the origin) mitigates this by reducing origin load, but the first request for a resource at a cold edge is still an edge miss: the edge has to fetch the resource from the regional tier or the origin before it can respond.