Redis Caching Patterns and Cache Invalidation

The Feature

Frequently accessed, rarely changed data is served from Redis instead of PostgreSQL. Market listings, vendor counts, and dashboard statistics load from cache on cache hit (sub-millisecond) and fall back to the database on cache miss (10-50 ms). Cache entries expire automatically and are invalidated explicitly when the underlying data changes.

The Decision

The cache-aside pattern (also called lazy loading) is the simplest of the common caching patterns and the one that fails most gently. The application checks the cache first. On a hit, it returns the cached value directly. On a miss, it queries the database, stores the result in the cache, and returns it. The application writes to the cache in only two situations: after a database read, to populate an entry, and after a database write, to invalidate the affected entries.

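Write-through and write-behind caching are more complex and fail in worse ways. In write-through, every write goes to both the database and the cache, so a failed cache write leaves a stale entry behind. In write-behind, writes land in the cache first and are synced to the database in the background, so a failed sync loses data. Cache-aside avoids both problems: the database is always written first and remains the source of truth, so a failed cache read or write costs only a cache miss that falls back to the database, and a missed invalidation serves stale data only until the TTL expires.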

The Implementation

Cache Key Naming Convention

# Key format: {entity}:{id}:{variant}
# Examples:
# market:550e8400-...:detail      - Full market details with vendors
# market:550e8400-...:summary     - Market summary for listing page
# markets:active:page:1:size:20   - Paginated active market listing
# vendor:660f9500-...:dashboard   - Vendor dashboard data
# stats:market:550e8400-...       - Market statistics (vendor count, etc.)

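A consistent naming convention makes it possible to invalidate all cache entries for a specific entity. When a market is updated, deleting all keys matching market:{id}:* clears every cached representation stored under that prefix. Two caveats apply. Keys that do not start with the prefix, such as stats:market:{id} and the markets:active:* listing pages, are not matched and must be deleted separately. Redis's DEL command accepts only exact key names, so a pattern delete has to find the matching keys first (for example with SCAN) and then delete them; the cache_delete helper used below is assumed to do this when its argument contains a wildcard.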

Caching Market Listings

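# backend/app/routers/markets.py
from fastapi import Depends, Query
from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession

from app.services.cache import cache_get, cache_set, cache_delete

# router, Market, and get_db are assumed to be defined or imported elsewhere in this module.


@router.get("/markets")
async def list_markets(
    # Bounds reject page=0 (which would produce a negative offset) and keep the
    # number of distinct cache keys finite. The upper limit of 100 is an assumed value.
    page: int = Query(1, ge=1),
    page_size: int = Query(20, ge=1, le=100),
    city: str | None = None,
    db: AsyncSession = Depends(get_db),
):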
    # Build cache key from all query parameters
    cache_key = f"markets:active:page:{page}:size:{page_size}"
    if city:
        cache_key += f":city:{city}"

    cached = await cache_get(cache_key)
    if cached is not None:
        return cached

    # Cache miss: query database
    query = (
        select(Market)
        .where(Market.status == "active")
        .order_by(Market.name)
        .offset((page - 1) * page_size)
        .limit(page_size)
    )
    if city:
        query = query.where(Market.city == city)

    result = await db.execute(query)
    markets = [m.to_summary_dict() for m in result.scalars().all()]

    # Count total for pagination
    count_query = select(func.count(Market.id)).where(Market.status == "active")
    if city:
        count_query = count_query.where(Market.city == city)
    total = (await db.execute(count_query)).scalar()

    response = {
        "markets": markets,
        "total": total,
        "page": page,
        "page_size": page_size,
    }

    # Cache for 5 minutes
    await cache_set(cache_key, response, ttl_seconds=300)
    return response

TTL Strategy

Data Type	TTL	Reasoning
Market listings	5 minutes	Markets rarely change; stale data is acceptable
Market detail	5 minutes	Same as listings
Dashboard statistics	2 minutes	Should feel reasonably current
Vendor profile	10 minutes	Rarely changes
Public pages	15 minutes	Content changes are infrequent

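Short TTLs (1-5 minutes) are conservative. They limit the window of stale data while still reducing database load. Longer TTLs provide more cache hits but risk showing outdated information. Because writes also invalidate the affected keys explicitly, the TTL acts as a backstop: it bounds how long a stale entry can survive if an invalidation is missed or fails.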
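
One way to keep these values consistent across routers is to define them once and look them up by data type. The sketch below mirrors the table above; the `CACHE_TTL` name, the dictionary keys, and the `ttl_seconds` helper are hypothetical.

```python
from datetime import timedelta

# TTLs from the table above, defined once so routers do not hard-code numbers.
CACHE_TTL: dict[str, timedelta] = {
    "market_listing": timedelta(minutes=5),
    "market_detail": timedelta(minutes=5),
    "dashboard_stats": timedelta(minutes=2),
    "vendor_profile": timedelta(minutes=10),
    "public_page": timedelta(minutes=15),
}


def ttl_seconds(data_type: str) -> int:
    """Return the TTL for a data type in whole seconds; unknown types raise KeyError."""
    return int(CACHE_TTL[data_type].total_seconds())
```

A router could then call `cache_set(cache_key, response, ttl_seconds=ttl_seconds("market_listing"))` instead of passing a literal 300.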

Cache Invalidation on Writes

# backend/app/routers/markets.py

@router.put("/markets/{market_id}")
async def update_market(
    market_id: str,
    updates: MarketUpdate,
    market: Market = Depends(get_market_for_organizer),
    db: AsyncSession = Depends(get_db),
):
    for key, value in updates.dict(exclude_unset=True).items():
        setattr(market, key, value)
    await db.commit()

    # Invalidate all cached representations of this market
    await cache_delete(f"market:{market_id}:*")

    # Also invalidate market listing pages (since the market data changed)
    await cache_delete("markets:active:*")

    return market.to_dict()


@router.post("/{application_id}/accept")
async def accept_application(
    application_id: str,
    market: Market = Depends(get_market_for_organizer),
    db: AsyncSession = Depends(get_db),
):
    # ... accept logic (assumed to load `application`, which is used below) ...

    # Invalidate market stats (vendor count changed)
    await cache_delete(f"stats:market:{market.id}")
    # Invalidate vendor's dashboard
    await cache_delete(f"vendor:{application.vendor_id}:*")

    return {"status": "accepted"}
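
The calls above pass both exact keys and wildcard patterns to `cache_delete`. Redis's DEL command takes exact key names only, so a pattern-aware helper has to scan for matching keys first. The sketch below shows one possible shape for such a helper. It assumes a client object exposing `scan_iter(match=...)` as an async iterator and `delete(*names)`, as redis-py's asyncio client does; `FakeRedis` is a hypothetical in-memory stand-in so the example runs without a server, and the explicit `client` parameter is an assumption that differs from the single-argument calls above.

```python
import asyncio
from fnmatch import fnmatchcase
from typing import AsyncIterator


class FakeRedis:
    """Hypothetical in-memory stand-in exposing the two client methods used below."""

    def __init__(self, data: dict[str, str]) -> None:
        self.data = dict(data)

    async def scan_iter(self, match: str) -> AsyncIterator[str]:
        # Iterate over a snapshot so deletions during the scan are safe.
        for key in list(self.data):
            if fnmatchcase(key, match):
                yield key

    async def delete(self, *names: str) -> int:
        # Like DEL, report how many of the named keys actually existed.
        return sum(1 for name in names if self.data.pop(name, None) is not None)


async def cache_delete(client, pattern: str, batch_size: int = 500) -> int:
    """Delete one exact key, or every key matching a '*' pattern. Returns the count."""
    if "*" not in pattern:  # this sketch treats only '*' as a wildcard
        return await client.delete(pattern)

    deleted = 0
    batch: list[str] = []
    async for key in client.scan_iter(match=pattern):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += await client.delete(*batch)
            batch.clear()
    if batch:
        deleted += await client.delete(*batch)
    return deleted
```

Scanning in batches avoids the KEYS command, which blocks the server while it walks the whole keyspace.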

Redis Memory Configuration

# docker-compose.prod.yml (Redis service)
services:
  redis:
    image: redis:7-alpine
    command: >
      redis-server
      --maxmemory 128mb
      --maxmemory-policy allkeys-lru
      --save ""
      --appendonly no
    restart: unless-stopped

Configuration explained:

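maxmemory 128mb: Caps the memory Redis uses for stored data at 128 MB; the process itself can use somewhat more because of internal buffers and allocator fragmentation. On a 4 GB VPS running the backend, PostgreSQL, and Redis, 128 MB for cache is a reasonable allocation.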
maxmemory-policy allkeys-lru: When memory is full, Redis evicts the least recently used keys. This is the correct policy for a cache (not noeviction, which would reject new writes).
save "" and appendonly no: Disables persistence. Redis is a cache, not a database. If it restarts, the cache rebuilds from the database automatically through cache misses.

Graceful Degradation

# The cache service already handles Redis failures silently.
# If Redis is down, every request is a "cache miss" and hits the database.
# The application works correctly, just slower.

# Verify this works by stopping Redis:
# docker compose stop redis
# The application should continue functioning with higher database load.

The Trap

# TRAP: Caching database query results that include ORM objects
@router.get("/markets/{market_id}")
async def get_market(market_id: str, db: AsyncSession = Depends(get_db)):
    result = await db.execute(select(Market).where(Market.id == market_id))
    market = result.scalar_one()
    await cache_set(f"market:{market_id}", market)
    # Fails: SQLAlchemy model objects are not JSON serializable
    # Even if serialized, they contain internal state and session references

# SAFE: Cache plain dictionaries
@router.get("/markets/{market_id}")
async def get_market(market_id: str, db: AsyncSession = Depends(get_db)):
    result = await db.execute(select(Market).where(Market.id == market_id))
    market = result.scalar_one()
    data = market.to_dict()  # Convert to plain dict first
    await cache_set(f"market:{market_id}:detail", data, ttl_seconds=300)  # 5 minutes, per the TTL table
    return data

Always convert ORM objects to plain dictionaries or Pydantic models before caching. ORM objects carry session state, lazy-loading proxies, and internal references that break serialization and deserialization.
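
The serialization failure is easy to reproduce without a database. In the sketch below, the `Market` dataclass is a hypothetical stand-in for the ORM model (a real SQLAlchemy model additionally carries session state), and `can_serialize` is an illustrative helper that assumes the cache stores values as JSON.

```python
import json
from dataclasses import dataclass


@dataclass
class Market:
    """Hypothetical stand-in for the ORM model."""

    id: str
    name: str
    city: str

    def to_dict(self) -> dict[str, str]:
        """Plain-dict representation that is safe to cache."""
        return {"id": self.id, "name": self.name, "city": self.city}


def can_serialize(value: object) -> bool:
    """Report whether json.dumps accepts the value."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


market = Market(id="m1", name="Riverside Market", city="Austin")
```

Here `can_serialize(market)` is false because `json.dumps` does not know how to encode an arbitrary object, while `can_serialize(market.to_dict())` is true and the dictionary survives a round trip through `json.dumps` and `json.loads` unchanged.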

The Cost

Component	Resource Usage
Redis container	~50 MB idle, 128 MB max
VPS RAM remaining	~3.8 GB for app + PostgreSQL
Redis data volume	Negligible at Marketflow’s scale

Redis running as a Docker container on the same VPS adds zero infrastructure cost. It borrows 128 MB of the existing 4 GB RAM allocation. The trade-off is less RAM for PostgreSQL’s shared buffers, but at Marketflow’s data volume, PostgreSQL’s working set fits comfortably in the remaining memory.