Redis Caching Patterns and Cache Invalidation
The Feature
Frequently accessed, rarely changed data is served from Redis instead of PostgreSQL. Market listings, vendor counts, and dashboard statistics are returned from the cache on a hit (sub-millisecond) and fall back to the database on a miss (10-50 ms). Cache entries expire automatically and are invalidated explicitly when the underlying data changes.
The Decision
The cache-aside pattern (also called lazy loading) is the simplest and safest caching pattern. The application checks the cache first. On a hit, it returns the cached value directly. On a miss, it queries the database, stores the result in the cache, and returns it. The application touches the cache at only two other points: it populates an entry after a database read, and it deletes entries after a database write so that the next read reloads fresh data.
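The read and invalidate paths described above can be sketched in a few lines. This is a minimal illustration, not the application's cache service: the in-memory `_cache` dict stands in for Redis, and `get_or_load` and `invalidate` are hypothetical names.

```python
import json
from typing import Any, Callable

# Hypothetical in-memory stand-in for Redis, used only for illustration.
_cache: dict = {}


def get_or_load(key: str, load_from_db: Callable[[], Any]) -> Any:
    """Cache-aside read: return the cached value, or load it and cache it."""
    raw = _cache.get(key)
    if raw is not None:
        return json.loads(raw)        # cache hit
    value = load_from_db()            # cache miss: ask the source of truth
    _cache[key] = json.dumps(value)   # populate the cache for later reads
    return value


def invalidate(key: str) -> None:
    """Call after a successful database write so the next read reloads."""
    _cache.pop(key, None)
```

The loader runs only on a miss, so repeated reads of the same key reach the database once until `invalidate` is called or, in a real cache, the TTL expires.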
Write-through and write-behind caching patterns are more complex and more dangerous. If the cache write fails in a write-through pattern, the cache becomes stale. If the background sync fails in a write-behind pattern, data is lost. Cache-aside avoids both problems: the worst case is a cache miss, which falls back to the database.
The Implementation
Cache Key Naming Convention
# Key format: {entity}:{id}:{variant}
# Examples:
# market:550e8400-...:detail - Full market details with vendors
# market:550e8400-...:summary - Market summary for listing page
# markets:active:page:1:size:20 - Paginated active market listing
# vendor:660f9500-...:dashboard - Vendor dashboard data
# stats:market:550e8400-... - Market statistics (vendor count, etc.)
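A consistent naming convention makes it possible to invalidate all cache entries for a specific entity. When a market is updated, deleting all keys matching market:{id}:* clears every cached representation of that market that uses the market: prefix. Two caveats apply. First, the Redis DEL command accepts only exact key names, so the cache helper has to expand a pattern into concrete keys (for example with SCAN) before deleting. Second, keys that do not share the prefix, such as stats:market:{id} and the markets:active:... listing pages, are not matched by market:{id}:* and must be invalidated separately.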
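One way to keep keys consistent is to build them in a single place. The helpers below are a hypothetical sketch (the function names are not from the application); they reproduce the key shapes listed above. `fnmatch` is used in the usage note only to approximate Redis glob matching for the `*` wildcard.

```python
from typing import Optional


def market_key(market_id: str, variant: str) -> str:
    """Key for one cached representation of a market, e.g. 'detail' or 'summary'."""
    return f"market:{market_id}:{variant}"


def market_pattern(market_id: str) -> str:
    """Glob pattern that covers every 'market:{id}:{variant}' key for one market."""
    return f"market:{market_id}:*"


def listing_key(page: int, page_size: int, city: Optional[str] = None) -> str:
    """Key for one page of the active-market listing, optionally filtered by city."""
    key = f"markets:active:page:{page}:size:{page_size}"
    if city:
        key += f":city:{city}"
    return key
```

With these helpers, `fnmatch.fnmatchcase(market_key("abc", "detail"), market_pattern("abc"))` is true, while the same pattern does not match `stats:market:abc`, which is why the statistics key needs its own invalidation call.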
Caching Market Listings
# backend/app/routers/markets.py
from fastapi import Depends
from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession

from app.services.cache import cache_get, cache_set, cache_delete

# router, get_db, and Market are defined elsewhere in the application.


@router.get("/markets")
async def list_markets(
    page: int = 1,
    page_size: int = 20,
    city: str | None = None,
    db: AsyncSession = Depends(get_db),
):
    # Build cache key from all query parameters
    cache_key = f"markets:active:page:{page}:size:{page_size}"
    if city:
        cache_key += f":city:{city}"

    cached = await cache_get(cache_key)
    if cached is not None:
        return cached

    # Cache miss: query database.
    # The same filters are shared by the page query and the count query.
    filters = [Market.status == "active"]
    if city:
        filters.append(Market.city == city)

    query = (
        select(Market)
        .where(*filters)
        .order_by(Market.name)
        .offset((page - 1) * page_size)
        .limit(page_size)
    )
    result = await db.execute(query)
    # Convert ORM objects to plain dicts before caching
    markets = [m.to_summary_dict() for m in result.scalars().all()]

    # Count total for pagination
    count_query = select(func.count(Market.id)).where(*filters)
    total = (await db.execute(count_query)).scalar()

    response = {
        "markets": markets,
        "total": total,
        "page": page,
        "page_size": page_size,
    }

    # Cache for 5 minutes
    await cache_set(cache_key, response, ttl_seconds=300)
    return response
TTL Strategy
| Data Type | TTL | Reasoning |
|---|---|---|
| Market listings | 5 minutes | Markets rarely change; stale data is acceptable |
| Market detail | 5 minutes | Same as listings |
| Dashboard statistics | 2 minutes | Should feel reasonably current |
| Vendor profile | 10 minutes | Rarely changes |
| Public pages | 15 minutes | Content changes are infrequent |
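Short TTLs (1-5 minutes) are conservative. They cap how long stale data can be served while still absorbing most repeated reads. Longer TTLs (10-15 minutes in the table above) produce more cache hits but risk showing outdated information, so they are reserved for data that rarely changes.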
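Keeping the TTLs from the table in one mapping avoids scattering magic numbers across routers. The sketch below is illustrative only: the `TTL_SECONDS` name, the data-type labels, and `ttl_for` are assumptions, not part of the application.

```python
# TTLs from the table above, expressed in seconds.
# The dictionary keys are hypothetical labels chosen for this sketch.
TTL_SECONDS = {
    "market_listing": 5 * 60,
    "market_detail": 5 * 60,
    "dashboard_stats": 2 * 60,
    "vendor_profile": 10 * 60,
    "public_page": 15 * 60,
}


def ttl_for(data_type: str) -> int:
    """Look up the TTL for a data type; unknown types raise KeyError."""
    return TTL_SECONDS[data_type]
```

A call such as `cache_set(cache_key, response, ttl_seconds=ttl_for("market_listing"))` would then replace the literal `300`. Raising on unknown types, rather than falling back to a default, forces each new cached data type to get a deliberately chosen TTL.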
Cache Invalidation on Writes
# backend/app/routers/markets.py
@router.put("/markets/{market_id}")
async def update_market(
    market_id: str,
    updates: MarketUpdate,
    market: Market = Depends(get_market_for_organizer),
    db: AsyncSession = Depends(get_db),
):
    for key, value in updates.dict(exclude_unset=True).items():
        setattr(market, key, value)
    await db.commit()

    # Invalidate only after the commit succeeds.
    # cache_delete is assumed to expand glob patterns itself;
    # the Redis DEL command accepts exact key names only.

    # Invalidate all cached representations of this market
    await cache_delete(f"market:{market_id}:*")
    # Also invalidate market listing pages (since the market data changed)
    await cache_delete("markets:active:*")
    return market.to_dict()


@router.post("/{application_id}/accept")
async def accept_application(
    application_id: str,
    market: Market = Depends(get_market_for_organizer),
    db: AsyncSession = Depends(get_db),
):
    # ... accept logic ...

    # Invalidate market stats (vendor count changed)
    await cache_delete(f"stats:market:{market.id}")
    # Invalidate vendor's dashboard
    await cache_delete(f"vendor:{application.vendor_id}:*")
    return {"status": "accepted"}
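The calls above pass glob patterns such as `market:{market_id}:*` to `cache_delete`, whose implementation is not shown. Since Redis `DEL` takes exact key names, a pattern has to be expanded first. The sketch below shows one way to do that with the `scan_iter` and `delete` methods of a `redis.asyncio` client; `delete_by_pattern` and `batch_size` are hypothetical names, and the real helper may work differently.

```python
from typing import Any


async def delete_by_pattern(client: Any, pattern: str, batch_size: int = 500) -> int:
    """Delete every key matching a glob pattern and return how many were removed.

    `client` is assumed to behave like redis.asyncio.Redis: scan_iter() is an
    async iterator over matching keys and delete() accepts one or more names.
    SCAN walks the keyspace incrementally, unlike KEYS, which blocks the
    server while it scans everything.
    """
    deleted = 0
    batch = []
    async for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += await client.delete(*batch)
            batch.clear()
    if batch:  # flush the final partial batch
        deleted += await client.delete(*batch)
    return deleted
```

Deleting in batches keeps each `DEL` command small. A broad pattern like `markets:active:*` still has to scan the whole keyspace, which is acceptable at a small key count but worth revisiting if the cache grows.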
Redis Memory Configuration
# docker-compose.prod.yml (Redis service)
services:
  redis:
    image: redis:7-alpine
    command: >
      redis-server
      --maxmemory 128mb
      --maxmemory-policy allkeys-lru
      --save ""
      --appendonly no
    restart: unless-stopped
Configuration explained:
maxmemory 128mb: Limits Redis to 128 MB of RAM. On a 4 GB VPS running the backend, PostgreSQL, and Redis, 128 MB for cache is a reasonable allocation.
maxmemory-policy allkeys-lru: When memory is full, Redis evicts the least recently used keys. This is the correct policy for a cache (not noeviction, which would reject new writes).
save "" and appendonly no: Disable persistence. Redis is a cache, not a database. If it restarts, the cache rebuilds from the database automatically through cache misses.
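To confirm that a running instance picked up these flags, the values reported by `CONFIG GET` can be compared against what the compose file requests. The sketch below assumes the live settings have already been collected into a dict of strings (for example from redis-py's `config_get()` on a client created with `decode_responses=True`); `EXPECTED` and `config_mismatches` are hypothetical names.

```python
# Values CONFIG GET is expected to report for the flags in the compose file.
EXPECTED = {
    "maxmemory": str(128 * 1024 * 1024),  # "128mb" is reported in bytes
    "maxmemory-policy": "allkeys-lru",
    "save": "",                           # empty string: RDB snapshots disabled
    "appendonly": "no",                   # AOF persistence disabled
}


def config_mismatches(actual: dict) -> dict:
    """Return {setting: (expected, actual)} for every setting that differs."""
    return {
        name: (expected, actual.get(name))
        for name, expected in EXPECTED.items()
        if actual.get(name) != expected
    }
```

An empty result means the instance matches the intended cache configuration. A non-empty result names each setting that drifted, which is useful as a startup check or in a deployment smoke test.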
Graceful Degradation
# The cache service already handles Redis failures silently.
# If Redis is down, every request is a "cache miss" and hits the database.
# The application works correctly, just slower.
# Verify this works by stopping Redis:
# docker compose stop redis
# The application should continue functioning with higher database load.
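The cache service's implementation is not shown here. As an illustration of the behavior described above, the sketch below wraps reads and writes so that a Redis failure is logged and treated as a miss. The names `safe_cache_get` and `safe_cache_set` are hypothetical, and `client` is assumed to expose async `get` and `set(..., ex=...)` methods like `redis.asyncio.Redis`.

```python
import json
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


async def safe_cache_get(client: Any, key: str) -> Optional[Any]:
    """Return the cached value, or None on a miss or any cache failure."""
    try:
        raw = await client.get(key)
    except Exception:
        # Production code would more likely catch redis.exceptions.RedisError.
        logger.warning("cache read failed for %s; falling back to database", key)
        return None
    return None if raw is None else json.loads(raw)


async def safe_cache_set(client: Any, key: str, value: Any, ttl_seconds: int) -> None:
    """Store a JSON-serializable value; a cache failure never fails the request."""
    payload = json.dumps(value)  # outside the try: serialization bugs should surface
    try:
        await client.set(key, payload, ex=ttl_seconds)
    except Exception:
        logger.warning("cache write failed for %s; continuing without cache", key)
```

Because a failed read returns `None`, callers follow the ordinary cache-miss path and query the database. Serialization happens outside the `try` block on purpose, so that attempting to cache a non-serializable object is reported as a bug rather than silently swallowed.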
The Trap
# TRAP: Caching database query results that include ORM objects
@router.get("/markets/{market_id}")
async def get_market(market_id: str, db: AsyncSession = Depends(get_db)):
    result = await db.execute(select(Market).where(Market.id == market_id))
    market = result.scalar_one()
    await cache_set(f"market:{market_id}", market)
    # Fails: SQLAlchemy model objects are not JSON serializable
    # Even if serialized, they contain internal state and session references


# SAFE: Cache plain dictionaries
@router.get("/markets/{market_id}")
async def get_market(market_id: str, db: AsyncSession = Depends(get_db)):
    result = await db.execute(select(Market).where(Market.id == market_id))
    market = result.scalar_one()
    data = market.to_dict()  # Convert to plain dict first
    await cache_set(f"market:{market_id}:detail", data)
    return data
Always convert ORM objects to plain dictionaries or Pydantic models before caching. ORM objects carry session state, lazy-loading proxies, and internal references that break serialization and deserialization.
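The failure is easy to reproduce without a database. In the sketch below, `FakeOrmMarket` is a hypothetical stand-in for a SQLAlchemy model instance; it is not the application's `Market` class, and its `_session` attribute only mimics the internal state a real ORM object carries.

```python
import json


class FakeOrmMarket:
    """Hypothetical stand-in for an ORM model instance."""

    def __init__(self, market_id: str, name: str):
        self.id = market_id
        self.name = name
        self._session = object()  # mimics internal, non-serializable state

    def to_dict(self) -> dict:
        """Expose only plain, JSON-friendly fields."""
        return {"id": self.id, "name": self.name}


market = FakeOrmMarket("550e8400", "Riverside Market")

# Serializing the object itself raises TypeError.
try:
    json.dumps(market)
    object_is_serializable = True
except TypeError:
    object_is_serializable = False

# Serializing the plain dict round-trips cleanly.
payload = json.dumps(market.to_dict())
restored = json.loads(payload)
```

After this runs, `object_is_serializable` is `False` and `restored` equals the dictionary returned by `to_dict()`, which is the shape that should be written to the cache.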
The Cost
| Component | Resource Usage |
|---|---|
| Redis container | ~50 MB idle, 128 MB max |
| VPS RAM remaining | ~3.8 GB for app + PostgreSQL |
| Redis data volume | Negligible at Marketflow’s scale |
Redis running as a Docker container on the same VPS adds zero infrastructure cost. It borrows 128 MB of the existing 4 GB RAM allocation. The trade-off is less RAM for PostgreSQL’s shared buffers, but at Marketflow’s data volume, PostgreSQL’s working set fits comfortably in the remaining memory.