Valkey Complete Getting Started Guide: Production-Ready in 30 Minutes
TL;DR
Valkey is the open-source fork of Redis that’s actually faster and uses less memory. This guide takes you from zero to production: Docker and native installation, Python client setup with real code examples, clustering for high availability, persistence configuration, monitoring dashboards, and the operational mistakes that will ruin your weekend. If you’ve read our Redis to Valkey migration analysis, you know why Valkey exists. This guide shows you how to use it.
Before You Start: Why Valkey?
I’ll keep this short because we covered the politics in the migration article. The facts that matter for this tutorial:
- License: BSD 3-Clause, actually open source, no vendor lock-in
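- Performance: Up to 2-3x throughput vs Redis 7.2 on multi-core hosts, thanks to multi-threaded I/O
- Memory: Roughly 20-30% less RAM usage on typical key-value workloads, from embedded key optimization
- Compatibility: Drop-in compatible with Redis OSS 7.2 at the protocol and command level, so existing apps and clients work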
- Stability: Atomic cluster migrations, no more broken resharding
If you’re starting fresh, use Valkey. If you’re migrating from Redis, read this first.
Installation: Pick Your Poison
Docker: The Fast Path
Docker Compose is the cleanest way to run Valkey with persistence. Create a docker-compose.yml:
version: '3.8'
services:
  valkey:
    image: valkey/valkey:9.0
    container_name: valkey
    restart: unless-stopped
    ports:
      - "6379:6379"
    volumes:
      - valkey-data:/data
      - ./valkey.conf:/etc/valkey/valkey.conf
    command: valkey-server /etc/valkey/valkey.conf
    mem_limit: 2.5g
    healthcheck:
      test: ["CMD", "valkey-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3
volumes:
  valkey-data:
    driver: local
Create the config file valkey.conf:
# Network
bind 0.0.0.0
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
# Memory
maxmemory 2gb
maxmemory-policy allkeys-lru
maxmemory-samples 5
# Persistence
save 900 1
save 300 10
save 60 10000
appendonly yes
appendfsync everysec
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# Performance (multi-threaded I/O)
io-threads 4
# io-threads-do-reads is deprecated and ignored since Valkey 8.0; I/O threads always handle reads
# Logging
loglevel notice
logfile ""
# Slow log
slowlog-log-slower-than 10000
slowlog-max-len 128
Start it:
# Start in background
docker-compose up -d
# Verify it's running
docker-compose exec valkey valkey-cli ping
# Should return: PONG
# View logs
docker-compose logs -f valkey
# Stop
docker-compose down
Key config choices explained:
- io-threads 4: Enables multi-threaded I/O, the headline performance feature of recent Valkey releases. Set this to the number of CPU cores you want to dedicate (typically 2-4).
- maxmemory 2gb: Hard limit. Valkey will evict keys or reject writes when this is hit. Always set this.
- maxmemory-policy allkeys-lru: When memory is full, evict the least recently used keys. Use volatile-lru if you only want to evict keys with a TTL, or noeviction if you want writes to fail instead.
- appendfsync everysec: Fsync the append-only file once per second. Balances durability and performance. Use always for maximum safety (slow) or no for maximum speed (data loss risk).
- Named volume valkey-data: Your data survives container restarts and rebuilds
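One thing worth checking in the files above is that the container limit (mem_limit) leaves room above maxmemory for AOF rewrite buffers and fragmentation. Here is a minimal sketch of that check. parse_size and check_headroom are hypothetical helpers, not part of Valkey or any client, and the 20% headroom figure is my assumption rather than an official recommendation. The parser treats every suffix as binary, which matches the "gb"/"mb" forms used in this guide and Docker's "g".

```python
# Hypothetical helpers: compare maxmemory (valkey.conf) with mem_limit (compose).
# Simplification: all suffixes are treated as binary multiples (1g == 1gb == 1024**3).
UNITS = {'k': 1024, 'm': 1024 ** 2, 'g': 1024 ** 3}

def parse_size(text):
    """Convert strings like '2gb', '2.5g' or '64mb' to a byte count."""
    s = text.strip().lower()
    if s.endswith('b'):
        s = s[:-1]                      # '2gb' -> '2g'
    if s and s[-1] in UNITS:
        return int(float(s[:-1]) * UNITS[s[-1]])
    return int(s)                       # plain byte count

def check_headroom(maxmemory, mem_limit, min_ratio=1.2):
    """True if the container limit is at least min_ratio times maxmemory."""
    return parse_size(mem_limit) >= parse_size(maxmemory) * min_ratio

print(check_headroom('2gb', '2.5g'))    # the values used in this guide
```

With the values from this guide the limit is 1.25x maxmemory, so the check passes; setting mem_limit equal to maxmemory would fail it.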
Cloud Managed Services: The Easy Button
If you’re on AWS or Google Cloud, let them handle the operational toil.
AWS ElastiCache for Valkey:
# Create a Valkey cluster (replace subnet group with yours)
aws elasticache create-cache-cluster \
--cache-cluster-id valkey-prod \
--engine valkey \
--engine-version 8.0 \
--cache-node-type cache.r7g.large \
--num-cache-nodes 1 \
--cache-subnet-group-name your-subnet-group \
--preferred-availability-zone us-east-1a
# For production with Multi-AZ replication
aws elasticache create-replication-group \
--replication-group-id valkey-prod-cluster \
--replication-group-description "Production Valkey cluster" \
--engine valkey \
--cache-node-type cache.r7g.large \
--num-cache-clusters 2 \
--automatic-failover-enabled \
--multi-az-enabled \
--cache-subnet-group-name your-subnet-group
# Get connection endpoint
aws elasticache describe-cache-clusters \
--cache-cluster-id valkey-prod \
--show-cache-node-info
You get automatic failover, patching, and monitoring, and AWS prices Valkey nodes about 20% below the equivalent Redis OSS nodes.
Google Cloud Memorystore for Valkey:
gcloud redis instances create valkey-prod \
--size=5 \
--region=us-central1 \
--redis-version=valkey_9_0 \
--tier=standard_ha
# Get connection info
gcloud redis instances describe valkey-prod \
--region=us-central1
Both services give you a connection string. Point your app at it and you’re done.
Python Client Setup: The Right Way
The official Python client is valkey-py. Do not use redis-py for new projects (it’s controlled by Redis Ltd. and will nag you with warnings).
Installation
pip install valkey
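Async support (if you’re using FastAPI, aiohttp, etc.) ships in the same package as valkey.asyncio, so there is no separate extra to install. For faster response parsing, add the optional libvalkey extra:
pip install 'valkey[libvalkey]'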
Basic Connection
from valkey import Valkey
# Simple connection
client = Valkey(
    host='localhost',
    port=6379,
    decode_responses=True  # Return strings instead of bytes
)
# Test it
client.set('hello', 'world')
print(client.get('hello')) # "world"
Production Connection Pool
Never create a new connection per request. Use connection pooling:
from valkey import ConnectionPool, Valkey
# Create pool once at app startup
pool = ConnectionPool(
    host='localhost',
    port=6379,
    max_connections=50,  # Tune based on your concurrency
    socket_keepalive=True,
    socket_connect_timeout=5,
    socket_timeout=5,
    retry_on_timeout=True,
    health_check_interval=30,
    decode_responses=True
)
# Reuse the pool
client = Valkey(connection_pool=pool)
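Why pooling matters: Establishing a connection is expensive (DNS lookup, TCP handshake, TLS negotiation if enabled). A pool reuses open connections, so each command pays only the network round trip instead of the setup cost, which often means going from several milliseconds per request to well under one.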
Authentication and Security
If you set requirepass in valkey.conf:
client = Valkey(
    host='localhost',
    port=6379,
    password='YourSecretPassword',
    decode_responses=True
)
For TLS (production over the internet):
client = Valkey(
    host='valkey.example.com',
    port=6380,
    password='YourSecretPassword',
    ssl=True,
    ssl_cert_reqs='required',
    ssl_ca_certs='/path/to/ca.crt',
    decode_responses=True
)
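Hardcoding the password in source is fine for a tutorial and a bad idea everywhere else. A minimal sketch of pulling the same settings from the environment and assembling a connection URL follows. The VALKEY_* variable names and build_valkey_url are hypothetical, and I am assuming valkey-py's Valkey.from_url accepts the valkey:// and valkeys:// (TLS) schemes the same way redis-py accepts redis:// and rediss://.

```python
import os
from urllib.parse import quote

def build_valkey_url(env=os.environ):
    """Assemble a connection URL from (hypothetical) VALKEY_* variables."""
    host = env.get('VALKEY_HOST', 'localhost')
    port = int(env.get('VALKEY_PORT', '6379'))
    password = env.get('VALKEY_PASSWORD')
    use_tls = env.get('VALKEY_TLS', '0') == '1'
    scheme = 'valkeys' if use_tls else 'valkey'
    # Percent-encode the password so characters like '@' or '/' survive URL parsing
    auth = f':{quote(password, safe="")}@' if password else ''
    return f'{scheme}://{auth}{host}:{port}/0'

print(build_valkey_url({'VALKEY_HOST': 'valkey.example.com', 'VALKEY_TLS': '1'}))
```

The resulting string would then be handed to the client at startup, for example Valkey.from_url(build_valkey_url(), decode_responses=True).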
Core Operations: The Five Data Structures You Actually Use
Valkey has more data types than these (streams, HyperLogLog, bitmaps, geospatial indexes), but these five cover most workloads. Here’s how to use them in Python.
1. Strings (Key-Value)
Simplest data type. Good for caching, feature flags, counters.
# Set a value
client.set('user:1000:name', 'Alice')
# Get a value
name = client.get('user:1000:name') # "Alice"
# Set with expiration (TTL in seconds)
client.setex('session:abc123', 3600, 'user_data_here')
# Atomic increment (counters, rate limiting)
page_views = client.incr('page:home:views')
print(f"Page views: {page_views}")
# Increment by amount
client.incrby('downloads:total', 5)
# Check if key exists
if client.exists('config:maintenance_mode'):
    print("Site is in maintenance mode")
# Delete a key
client.delete('temp:processing:job123')
# Get multiple keys at once (batching)
values = client.mget(['user:1:name', 'user:2:name', 'user:3:name'])
Cache-aside pattern
import json
def get_user(user_id):
    cache_key = f'user:{user_id}'
    # Try cache first
    cached = client.get(cache_key)
    if cached is not None:
        return json.loads(cached)
    # Cache miss, fetch from database
    # (database.query is a placeholder for your own data-access layer)
    user = database.query('SELECT * FROM users WHERE id = ?', user_id)
    # Store in cache (1 hour TTL)
    client.setex(cache_key, 3600, json.dumps(user))
    return user
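The read path is only half of cache-aside: when the underlying row changes, the cached copy has to go. Here is a minimal sketch of the write side, assuming you delete the key after writing to the database. DictCache is a hypothetical in-memory stand-in with the same get/setex/delete calls, included only so the example runs without a server, and the db dict stands in for your real database.

```python
import json

class DictCache:
    """In-memory stand-in for the client calls used here (TTLs recorded, not enforced)."""
    def __init__(self):
        self.data, self.ttls = {}, {}
    def get(self, key):
        return self.data.get(key)
    def setex(self, key, ttl, value):
        self.data[key], self.ttls[key] = value, ttl
    def delete(self, key):
        self.ttls.pop(key, None)
        return 1 if self.data.pop(key, None) is not None else 0

def get_cached(client, key, loader, ttl=3600):
    """Cache-aside read: return the cached value, or load, store and return it."""
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)
    value = loader()
    client.setex(key, ttl, json.dumps(value))
    return value

def update_user(client, db, user_id, fields):
    """Write to the source of truth first, then drop the stale cache entry."""
    db[user_id] = {**db.get(user_id, {}), **fields}
    client.delete(f'user:{user_id}')

cache, db = DictCache(), {1000: {'name': 'Alice'}}
first = get_cached(cache, 'user:1000', lambda: db[1000])
update_user(cache, db, 1000, {'name': 'Alicia'})
second = get_cached(cache, 'user:1000', lambda: db[1000])
print(first, second)
```

Deleting rather than overwriting the key keeps the write path simple: the next reader repopulates the cache from the database.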
2. Hashes (Objects)
Store objects with multiple fields. More memory-efficient than separate keys.
# INEFFICIENT: Separate keys (don't do this)
# Each key has overhead: 8-byte pointer + malloc header (~16-32 bytes)
# client.set('user:1000:name', 'Alice') # ~50 bytes overhead
# client.set('user:1000:email', 'alice@example.com') # ~50 bytes overhead
# client.set('user:1000:age', 30) # ~50 bytes overhead
# Total overhead: ~150 bytes just for metadata
# EFFICIENT: Hash (do this instead)
# Single hash with embedded fields saves ~40% memory
# All fields stored together with minimal overhead
client.hset('user:1000', mapping={
    'name': 'Alice',
    'email': 'alice@example.com',
    'age': 30,
    'verified': 'true'
})
# Get entire object
user = client.hgetall('user:1000')
# {'name': 'Alice', 'email': 'alice@example.com', 'age': '30', 'verified': 'true'}
# Get a single field
email = client.hget('user:1000', 'email')
# Get multiple fields
name, age = client.hmget('user:1000', ['name', 'age'])
# Increment a numeric field
client.hincrby('user:1000', 'login_count', 1)
# Check if field exists
if client.hexists('user:1000', 'premium'):
    print("User is premium")
# Delete a field
client.hdel('user:1000', 'temp_token')
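Notice that age went in as 30 and came back as '30': hash fields are stored as strings, so types have to be restored on the way out. One way to keep that conversion in a single place is a small pair of helpers. This is a sketch, and User, user_to_hash and user_from_hash are hypothetical names, not part of valkey-py.

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str
    age: int
    verified: bool

def user_to_hash(user):
    """Flatten a User into the string fields a hash stores."""
    return {
        'name': user.name,
        'email': user.email,
        'age': str(user.age),
        'verified': 'true' if user.verified else 'false',
    }

def user_from_hash(fields):
    """Rebuild a User from the dict returned by hgetall (all values are str)."""
    return User(
        name=fields['name'],
        email=fields['email'],
        age=int(fields['age']),
        verified=fields['verified'] == 'true',
    )

alice = User('Alice', 'alice@example.com', 30, True)
print(user_from_hash(user_to_hash(alice)) == alice)
```

In use, this would look like client.hset('user:1000', mapping=user_to_hash(alice)) on write and user_from_hash(client.hgetall('user:1000')) on read, assuming decode_responses=True.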
3. Lists (Queues, Stacks)
Ordered collections. Perfect for job queues, activity feeds, recent items.
# Add to the right (tail)
client.rpush('tasks', 'send_email', 'process_upload', 'resize_image')
# Add to the left (head)
client.lpush('notifications', 'new_message')
# Pop from the left (FIFO queue)
task = client.lpop('tasks') # "send_email"
# Pop from the right (LIFO stack)
task = client.rpop('tasks') # "resize_image"
# Blocking pop (wait for items, perfect for worker queues)
task = client.blpop('tasks', timeout=5) # Blocks up to 5 seconds
if task:
    queue_name, task_data = task
    process_task(task_data)
# Get a range (pagination)
recent_posts = client.lrange('user:1000:feed', 0, 9) # First 10 items
# Get list length
count = client.llen('tasks')
# Trim to keep only recent items (cap at 100)
client.ltrim('user:1000:activity', 0, 99)
Job queue
import json
import time
# Producer
def enqueue_job(job_type, job_data):
    job = json.dumps({'type': job_type, 'data': job_data, 'ts': time.time()})
    client.rpush('jobs', job)
# Consumer (worker process)
def process_jobs():
    while True:
        # blpop returns None on timeout, so the loop simply polls again
        job = client.blpop('jobs', timeout=5)
        if job:
            _, job_json = job
            job_data = json.loads(job_json)
            handle_job(job_data)  # handle_job is your own handler
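The consumer above loses a job if the handler raises, because the job has already been popped. A minimal sketch of one way to handle that is to requeue failed jobs with an attempt counter and park them on a dead-letter list after a few tries. The names drain_jobs and jobs:dead, and the limit of three attempts, are my assumptions. ListStub is an in-memory stand-in so the example runs offline, and the loop uses lpop instead of blpop so it terminates once the queue is empty.

```python
import json
from collections import defaultdict, deque

class ListStub:
    """In-memory stand-in for the rpush/lpop list calls."""
    def __init__(self):
        self.lists = defaultdict(deque)
    def rpush(self, key, *values):
        self.lists[key].extend(values)
        return len(self.lists[key])
    def lpop(self, key):
        return self.lists[key].popleft() if self.lists[key] else None

def drain_jobs(client, handler, queue='jobs', dead='jobs:dead', max_attempts=3):
    """Process queued jobs; requeue failures, then park them on a dead-letter list."""
    while (raw := client.lpop(queue)) is not None:
        job = json.loads(raw)
        try:
            handler(job)
        except Exception as exc:
            job['attempts'] = job.get('attempts', 0) + 1
            job['error'] = str(exc)
            target = dead if job['attempts'] >= max_attempts else queue
            client.rpush(target, json.dumps(job))

def handler(job):
    if job['type'] == 'bad':
        raise ValueError('cannot process')

stub = ListStub()
stub.rpush('jobs', json.dumps({'type': 'ok'}), json.dumps({'type': 'bad'}))
drain_jobs(stub, handler)
print(len(stub.lists['jobs']), len(stub.lists['jobs:dead']))
```

A job that crashes the whole worker process is still lost with this approach; it only covers handler exceptions.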
4. Sets (Unique Collections)
Unordered collections of unique items. Good for tags, permissions, tracking unique visitors.
# Add members
client.sadd('tags:post:42', 'python', 'databases', 'tutorial')
# Check membership
if client.sismember('tags:post:42', 'python'):
    print("Post is tagged with Python")
# Get all members
tags = client.smembers('tags:post:42')
# {'python', 'databases', 'tutorial'}
# Remove a member
client.srem('tags:post:42', 'tutorial')
# Count members
count = client.scard('tags:post:42')
# Set operations
client.sadd('users:online:server1', 'user1', 'user2', 'user3')
client.sadd('users:online:server2', 'user2', 'user3', 'user4')
# Union (all unique users)
all_users = client.sunion('users:online:server1', 'users:online:server2')
# {'user1', 'user2', 'user3', 'user4'}
# Intersection (users on both servers)
both = client.sinter('users:online:server1', 'users:online:server2')
# {'user2', 'user3'}
# Difference (only on server1)
only_server1 = client.sdiff('users:online:server1', 'users:online:server2')
# {'user1'}
Unique visitor tracking
from datetime import datetime
# Track daily unique visitors
def record_visitor(user_id):
    date_key = f"visitors:{datetime.now().strftime('%Y-%m-%d')}"
    client.sadd(date_key, user_id)
    client.expire(date_key, 86400 * 7)  # Keep for 7 days
# Get count
def get_daily_visitors():
    date_key = f"visitors:{datetime.now().strftime('%Y-%m-%d')}"
    return client.scard(date_key)
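Because each day gets its own set, a weekly unique count is just a union across the last seven keys. A short sketch under that key scheme follows; visitor_keys and weekly_unique_visitors are hypothetical helper names, and client is passed in explicitly so the key-building part can be run on its own.

```python
from datetime import date, timedelta

def visitor_keys(days, today=None):
    """Return the daily visitor keys for the last `days` days, newest first."""
    today = today or date.today()
    return [f"visitors:{(today - timedelta(days=n)).isoformat()}" for n in range(days)]

def weekly_unique_visitors(client, today=None):
    # SUNION merges the daily sets, so a user seen on several days counts once
    return len(client.sunion(visitor_keys(7, today)))

print(visitor_keys(3, date(2025, 1, 2)))
```

For very large audiences the union is computed server-side over full sets, so it gets more expensive as daily sets grow.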
5. Sorted Sets (Leaderboards, Rankings)
Sets where each member has a score. Members are sorted by score. Perfect for leaderboards, priority queues, time-series data.
# Add members with scores
client.zadd('leaderboard', {
    'player1': 1500,
    'player2': 2000,
    'player3': 1750
})
# Get rank (0-based, ascending order)
rank = client.zrank('leaderboard', 'player2') # 2 (highest)
# Get reverse rank (descending)
rank = client.zrevrank('leaderboard', 'player2') # 0 (1st place)
# Get top 10 (with scores)
top_players = client.zrevrange('leaderboard', 0, 9, withscores=True)
# [('player2', 2000.0), ('player3', 1750.0), ('player1', 1500.0)]
# Increment score
client.zincrby('leaderboard', 50, 'player1') # Add 50 points
# Get score
score = client.zscore('leaderboard', 'player1')
# Get count
total_players = client.zcard('leaderboard')
# Get by score range
mid_tier = client.zrangebyscore('leaderboard', 1500, 1800, withscores=True)
# Remove low scorers
client.zremrangebyscore('leaderboard', 0, 1000)
Time-series events
import json
import time
# Store events with timestamps as scores
def log_event(user_id, event_type):
    key = f'events:{user_id}'
    timestamp = time.time()
    event_data = json.dumps({'type': event_type, 'ts': timestamp})
    client.zadd(key, {event_data: timestamp})
    # Keep only last 1000 events
    client.zremrangebyrank(key, 0, -1001)
# Get recent events
def get_recent_events(user_id, limit=10):
    key = f'events:{user_id}'
    events = client.zrevrange(key, 0, limit - 1)
    return [json.loads(e) for e in events]
Advanced Patterns
Rate Limiting (Sliding Window)
Better than the simple counter approach:
import time
def is_rate_limited(user_id, max_requests=100, window_seconds=60):
    key = f'rate:{user_id}'
    now = time.time()
    # Remove requests older than the window
    client.zremrangebyscore(key, 0, now - window_seconds)
    # Count requests in current window
    count = client.zcard(key)
    if count >= max_requests:
        return True
    # Add current request
    # (count-then-add is not atomic, so concurrent callers can briefly overshoot)
    client.zadd(key, {str(now): now})
    client.expire(key, window_seconds)
    return False
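If the windowing logic is hard to picture, here is the same algorithm modeled in plain Python, with a deque standing in for the sorted set. This is only an illustration for reasoning and unit tests (SlidingWindow is a hypothetical class, and it limits nothing across processes), but each step maps onto one of the commands in the limiter.

```python
from collections import deque

class SlidingWindow:
    """Plain-Python model of the sorted-set rate limiter."""
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.stamps = deque()  # plays the role of the sorted set

    def is_rate_limited(self, now):
        # Like ZREMRANGEBYSCORE key 0 (now - window): drop expired entries
        while self.stamps and self.stamps[0] <= now - self.window_seconds:
            self.stamps.popleft()
        # Like ZCARD: count what is left in the window
        if len(self.stamps) >= self.max_requests:
            return True
        # Like ZADD: record this request
        self.stamps.append(now)
        return False

limiter = SlidingWindow(max_requests=3, window_seconds=60)
results = [limiter.is_rate_limited(t) for t in (0, 10, 20, 30, 61)]
print(results)  # [False, False, False, True, False]
```

The fourth request is rejected because three are already inside the window; by t=61 the request from t=0 has aged out, so the fifth goes through.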
Distributed Locks
For coordinating work across multiple servers:
import uuid
from valkey.exceptions import WatchError
def acquire_lock(lock_name, timeout=10):
    identifier = str(uuid.uuid4())
    lock_key = f'lock:{lock_name}'
    # Try to set the lock with NX (only if not exists); EX makes it expire
    # after `timeout` seconds so a crashed holder cannot block others forever
    acquired = client.set(lock_key, identifier, nx=True, ex=timeout)
    if acquired:
        return identifier
    return None
def release_lock(lock_name, identifier):
    lock_key = f'lock:{lock_name}'
    # Only delete if we own the lock (prevent releasing someone else's lock)
    with client.pipeline(True) as pipe:
        while True:
            try:
                pipe.watch(lock_key)
                if pipe.get(lock_key) == identifier:
                    pipe.multi()
                    pipe.delete(lock_key)
                    pipe.execute()
                    return True
                pipe.unwatch()
                return False
            except WatchError:
                # The key changed between WATCH and EXEC; retry the check
                continue
# Usage
lock_id = acquire_lock('process_payments')
if lock_id:
    try:
        process_payments()
    finally:
        release_lock('process_payments', lock_id)
Pipelining (Batching Commands)
Reduce network round trips by batching commands:
# Without pipelining (4 round trips)
client.set('key1', 'value1')
client.set('key2', 'value2')
client.set('key3', 'value3')
client.set('key4', 'value4')
# With pipelining (1 round trip)
pipe = client.pipeline()
pipe.set('key1', 'value1')
pipe.set('key2', 'value2')
pipe.set('key3', 'value3')
pipe.set('key4', 'value4')
results = pipe.execute()
# Batch cache warming
def warm_cache(user_ids):
    pipe = client.pipeline()
    for user_id in user_ids:
        pipe.get(f'user:{user_id}')
    results = pipe.execute()
    return results
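One caveat with big batches: a pipeline buffers every command and every reply in memory, so pushing a million keys through one pipeline is its own problem. A minimal sketch of bounding the batch size is below. chunked and fetch_users are hypothetical helpers, client is any object with a valkey-py style pipeline() method, and the default of 500 per batch is an assumption to tune.

```python
from itertools import islice

def chunked(items, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch

def fetch_users(client, user_ids, batch_size=500):
    """Read many keys through pipelines of bounded size."""
    results = {}
    for batch in chunked(user_ids, batch_size):
        pipe = client.pipeline(transaction=False)  # plain batching, no MULTI/EXEC
        for user_id in batch:
            pipe.get(f'user:{user_id}')
        # Replies come back in the same order the commands were queued
        results.update(zip(batch, pipe.execute()))
    return results

print(list(chunked(range(5), 2)))  # [[0, 1], [2, 3], [4]]
```

Passing transaction=False skips the MULTI/EXEC wrapper, which is usually what you want for read batching where atomicity does not matter.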
Monitoring: Know Before You’re Paged
Key Metrics to Watch
import time
def get_metrics():
    info = client.info()
    metrics = {
        'memory_used_mb': info['used_memory'] / 1024 / 1024,
        'memory_peak_mb': info['used_memory_peak'] / 1024 / 1024,
        'connected_clients': info['connected_clients'],
        'ops_per_sec': info['instantaneous_ops_per_sec'],
        'keyspace_hits': info['keyspace_hits'],
        'keyspace_misses': info['keyspace_misses'],
        'evicted_keys': info['evicted_keys'],
        'expired_keys': info['expired_keys'],
    }
    # Calculate hit rate
    total = metrics['keyspace_hits'] + metrics['keyspace_misses']
    if total > 0:
        metrics['hit_rate'] = metrics['keyspace_hits'] / total * 100
    else:
        metrics['hit_rate'] = 0
    return metrics
# Check slow queries
def get_slow_queries():
    slow_log = client.slowlog_get(10)
    for entry in slow_log:
        # The client already joins the command arguments into a single value
        print(f"Duration: {entry['duration']}μs, Command: {entry['command']}")
Alerts to Set Up
- Memory > 80%: You’re about to start evicting keys
- Hit rate < 80%: Cache isn’t effective, investigate
- Connected clients spiking: Possible connection leak
- Evicted keys > 0: Memory pressure is pushing data out (with noeviction you’ll see rejected writes instead)
- Slow queries > 100ms: Something’s wrong
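These thresholds are easy to turn into a check you can run from a cron job or a health endpoint. Here is a minimal sketch that takes a dict shaped like the output of get_metrics(). The function name evaluate_alerts, the max_clients default, and the alert strings are assumptions, and evicted_keys is a counter since server start, so a real alert would look at its rate of change.

```python
def evaluate_alerts(metrics, maxmemory_mb, max_clients=1000):
    """Return alert messages for a metrics dict shaped like get_metrics()."""
    alerts = []
    if maxmemory_mb and metrics['memory_used_mb'] / maxmemory_mb > 0.80:
        alerts.append('memory above 80% of maxmemory')
    lookups = metrics['keyspace_hits'] + metrics['keyspace_misses']
    # Skip the hit-rate check until there is traffic to measure
    if lookups > 0 and metrics['hit_rate'] < 80:
        alerts.append('hit rate below 80%')
    if metrics['connected_clients'] > max_clients:
        alerts.append('connected clients above threshold')
    if metrics['evicted_keys'] > 0:
        alerts.append('keys are being evicted')
    return alerts

sample = {
    'memory_used_mb': 1800, 'keyspace_hits': 600, 'keyspace_misses': 400,
    'hit_rate': 60.0, 'connected_clients': 12, 'evicted_keys': 0,
}
print(evaluate_alerts(sample, maxmemory_mb=2048))
```

With the sample values, memory is at about 88% of the 2 GB limit and the hit rate is 60%, so both of those alerts fire.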
Common Mistakes That Will Ruin Your Day
1. Not Setting maxmemory
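The problem: With no limit, Valkey keeps growing until the host runs out of RAM, at which point the kernel’s OOM killer takes the process down (or the machine starts swapping).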
The fix:
maxmemory 2gb
maxmemory-policy allkeys-lru
2. Using Valkey as a Primary Database
Don’t do this. Valkey is in-memory. Even with persistence, you can lose data. Use a durable database for storage, Valkey for caching.
3. Storing Large Objects
The problem: Storing 100MB blobs kills performance. Valkey is optimized for < 1MB values.
The fix: Store large objects in S3/GCS, store the reference in Valkey:
# Instead of
client.set('video:123', large_video_bytes)
# Use
s3_url = upload_to_s3(large_video_bytes)
client.set('video:123:url', s3_url)
4. Not Using Connection Pooling
The problem: Creating connections is expensive. You’ll hit latency spikes.
The fix: Use ConnectionPool (shown earlier).
5. Ignoring Persistence Trade-offs
- No persistence: Fastest, but lose all data on crash
- RDB only: Snapshots, can lose minutes of data
- AOF everysec: Lose max 1 second, good balance
- AOF always: Safest, but slow
Choose based on your tolerance for data loss.
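To make that choice concrete, here is a sketch that maps a data-loss tolerance to the matching valkey.conf lines. The function name and the cut-off values are my assumptions; the directives themselves are the ones used in the config earlier in this guide, plus save "" to switch snapshots off.

```python
def persistence_directives(max_loss_seconds):
    """Map a data-loss tolerance to valkey.conf lines (thresholds are assumptions)."""
    if max_loss_seconds is None:      # pure cache: losing everything is acceptable
        return ['save ""', 'appendonly no']
    if max_loss_seconds < 1:          # no acknowledged write may be lost
        return ['appendonly yes', 'appendfsync always']
    if max_loss_seconds < 60:         # about one second of writes at risk
        return ['appendonly yes', 'appendfsync everysec']
    # Minutes of loss are tolerable: snapshots only
    return ['appendonly no', 'save 900 1', 'save 300 10', 'save 60 10000']

for line in persistence_directives(5):
    print(line)
```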
Final Thoughts
If you’re starting a new project, consider Valkey. If you’re on Redis, read the migration article and plan your move. The ecosystem is moving on. The clouds are moving on. You should too.
Welcome to the post-Redis world. Caching is faster here.
Check the official Valkey docs for deep dives.