The Poor Man's Homelab

I run everything from a closet. One NAS, one server, one switch. No cloud bills. No orchestration platform that needs orchestrating. No vendor telling me how to structure my services.

This is not a tutorial. This is a report from the field.

The Economics

The server pulls about 25W idle, 45W under load. The NAS pulls 15W most of the time. Call it 30W average across both machines.

30W × 24 hours × 365 days = 262.8 kWh per year.

At €0.20/kWh (France average), that’s €53 annually. Add another €10 for the switch and peripherals. Round it to €63/year for electricity.

Hardware cost: €350 for the server (used), €400 for the NAS (motherboard, CPU, RAM, case), €120 for the switch, €400 for drives. Total: €1270 upfront.

Amortize that over five years because hardware doesn’t die on schedule. €254/year. Add electricity. €317/year total cost.

Now compare that to running equivalent services in the cloud.

A modest VPS with 4GB RAM and 100GB storage runs €20-40/month depending on provider. That’s the bare minimum. You’re not running PostgreSQL, Valkey, file storage, photo management, and document scanning on that. Scale it properly and you’re looking at €80-120/month for compute alone.

Add managed PostgreSQL: €15-50/month depending on size and provider. Add object storage for photos and documents: €5-20/month depending on volume. Add bandwidth charges for accessing your own data: easily €10-30/month if you actually use your services.

Conservative estimate: €110/month. €1320/year.

My homelab: €317/year.

The cloud costs roughly 4× as much per year. And that’s before considering that cloud pricing is sticky upward while hardware is a one-time cost.

This math doesn’t include the VPS I rent for €4/month to run frps for remote WireGuard access. Add that if you want: €48/year brings the total to €365. It doesn’t change the conclusion.
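The arithmetic in this section can be replayed in a few lines. This is a sketch using only the figures quoted above; the variable names are illustrative, and the 30 W average is taken as given rather than measured here.

```python
# Replays the cost comparison with the figures quoted in this section.
HOURS_PER_YEAR = 24 * 365

average_watts = 30            # combined average draw, as stated above
eur_per_kwh = 0.20            # quoted electricity price
switch_and_peripherals = 10   # EUR per year, as stated above

annual_kwh = average_watts * HOURS_PER_YEAR / 1000
electricity_eur = annual_kwh * eur_per_kwh + switch_and_peripherals

hardware_eur = 350 + 400 + 120 + 400   # server, NAS, switch, drives
amortized_eur = hardware_eur / 5       # five-year amortization

homelab_eur_per_year = amortized_eur + electricity_eur
cloud_eur_per_year = 110 * 12          # the conservative cloud estimate
vps_eur_per_year = 4 * 12              # optional frps VPS

ratio = cloud_eur_per_year / homelab_eur_per_year
ratio_with_vps = cloud_eur_per_year / (homelab_eur_per_year + vps_eur_per_year)

print(f"homelab: EUR {homelab_eur_per_year:.0f}/year")
print(f"cloud:   EUR {cloud_eur_per_year}/year")
print(f"ratio:   {ratio:.1f}x ({ratio_with_vps:.1f}x with the VPS included)")
```

Changing the amortization period or the electricity price is a one-line edit, which makes it easy to see how sensitive the conclusion is to each assumption.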

The cloud makes sense when you have variable load, need geographic distribution, or don’t want to think about hardware. I have consistent load, single-region needs, and hardware doesn’t bother me. The economics are obvious.

The Hardware

Server: Intel i5-9500T, six cores, 16GB RAM. Low TDP, decent single-thread, cheap used. Runs Debian. Bought it for €350 because the previous owner upgraded.

NAS: Intel N100, four cores, 8GB RAM, two HDD bays, two NVMe slots. Purpose-built low-power board. Runs Open Media Vault. Cost €400 for the board, CPU, RAM, and case. Add drives separately.

Switch: Basic managed switch with VLAN support. €120. Nothing fancy.

This is deliberately unimpressive hardware. High-end would cost more and sit idle. Low-end would struggle. This is the correct amount of hardware for me.

The server handles compute. APIs, Traefik, Valkey, monitoring, background jobs. The NAS handles storage and databases. PostgreSQL runs on NVMe with DRAM cache. Fast enough for anything I throw at it until I throw something genuinely demanding at it.

The Split

Separating compute and storage is not dogma. It’s practical.

The server crashes? Data persists on the NAS. The NAS reboots for updates? Services keep running, just without database access until it’s back. This is not high availability. This is basic fault isolation.

PostgreSQL lives on the NAS because that’s where the fast storage is. NVMe with DRAM cache. Valkey runs on the server for low-latency access. Network latency over a dedicated VLAN is negligible for my use case.

Application state lives in the database. User uploads live in datasets on the NAS. Container state lives nowhere because containers are ephemeral. This separation makes backups straightforward and restores possible.

Docker Compose

Kubernetes is for problems I don’t have.

Every service is a compose file. Traefik, APIs, workers, monitoring. Start with docker compose up -d. Stop with docker compose down. Update with git pull && docker compose up -d --build.

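services:
  api:
    build: .
    restart: unless-stopped
    environment:
      DATABASE_URL: postgresql://nas.internal:5432/app
      # Inside a container, localhost is the container itself, so address
      # Valkey by its service name on the shared backend network
      # (assumes the Valkey service is named "valkey").
      VALKEY_URL: valkey://valkey:6379
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`api.example.com`)"
      - "traefik.http.routers.api.entrypoints=websecure"
      - "traefik.http.routers.api.tls.certresolver=cloudflare"
    networks:
      - web
      - backend

# Assumes both networks were created outside this file
# (e.g. docker network create web) so separate compose projects can share them.
networks:
  web:
    external: true
  backend:
    external: true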

No Helm. No operators. No custom resources. A container, environment variables, and labels that tell Traefik how to route traffic.

This scales by running more containers. It load balances through Traefik. It monitors through whatever you can fit in a compose file. Prometheus works. So does a cron job that curls endpoints and emails you if they’re down.
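The cron-job approach can be sketched in Python using only the standard library. The function names, addresses, and URLs below are hypothetical, and the probe is injectable so the logic can be exercised without touching the network.

```python
import urllib.error
import urllib.request
from email.message import EmailMessage


def is_up(url, timeout=10):
    """Return True if the endpoint answers with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, HTTP 4xx/5xx, or bad URL.
        return False


def find_down(urls, probe=is_up):
    """Return the endpoints that failed the probe, in the order given."""
    return [url for url in urls if not probe(url)]


def build_alert(down, sender, recipient):
    """Build the alert email, or return None when everything is up."""
    if not down:
        return None
    message = EmailMessage()
    message["Subject"] = f"{len(down)} endpoint(s) down"
    message["From"] = sender
    message["To"] = recipient
    message.set_content("No healthy response from:\n" + "\n".join(down))
    return message
```

A cron entry would call find_down with the real endpoint list and, when build_alert returns a message, hand it to smtplib.SMTP(...).send_message(). Which mail relay to use is left open here because the setup above does not specify one.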

Traefik

Traefik runs on the server. Handles everything inbound. TLS termination, routing, rate limiting, fail2ban for brute-force protection. Discovers services through Docker labels.

Add a container with the right labels and it goes live. Remove the container and it’s gone. Configuration is static YAML for the core, dynamic labels for services.

services:
  traefik:
    image: traefik:latest
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./traefik.yml:/traefik.yml:ro
      - ./acme.json:/acme.json
    networks:
      - web

Let’s Encrypt certificates via DNS challenge through Cloudflare. Renews automatically. HTTP redirects to HTTPS automatically. This is infrastructure that doesn’t need attention.

Traefik also forwards to NAS services. Paperless, Immich, and Gitea run on the NAS, but Traefik on the server is the entry point. One reverse proxy. One TLS endpoint. One place to check logs when something breaks.

# Static route to NAS service
http:
  routers:
    paperless:
      rule: "Host(`docs.internal.example.com`)"
      service: paperless-svc
      tls:
        certResolver: cloudflare
  services:
    paperless-svc:
      loadBalancer:
        servers:
          - url: "http://nas.internal:8000"

Deployment Workflow

Personal projects build through Gitea Actions. Code pushed to the local Gitea repository triggers runners that build container images. Watchtower monitors for new images and restarts services automatically. No manual deploys. Push to main, wait two minutes, service updates.

This is CI/CD without the enterprise overhead. Local repository. Local runners. Local registry. Fast feedback loop.

Tunnels and Exposure

Nothing faces the internet directly. Everything behind NAT.

Public APIs use cloudflared tunnels. Cloudflare’s edge terminates TLS, handles DDoS, caches GET requests. The tunnel connects outbound from my network to Cloudflare. No port forwarding. No exposed ports. No one knows my home IP.

services:
  cloudflared:
    image: cloudflare/cloudflared:latest
    restart: unless-stopped
    command: tunnel --no-autoupdate run
    environment:
      TUNNEL_TOKEN: ${TUNNEL_TOKEN}
    networks:
      - web

Traffic flows: Internet → Cloudflare edge → tunnel → Traefik → service. Cloudflare caches aggressively. Responses are fast. Origin traffic is low. This is what edge infrastructure is for.

Personal services (Kavita, Paperless, Immich, Syncthing) don’t need public access. They go through WireGuard. Encrypted point-to-point from my phone or laptop to the homelab.

But WireGuard needs a public endpoint. That’s the VPS.

The VPS Strategy

A €4/month VPS running frps. That’s it. No services, no data, just a proxy daemon.

The VPS has a public IP. My home network runs frpc (the client). It connects outbound to the VPS. WireGuard listens on the home server but is accessible through the VPS.

[My Phone] → [VPS:51820] → [frps] → [frpc] → [Home Server:51820] → [WireGuard]

Once WireGuard connects, I’m on the home network. Traefik routes traffic. Services are accessible. The VPS knows nothing about the services. It’s a dumb pipe with a public IP.

If the VPS dies, spin up another one, update DNS, done. Five minutes.

Firewalling

UFW on both machines. Default deny. Explicit allows.

# Server
ufw default deny incoming
ufw default allow outgoing
ufw allow from 192.168.10.0/24 to any port 22
ufw allow from 192.168.10.0/24 to any port 80
ufw allow from 192.168.10.0/24 to any port 443
ufw allow 51820/udp
ufw enable

The NAS is more restrictive. Only the server and my workstation can reach it. No internet-facing services. No exceptions.

Docker bypasses UFW because it writes its own iptables rules. This is annoying. The fix: bind services to localhost or specific interfaces.

ports:
  - "127.0.0.1:5432:5432"

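Now the port is reachable only from the host itself or through Docker networks, not from the broader network. That suits a database whose clients run on the same machine. PostgreSQL on the NAS has to accept connections from the server, so there the binding uses the NAS’s address on the dedicated VLAN instead of 127.0.0.1.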

Cloudflare

Cloudflare does four things:

DNS: Fast, free, API-accessible.
Edge caching: GET requests cached close to users.
DDoS protection: Someone hammers my API, Cloudflare absorbs it.
TLS: Let’s Encrypt via DNS challenge.

The free tier covers all of this. I’m not paying Cloudflare. Cloudflare isn’t paying me. This is just pragmatic use of available infrastructure.

Cloudflare sees all traffic to public APIs. That’s the trade-off. If that bothers you, run your own edge. It doesn’t bother me enough to justify the complexity.

Services

Server runs:

Traefik
Valkey
cloudflared
frpc
Public APIs
Background workers
Monitoring (Prometheus, Grafana, Uptime Kuma)
Watchtower (automated container updates)
Gitea Actions runners

NAS runs:

PostgreSQL
Paperless-ngx
Immich
Gitea
Kavita
Syncthing
Automated backups

Each service isolated. Each can fail, restart, or update independently. This is the correct granularity for infrastructure you maintain alone.

Backups

Backups are not negotiable. They’re the first thing you set up, not the last.

Two backup targets because one size does not fit all data.

S3 Glacier: Photos and media. Large files. Rarely accessed. Cheap storage, expensive retrieval, slow restores. Glacier charges $0.0036/GB/month for storage. Retrieval takes 3-5 hours for standard, 12 hours for bulk. This is acceptable for data I hope to never need but absolutely will need eventually.

Backblaze B2: Databases and configuration. Small files. Frequent changes. Fast retrieval. $0.005/GB/month for storage. Download your data and you’re done. No waiting. This is what you use when you need your database backup now, not in four hours.

The split is deliberate. Photos from 2019? Wait for Glacier. PostgreSQL dump from last night? Backblaze has it ready.
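To get a feel for what the two storage rates mean per month, here is a small estimate. The per-gigabyte prices are the ones quoted above; the data sizes are hypothetical placeholders, not figures from this setup. Retrieval, request, and egress fees are left out.

```python
# Storage-only estimate using the per-GB rates quoted above.
GLACIER_USD_PER_GB_MONTH = 0.0036
B2_USD_PER_GB_MONTH = 0.005


def monthly_storage_cost(size_gb, usd_per_gb_month):
    """Monthly storage cost in USD, ignoring retrieval and request fees."""
    return size_gb * usd_per_gb_month


# Hypothetical sizes, chosen only to illustrate the calculation.
media_gb = 2000      # photos and media sent to Glacier
database_gb = 20     # dumps and configuration sent to B2

glacier_cost = monthly_storage_cost(media_gb, GLACIER_USD_PER_GB_MONTH)
b2_cost = monthly_storage_cost(database_gb, B2_USD_PER_GB_MONTH)

print(f"Glacier: ${glacier_cost:.2f}/month for {media_gb} GB")
print(f"B2:      ${b2_cost:.2f}/month for {database_gb} GB")
```

Even with sizes this lopsided, the fast tier stays cheap because only the small, frequently changing data lives there.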

What Gets Backed Up

Daily:

PostgreSQL dumps (compressed, encrypted)
Application config directories
Docker Compose files
Traefik config and certificates

Weekly:

Photos and uploads (incremental)
Gitea repositories
Paperless documents

Never:

Containers (rebuild from compose files)
Caches (Valkey, application caches)
Logs older than 30 days
System packages
Media downloaded from external sources

If I can recreate it in under an hour, I don’t back it up. Storage and bandwidth cost money. Spend them on irreplaceable data.

Encryption and Retention

Everything encrypted before leaving the network. restic handles encryption and deduplication. GPG key on a Yubikey, backup copy on paper in a safe.

Retention:

Daily: keep 7
Weekly: keep 4
Monthly: keep 12
Yearly: keep indefinitely

This gives me a week of daily history, a month of weekly snapshots, a year of monthly snapshots, and long-term yearly archives. Enough to recover from mistakes. Not so much that I’m paying to store redundant backups.
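The effect of that policy can be simulated. The sketch below keeps the newest snapshot in each of the last 7 days, 4 ISO weeks, and 12 months, plus the newest snapshot of every year. It illustrates the general idea and is not a copy of restic’s implementation; the function name and defaults are illustrative.

```python
from datetime import date, timedelta


def select_keep(snapshots, daily=7, weekly=4, monthly=12):
    """Return the snapshot dates a 7/4/12/all-yearly policy would keep."""
    newest_first = sorted(set(snapshots), reverse=True)
    keep = set()

    def keep_latest_per_bucket(bucket_of, limit):
        # Walk from newest to oldest, keeping the first (newest) snapshot
        # seen in each bucket until `limit` buckets are filled.
        seen = []
        for day in newest_first:
            bucket = bucket_of(day)
            if bucket in seen:
                continue
            if limit is not None and len(seen) >= limit:
                break
            seen.append(bucket)
            keep.add(day)

    keep_latest_per_bucket(lambda d: d, daily)
    keep_latest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weekly)
    keep_latest_per_bucket(lambda d: (d.year, d.month), monthly)
    keep_latest_per_bucket(lambda d: d.year, None)  # yearly: no limit
    return keep


# Two years of nightly snapshots, ending 2024-12-31.
start = date(2023, 1, 1)
nightly = [start + timedelta(days=n) for n in range(731)]
kept = select_keep(nightly)
print(f"{len(kept)} of {len(nightly)} snapshots kept")
```

The rules overlap (the newest snapshot counts as a daily, weekly, monthly, and yearly at once), so the number kept is well below 7 + 4 + 12 plus the yearlies. In restic, the same limits map onto the --keep-daily, --keep-weekly, --keep-monthly, and --keep-yearly options of restic forget.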

Failure Scenarios

NAS dies: Restore PostgreSQL from Backblaze. Valkey runs on the server so it’s unaffected. Services back online in 2-4 hours. Photos can wait.

Server dies: Valkey cache is lost but rebuilds on restart. PostgreSQL on NAS is safe. Spin up replacement server, deploy containers, point to existing database. Downtime: 1-2 hours.

Data corruption: Restore from most recent clean backup. Might lose up to 24 hours of data. Acceptable for personal infrastructure.

Both NAS and all backup targets fail simultaneously: Accept that catastrophic cascading failures of independent systems mean larger problems exist.

Backups are boring. They run in the background. You don’t think about them until you need them. Then they’re the only thing that matters.

Test your restores. I do this every year. Not because I enjoy it. Because restoring for the first time during an emergency is how you discover your backups don’t work.

Hardware Reality

The i5-9500T has six cores and idles most of the time. Average CPU usage: 12-18%. The bottleneck is never CPU. It’s network or disk or, most commonly, waiting for external APIs.

16GB RAM is enough because I’m running services, not databases for millions of users. PostgreSQL gets 2GB. Valkey gets 512MB. Everything else shares the rest. zram provides compressed memory when needed. No file swap.

The N100 on the NAS is even lighter. Four cores, 8GB RAM. Handles PostgreSQL and file operations without stress. This CPU costs €120. It’s designed for low power, not high performance. It’s adequate because the workload is not demanding.

The NVMe slots on the NAS are critical. NVMe with DRAM cache means database operations are fast. HDD bays are for bulk storage. Photos, documents, media. Speed matters less. Cost per gigabyte matters more.

Where This Breaks

Video encoding would choke these machines. So would transcoding multiple 4K streams. So would training ML models or running simulation workloads.

I don’t do those things. If I did, I’d rent GPU time or buy different hardware. You optimize for your workload, not someone else’s.

Kubernetes would struggle on this hardware. The control plane alone wants 2GB RAM. etcd wants more. The overhead is not justified by any benefit for single-machine deployments.

Running databases on the NAS makes sense because the NAS has NVMe. If the NAS had spinning rust only, this would be a different conversation.

Backups run at 3 AM because they saturate the network and use CPU for compression. During the day, services have priority. At night, backups get the machine to themselves. This is not optimal. This is working within constraints.

Performance Myths

“You need ECC RAM.” For what? On the server, I’m running web services, not financial transactions. Non-ECC is fine. On the NAS, depends on your filesystem and paranoia level.

“You need 10GbE networking.” For most homelabs, gigabit is plenty. Database queries and NFS file access work fine over it. Backups are the only thing that saturates gigabit, and they run when nothing else is happening.

“Low-power CPUs are too slow.” Too slow for what? If you’re network-bound or I/O-bound, faster CPUs just idle at higher clock speeds.

The hardware is sufficient because the architecture acknowledges what the hardware can do. Design within your constraints and most constraints stop being limitations.

Monitoring

Uptime Kuma pings services every five minutes. HTTP checks, TCP checks, keyword checks. Something down? Notification sent. Something slow? Notification sent. Something returning errors? Notification sent.

Prometheus scrapes metrics. Grafana displays them. I look at dashboards maybe once a week unless something breaks. Disk usage, memory usage, request rates, error rates.

No distributed tracing. No APM vendor. No log aggregation platform. Logs go to stdout. Docker captures them. docker compose logs -f service_name when needed.

This is sufficient. Enterprise monitoring is for enterprises. Homelabs need to know when something is broken and have enough context to fix it. Uptime Kuma and Prometheus provide that.

What This Is Not

Not highly available. Single points of failure everywhere. The server dies, services stop. The NAS dies, data access stops. The switch dies, everything stops. Acceptable downtime for personal infrastructure is measured in hours, not seconds.

Not zero-configuration. You need to understand Docker, networking, DNS, and TLS. If you don’t, acquire that knowledge before running infrastructure that matters.

Not secure against determined attackers. This is secure against opportunistic scanning and automated exploitation. A targeted attack with resources would succeed. That’s not my threat model.

Not free. Hardware costs money upfront. Electricity costs money monthly. Time has an opportunity cost. But it’s cheaper than the cloud and I control it entirely.

Why This Works

The cloud optimizes for variable load and horizontal scale. I have consistent load and vertical sufficiency.

Docker Compose optimizes for services I can understand and deploy myself. Kubernetes optimizes for services deployed by teams with CI/CD pipelines.

Cloudflare optimizes for protecting public services from the internet. My homelab optimizes for not being on the internet directly.

WireGuard and frp together solve the NAT traversal problem without punching holes in the firewall or running complex VPN infrastructure.

The pieces fit because they’re simple pieces solving specific problems. No platforms. No frameworks. No “solutions.”

What Breaks

Cloudflare tunnels disconnect occasionally. They reconnect automatically. Five seconds of downtime. Monthly. I’ve accepted this.

Docker image updates break things. Pin versions for anything that matters. Watchtower handles the rest.

Open Media Vault updates require occasional reboots. Reboots mean downtime. Schedule them. Do them at 3 AM.

Home internet goes down. Everything goes down. I have a 5G failover. It’s slow and capped. It’s better than nothing.

Power outages happen. UPS on both machines and the switch. Ten minutes runtime. Enough to shut down gracefully or ride out brief outages. Not enough to survive extended outages. I’ve made peace with this.

What I’d Change

Eight-bay NAS instead of two bays. Storage needs grow. More bays bought upfront is cheaper than migrating later.

Separate physical machine for public services. Current setup mixes them on the server. Works but inelegant. A second low-power box for isolated public workloads would be cleaner.

Better documentation from day one. I know how it works because I built it. I’ve forgotten details twice. A network diagram and runbook would save time.

Not much else. This setup evolved through operation. It works. That’s the validation that matters.

Why Not the Cloud

The cloud is rented capacity. I’m building owned capability.

The cloud charges for egress. I access my photos constantly. That would cost real money. On the NAS it’s free.

The cloud optimizes for businesses. I’m not a business. My needs are different. My economics are different.

The cloud makes sense for some workloads. Bursty traffic. Geographic distribution. Compliance requirements. Avoiding hardware responsibility.

My workload is steady. Single-region. No compliance beyond “don’t lose my photos.” I’m comfortable with hardware. The cloud doesn’t make sense here.

This is not ideology. This is economics and engineering.

What This Is

Infrastructure that fits its purpose. Services run. Data persists. Costs stay low. I stay in control.

A homelab for running services, not cosplaying as a datacenter. Working software, not perfect architecture.

A setup that acknowledges trade-offs explicitly. High availability costs money and complexity. Geographic redundancy costs more. I’ve chosen low cost and acceptable reliability. That’s valid.

A reminder that cloud-scale problems need cloud-scale tools. Non-cloud-scale problems do not. Docker Compose is sufficient. A single server is sufficient. A homelab is sufficient.

It works. It keeps working. That’s all.
