Reproducible Edge Kubernetes: Unifying Host and Workload with NixOS and K3s


These articles are AI-generated summaries. Please check the original sources for full details.

Code In, Cluster Out: Building Reproducible Edge Kubernetes with NixOS, K3s, and Forgejo

Developer Ces built a four-repository GitOps architecture that eliminates edge-node drift by treating the entire stack, from host to workload, as a single reproducible function. The system deploys a RustDesk workload across a Raspberry Pi control plane and Oracle edge nodes, with NixOS providing host-level determinism.

Why This Matters

Traditional edge deployments often suffer from snowflake nodes, where manual fixes create configuration drift that Kubernetes alone cannot correct. By extending declarative intent below the container layer to include the kernel and OS state, engineers can make the host environment as reproducible as the application itself. This model is critical for constrained edge environments where direct SSH access is risky and recovery must be a boring, automated process rather than an exercise in tribal knowledge. Moving from imperative cloud-init or Ansible scripts to a declarative NixOS substrate means that a failed node is rebuilt identically instead of being debugged by hand.

Key Insights

  • Cluster-level reproducibility starts too late if the node itself remains imperative, leading to silent configuration drift before Kubernetes even starts.
  • The system uses a four-repository split (infrastructure-nixos, edge-cluster-infra, infrastructure-secrets, and nix-k3s-edge-cluster) to establish explicit interfaces between cloud facts and runtime behavior.
  • Colmena acts as the Nix-native convergence tool, bridging the gap between infrastructure provisioning with OpenTofu/Terramate and workload orchestration with K3s.
  • Self-hosting the control plane via Forgejo on NixOS ensures that the GitOps path remains an owned operational asset, while GitHub serves only as a public push mirror.
  • Real-world workload validation using RustDesk forces critical decisions on host-backed persistence, Tailscale-first networking, and hostNetwork integration.
  • Backups are layered: Restic covers the control plane repositories, while OpenTofu state is archived as timestamped local snapshots before each cloud apply.
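
The timestamped-snapshot idea in the last insight can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the function name `snapshot_state`, the archive directory layout, and the `terraform.tfstate` filename used in the usage note are all assumptions made for the example.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def snapshot_state(state_file: Path, archive_dir: Path,
                   now: Optional[datetime] = None) -> Path:
    """Copy state_file into archive_dir under a UTC-timestamped name.

    Hypothetical helper: the naming scheme is an assumption for this sketch.
    Returns the path of the archived copy.
    """
    if not state_file.is_file():
        raise FileNotFoundError(f"no state file at {state_file}")

    # Normalize to UTC so snapshot names sort chronologically across machines.
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = moment.strftime("%Y%m%dT%H%M%SZ")

    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{state_file.name}.{stamp}"

    # copy2 preserves file metadata (mtime) alongside the contents.
    shutil.copy2(state_file, target)
    return target
```

A wrapper script might call `snapshot_state(Path("terraform.tfstate"), Path("state-snapshots"))` immediately before running an apply, so that every cloud change has a restorable predecessor. Two snapshots taken within the same second would share a name and the later one would overwrite the earlier; a real setup would also need the off-machine copy step, which this sketch leaves out.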

Practical Applications

  • Use Case: Remote edge nodes in air-gapped or semi-connected environments where host reproducibility ensures deterministic rebuilds without physical access. Pitfall: Treating bootstrap as a minor detail rather than a first-class architectural phase, leading to lost access during the transition from SSH to Tailscale.
  • Use Case: Regulated systems requiring full configuration provenance from the OS kernel to the application layer for audit compliance. Pitfall: Prematurely adopting remote backends for infrastructure state in single-operator setups, which adds complexity that centralized local state with automated off-machine copies avoids.
  • Use Case: Multi-architecture clusters involving ARM Raspberry Pis and x86 Oracle nodes managed through a unified Nix-native deployment flow. Pitfall: Relying on imperative tools like Ansible for host setup, which neither prevent drift nor provide atomic system rollbacks.
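
To make the notion of drift concrete, the sketch below fingerprints a declared host configuration and compares it with an observed one. It is only an analogy for the "same inputs, same system" property described above, not how NixOS or Colmena work internally; the function names and the configuration keys (`kernel`, `k3s`, `manual_fix`) are hypothetical.

```python
import hashlib
import json

_MISSING = object()  # sentinel so "key absent" differs from "key set to None"


def fingerprint(config: dict) -> str:
    """Return a stable SHA-256 digest of a JSON-serializable config.

    Keys are sorted before hashing, so two configs with the same content
    produce the same digest regardless of key order.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_keys(declared: dict, observed: dict) -> list[str]:
    """List top-level keys whose values differ between the two states."""
    keys = set(declared) | set(observed)
    return sorted(
        k for k in keys
        if declared.get(k, _MISSING) != observed.get(k, _MISSING)
    )


# Hypothetical example: one manual fix on the node shows up as drift.
declared = {"kernel": "6.6", "k3s": {"role": "agent"}}
observed = {"kernel": "6.6", "k3s": {"role": "agent"}, "manual_fix": True}
print(fingerprint(declared) == fingerprint(observed))
print(drifted_keys(declared, observed))
```

An imperative tool can reapply tasks, but nothing in it guarantees that the observed state matches a single declared value. The declarative approach described in this article sidesteps the comparison entirely by rebuilding the host from the declaration.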
