Optimizing Multi-Subnet Kubernetes Networking with Tailscale and Cilium eBPF

These articles are AI-generated summaries. Please check the original sources for full details.

An Iximiuz Cluster of Clusters with Tailscale and Cilium

Engineer Adam Leskis architected a multi-subnet Kubernetes cluster using Tailscale for cross-playground connectivity and Cilium for eBPF-based observability. The project integrated 12 nodes across 5 different subnets to test the limits of overlay-on-overlay networking.

Why This Matters

The project exposes the performance degradation inherent in nested overlay networks, specifically Cilium's VXLAN overlay encapsulated within Tailscale tunnels. Real-world constraints, such as nodes in different subnets sharing a public IP, forced Tailscale into relayed connections, introducing enough jitter to break KEDA scaling and metrics collection. Flat network abstractions are powerful, but latency and jitter in the underlying transport can invalidate higher-level orchestration features such as the Horizontal Pod Autoscaler (HPA) and service mesh telemetry.

Key Insights

  • Cilium’s eBPF-based service mesh was selected over Istio or Envoy to minimize resource usage on iximiuz Labs VMs, which are limited to 4 CPUs and 8 GiB of RAM.
  • Nested overlay networking (VXLAN over Tailscale) across 5 subnets caused intermittent connectivity failures due to relayed Tailscale connections and high jitter.
  • The hubble-gazer tool was developed using Go and React to consume Hubble-Observatory data via Server-Sent Events (SSE) rather than WebSockets for live traffic visualization.
  • Network stability was achieved only after consolidating all worker nodes into a single subnet and restricting Tailscale to control-plane-to-worker communication.
  • The environment utilized iximiuz Labs playgrounds, which impose an 8-hour maximum lifetime, requiring automated scripting for cluster bootstrapping.

Practical Applications

  • Monitoring L4/L7 traffic and DNS queries in real-time using Cilium Hubble and a custom SSE-based frontend like hubble-gazer.
  • Pitfall: Double encapsulation (VXLAN over WireGuard/Tailscale) in high-latency environments leads to metrics server timeouts and HPA failures.
  • Deploying a distributed K8s control plane across disparate networks using Tailscale hostnames for node addressing.
  • Pitfall: Relying on relayed Tailscale connections, which occur when nodes in different subnets share a public IP, introduces excessive latency for cluster-internal traffic.
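
One way to catch the relayed-connection pitfall before bootstrapping a cluster is to inspect the output of `tailscale status --json`. The sketch below is an assumption-laden illustration, not something from the original project: it assumes each entry under `Peer` carries `HostName`, `CurAddr`, `Relay`, and `Online` fields and that an empty `CurAddr` means no direct path is in use, which should be checked against the installed Tailscale version. The sample hostnames and addresses are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// peer holds the subset of per-peer fields this sketch relies on.
// Field names are assumed to match the `tailscale status --json` output.
type peer struct {
	HostName string
	CurAddr  string // assumed empty when no direct path is in use
	Relay    string // DERP region code
	Online   bool
}

// status mirrors only the top-level "Peer" map of the status document.
type status struct {
	Peer map[string]peer
}

// relayedPeers returns the sorted hostnames of online peers that have no
// direct address, meaning their traffic is going through a relay.
func relayedPeers(raw []byte) ([]string, error) {
	var st status
	if err := json.Unmarshal(raw, &st); err != nil {
		return nil, err
	}
	var names []string
	for _, p := range st.Peer {
		if p.Online && p.CurAddr == "" {
			names = append(names, p.HostName)
		}
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	// Hypothetical sample; in practice, pipe in `tailscale status --json`.
	sample := []byte(`{
	  "Peer": {
	    "key-a": {"HostName": "worker-1", "CurAddr": "192.0.2.10:41641", "Relay": "fra", "Online": true},
	    "key-b": {"HostName": "worker-2", "CurAddr": "", "Relay": "fra", "Online": true}
	  }
	}`)
	names, err := relayedPeers(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("relayed peers:", names)
}
```

A bootstrap script could run a check like this after joining the tailnet and refuse to place workers on any node that only reaches its peers through a relay.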
