mastering ckad certified kubernetes application developer

StatefulSets, DaemonSets, and When to Use Each

11 min read Chapter 69 of 87
Summary

Covers StatefulSets in depth: stable Pod identity with ordinal naming, headless Service for stable DNS, volumeClaimTemplates for per-Pod persistent storage, ordered vs parallel Pod management, and update strategies. Covers DaemonSets: one Pod per node, use cases for cluster-wide agents, nodeSelector for targeted scheduling, update strategies, and tolerations. Includes complete YAML manifests for both resource types and three hands-on exercises.

Deployments treat Pods as interchangeable cattle — any Pod can be replaced by any other Pod without affecting the application. This works for stateless web servers, API gateways, and worker processes. It fails catastrophically for databases, message brokers, and distributed consensus systems that depend on stable identities and persistent storage.

Kubernetes provides two specialized workload controllers for scenarios where Deployments fall short: StatefulSets for applications that need stable identity and storage, and DaemonSets for applications that need exactly one Pod on every node.

StatefulSets

The Problem with Deployments for Stateful Apps

Consider deploying a three-node PostgreSQL cluster with streaming replication. Each node needs:

  • A stable hostname so replicas can connect to the primary at a predictable address
  • Persistent storage that survives Pod restarts and rescheduling — and that stays bound to the same Pod identity
  • Ordered startup so the primary initializes before replicas attempt to connect

A Deployment provides none of these. Pod names end in a generated suffix (postgres-7d6f8b5c4d-xkm2p) that changes whenever a Pod is replaced. All replicas share the same PVC unless you create and wire up separate claims by hand. Pods start and stop in arbitrary order. A StatefulSet addresses all three requirements.

StatefulSet Guarantees

A StatefulSet provides three guarantees that Deployments do not:

1. Stable, unique Pod identity. Each Pod gets a predictable name derived from the StatefulSet name and an ordinal index: web-0, web-1, web-2. This name persists across restarts and rescheduling. If web-1 is deleted, the replacement Pod is also named web-1.

2. Stable, persistent storage. Each Pod gets its own PersistentVolumeClaim through volumeClaimTemplates. The PVC is named <template-name>-<pod-name> (e.g., data-web-0). When a Pod is rescheduled to a different node, it reattaches to the same PVC, provided the underlying volume can be reached from the new node (network-attached storage usually can; node-local volumes cannot). When a Pod is deleted, its PVC is not deleted, so the data persists.

3. Ordered, graceful management. By default, Pods are created in order (0, 1, 2) and terminated in reverse order (2, 1, 0). Each Pod must be Running and Ready before the next Pod is created. This ensures the primary database node is available before replicas attempt to connect.

Pod Naming and Ordinal Index

StatefulSet Pods follow the naming pattern <statefulset-name>-<ordinal>:

web-0    # First Pod (ordinal 0)
web-1    # Second Pod (ordinal 1)
web-2    # Third Pod (ordinal 2)

The ordinal index is stable — if web-1 fails and is replaced, the new Pod is still web-1. This stability allows applications to embed identity into their configuration. A Kafka broker can derive its broker ID from the ordinal. A Redis Cluster node can derive its slot assignment.
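One way an application can read its own identity is the Downward API. The fragment below is a minimal sketch that exposes the Pod name (for example web-1) as an environment variable. The variable name POD_NAME is an arbitrary choice for illustration, and parsing the ordinal out of the name is left to the application.

```yaml
# Fragment of the StatefulSet Pod template (spec.template.spec).
# POD_NAME is a hypothetical variable name, not something Kubernetes sets by itself.
containers:
  - name: nginx
    image: nginx:1.25
    env:
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name   # resolves to web-0, web-1, web-2, ...
```

An entrypoint script could then strip everything up to the last hyphen to obtain the ordinal.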

Headless Service Requirement

A StatefulSet requires a headless Service — a Service with clusterIP: None. This Service doesn’t load-balance traffic to a random Pod. Instead, it creates individual DNS records for each Pod:

web-0.nginx-headless.default.svc.cluster.local
web-1.nginx-headless.default.svc.cluster.local
web-2.nginx-headless.default.svc.cluster.local

The DNS record format is <pod-name>.<service-name>.<namespace>.svc.cluster.local. Each record resolves to the Pod’s IP address, allowing other Pods to connect to a specific StatefulSet member by name.

The headless Service is defined separately from the StatefulSet and linked via the serviceName field:

apiVersion: v1
kind: Service
metadata:
  name: nginx-headless
spec:
  clusterIP: None
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
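To illustrate how a client addresses one specific member through these DNS records, here is a sketch of a throwaway Pod that fetches a page from web-0 only. The Pod name web-client is hypothetical, and the example assumes the StatefulSet web and the Service nginx-headless run in the default namespace of a cluster that uses the default cluster.local domain.

```yaml
# Hypothetical one-shot client; not part of the chapter's manifests.
apiVersion: v1
kind: Pod
metadata:
  name: web-client
spec:
  restartPolicy: Never
  containers:
    - name: client
      image: busybox:1.36
      # Requests always reach web-0, never web-1 or web-2.
      command: ["wget", "-qO-", "http://web-0.nginx-headless.default.svc.cluster.local"]
```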

volumeClaimTemplates

Instead of referencing an existing PVC, a StatefulSet defines PVC templates that generate a unique PVC for each Pod:

volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi

This template creates PVCs named data-web-0, data-web-1, data-web-2. Each PVC is bound to its respective Pod and, by default, is not deleted when the StatefulSet is scaled down or deleted. This is intentional: persistent data should survive workload changes.

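To reclaim storage after a StatefulSet is deleted, manually delete the PVCs. Whether the underlying volumes are then removed depends on the reclaim policy of each PersistentVolume: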

kubectl delete pvc data-web-0 data-web-1 data-web-2
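Recent Kubernetes releases also let the StatefulSet clean up its own PVCs through the persistentVolumeClaimRetentionPolicy field. The fragment below is a sketch that assumes your cluster version supports this field; check the API reference for your release before relying on it.

```yaml
# Fragment of a StatefulSet spec. Assumes a cluster version that supports
# persistentVolumeClaimRetentionPolicy.
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete   # remove the PVCs when the StatefulSet itself is deleted
    whenScaled: Retain    # keep the PVCs of Pods removed by a scale-down (the default)
```

Both settings default to Retain, which matches the behavior described above.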

Complete StatefulSet YAML

apiVersion: v1
kind: Service
metadata:
  name: nginx-headless
  labels:
    app: nginx
spec:
  clusterIP: None
  selector:
    app: nginx
  ports:
    - port: 80
      name: web
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx-headless
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi

Key fields:

  • serviceName — must match the headless Service name. This field is required; the StatefulSet won’t be created without it.
  • selector.matchLabels — must match the Pod template labels, same as Deployments.
  • volumeClaimTemplates — defines PVCs that are created per Pod. The name here (data) matches the volumeMounts[].name in the container spec.

podManagementPolicy

The default policy is OrderedReady — Pods are created sequentially (0, then 1, then 2), each waiting for the previous Pod to be Running and Ready.

For applications that don’t require ordered startup, set Parallel:

spec:
  podManagementPolicy: Parallel
  replicas: 3

With Parallel, all Pods are created or terminated simultaneously, like a Deployment. This reduces startup time but removes the ordering guarantee. The policy affects only scaling operations; rolling updates are not affected. Use Parallel when Pods are independent, for example a StatefulSet used purely for stable storage identifiers without inter-Pod dependencies.
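Under the default OrderedReady policy, each Pod's readiness gates the creation of the next one, so a readiness probe effectively sets the pace of startup. The fragment below is a sketch for the nginx example. It uses a TCP check as an assumption, because an HTTP check on / would fail while the mounted html volume is still empty.

```yaml
# Fragment of the StatefulSet Pod template (spec.template.spec).
containers:
  - name: nginx
    image: nginx:1.25
    readinessProbe:
      tcpSocket:
        port: 80            # Ready once nginx accepts connections
      initialDelaySeconds: 5   # illustrative timing values
      periodSeconds: 5
```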

StatefulSet Update Strategies

StatefulSets support two update strategies:

RollingUpdate (default). Pods are updated in reverse ordinal order (2, 1, 0). Each Pod is terminated and recreated before moving to the next. The partition parameter can restrict the update to a subset of Pods — only Pods with an ordinal greater than or equal to the partition value are updated. This enables canary deployments:

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2   # Only update Pod web-2

OnDelete. Pods are not automatically updated. You manually delete Pods, and the StatefulSet controller recreates them with the new spec. This gives full control over the update order:

spec:
  updateStrategy:
    type: OnDelete
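To finish a canary started with partition: 2, one possible next step is to lower the partition so the remaining Pods qualify for the update. This sketch assumes web-2 has already been verified on the new spec.

```yaml
# Second step of a staged rollout (assumes web-2 already runs the new spec).
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0   # all ordinals >= 0 qualify, so web-1 and then web-0 are updated
```

Intermediate values work the same way: partition: 1 would update web-1 and leave web-0 on the old spec.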

Scaling a StatefulSet

Scale up:

kubectl scale statefulset web --replicas=5

New Pods are created in order: web-3, then web-4. Each waits for the previous to be Ready (unless podManagementPolicy: Parallel).

Scale down:

kubectl scale statefulset web --replicas=2

Pods are removed in reverse order: web-4, then web-3, then web-2. Their PVCs remain — data is preserved even after scale-down.

DaemonSets

One Pod Per Node

A DaemonSet ensures that exactly one copy of a Pod runs on every node (or a selected subset of nodes) in the cluster. When a new node joins the cluster, the DaemonSet controller schedules a Pod on it. When a node is removed, the Pod is garbage collected.

This differs fundamentally from Deployments and StatefulSets, which manage a fixed replica count distributed across available nodes by the scheduler. A DaemonSet’s replica count is determined by the number of matching nodes, not by a replicas field.

Use Cases

DaemonSets are the standard pattern for cluster-wide infrastructure agents:

Agent Type       Examples
Log collection   Fluentd, Fluent Bit, Filebeat
Monitoring       Prometheus Node Exporter, Datadog agent
Networking       Calico, Cilium, kube-proxy
Storage          CSI node drivers, local-volume-provisioner
Security         Falco, Twistlock defenders

These agents need to run on every node because they collect node-level data (logs, metrics, network packets) or provide node-level services (network routing, volume mounting).

Complete DaemonSet YAML

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  labels:
    app: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: fluentd
          image: fluentd:v1.16
          resources:
            limits:
              cpu: 200m
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: containers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: containers
          hostPath:
            path: /var/lib/docker/containers

Key observations:

  • No replicas field. The number of Pods is determined by the number of matching nodes.
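  • hostPath volumes. DaemonSets commonly mount host directories because they need access to node-level data. This is a legitimate use of hostPath: unlike in Deployments, each DaemonSet Pod runs on a unique node, so there’s no contention. The /var/lib/docker/containers path assumes nodes that run Docker; on containerd or CRI-O nodes, container logs live under /var/log/pods, which the /var/log mount already covers.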
  • Resource limits. Essential for DaemonSet Pods to prevent a logging agent from consuming all CPU or memory on a node.

Node Selection with nodeSelector

By default, a DaemonSet runs on every node, including control plane nodes (if they have no taints preventing it). To restrict a DaemonSet to specific nodes, use nodeSelector:

spec:
  template:
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""

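This runs the DaemonSet Pod only on nodes that carry the label node-role.kubernetes.io/worker. Many clusters do not set this label automatically (kubeadm, for example, labels only control plane nodes), so you may need to add it with kubectl label node. Use nodeSelector to exclude control plane nodes or to target nodes with specific hardware (GPU nodes, SSD nodes).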
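The same mechanism covers the hardware case. The fragment below is a sketch that targets GPU nodes. The label hardware=gpu is hypothetical and would have to be applied to the nodes first, for example with kubectl label node <node-name> hardware=gpu.

```yaml
# Fragment of a DaemonSet spec. The hardware=gpu label is an assumption
# made for illustration; Kubernetes does not set it for you.
spec:
  template:
    spec:
      nodeSelector:
        hardware: gpu
```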

Verify which nodes are running DaemonSet Pods (output trimmed to the relevant columns):

kubectl get pods -l app=log-collector -o wide
NAME                  READY   STATUS    NODE
log-collector-abc12   1/1     Running   worker-1
log-collector-def34   1/1     Running   worker-2
log-collector-ghi56   1/1     Running   worker-3

Tolerations for Control Plane Nodes

Control plane nodes typically have a taint that prevents regular Pods from scheduling on them:

node-role.kubernetes.io/control-plane:NoSchedule

If you need a DaemonSet Pod on control plane nodes (e.g., for monitoring), add a toleration:

spec:
  template:
    spec:
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
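Some node agents must run on every node regardless of taints. A broader variant, sketched here under that assumption, omits the key so that the toleration matches every taint. It also lets the Pod land on nodes tainted for maintenance or dedicated workloads, so treat it as an illustration rather than a recommended default.

```yaml
# Fragment of a DaemonSet spec.
spec:
  template:
    spec:
      tolerations:
        # No key and no effect: with operator Exists this matches all taints.
        - operator: Exists
```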

DaemonSet Update Strategies

DaemonSets support two update strategies:

RollingUpdate (default). Old Pods are terminated and replaced node by node. The maxUnavailable parameter controls how many Pods can be down simultaneously; it defaults to 1, which means one node at a time:

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1

OnDelete. Pods are not automatically updated. You manually delete Pods on specific nodes, and the DaemonSet controller recreates them with the new spec.
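For agents that should not leave a gap during an update, the RollingUpdate strategy also accepts maxSurge, which starts the replacement Pod on a node before the old one is removed. The fragment below is a sketch. It assumes a Kubernetes version that supports maxSurge for DaemonSets and a Pod that does not bind a hostPort, since two copies on one node would conflict. The minReadySeconds value is illustrative.

```yaml
# Fragment of a DaemonSet spec (assumes maxSurge support in your cluster version).
spec:
  minReadySeconds: 10      # illustrative: new Pod must stay Ready this long
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # start one extra Pod per node during the update
      maxUnavailable: 0    # must be 0 when maxSurge is non-zero
```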

Checking DaemonSet Status

kubectl get daemonset log-collector
NAME            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
log-collector   3         3         3       3            3           <none>          5d

Key columns:

  • DESIRED: number of nodes that should run the Pod
  • CURRENT: number of Pods created
  • READY: number of Pods in Ready state
  • UP-TO-DATE: number of Pods running the latest spec
  • AVAILABLE: number of Pods that have been Ready for at least minReadySeconds

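If DESIRED is lower than you expect, check the nodeSelector and the node taints: nodes that the DaemonSet does not select or tolerate are not counted. If DESIRED ≠ CURRENT, the controller has not yet created a Pod on every eligible node. If CURRENT ≠ READY, Pods may be failing their health checks.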
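The columns above are derived from the DaemonSet's status fields, which you can inspect with kubectl get daemonset log-collector -o yaml. The excerpt below is a sketch of what that status section might look like for the three-node example; the values simply mirror the sample output.

```yaml
# Illustrative status excerpt for the log-collector example.
status:
  desiredNumberScheduled: 3   # DESIRED
  currentNumberScheduled: 3   # CURRENT
  numberReady: 3              # READY
  updatedNumberScheduled: 3   # UP-TO-DATE
  numberAvailable: 3          # AVAILABLE
  numberMisscheduled: 0       # Pods running on nodes that should not have one
```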

StatefulSet vs Deployment vs DaemonSet

Characteristic   Deployment               StatefulSet                          DaemonSet
Pod names        Random hash              Ordinal (web-0, web-1)               Random hash per node
Pod identity     Interchangeable          Stable, unique                       One per node
Scaling          replicas field           replicas field                       Number of nodes
Storage          Shared PVC or none       Per-Pod PVC (volumeClaimTemplates)   Typically hostPath
Startup order    Arbitrary                Sequential (by default)              Arbitrary (per node)
Shutdown order   Arbitrary                Reverse sequential                   Arbitrary
DNS              Service load-balancing   Per-Pod DNS via headless Service     N/A (node-level)
Use case         Stateless apps           Databases, caches, consensus         Node agents, log/metrics

Decision rule: If Pods are interchangeable, use a Deployment. If Pods need stable identity and storage, use a StatefulSet. If you need one Pod per node, use a DaemonSet.

Exercises

Exercise 1: Helm Chart Installation with Custom Values

Requirements:

  1. Add the Bitnami Helm repository
  2. Install the bitnami/nginx chart as a release named my-web in namespace helm-exercise with:
    • 2 replicas
    • Service type ClusterIP
  3. Verify the release is deployed and the Pods are running
  4. Upgrade the release to 3 replicas
  5. Roll back to the original 2-replica configuration
  6. Verify the rollback was successful

Exercise 2: StatefulSet with Headless Service

Requirements:

  1. Create a namespace stateful-exercise
  2. Create a headless Service named web-headless in stateful-exercise with:
    • No cluster IP (clusterIP: None)
    • Selector: app: web
    • Port 80
  3. Create a StatefulSet named web in stateful-exercise with:
    • 3 replicas
    • serviceName: web-headless
    • Container: nginx:1.25
    • A volumeClaimTemplate named html requesting 100Mi storage
    • Mount the volume at /usr/share/nginx/html
  4. Verify Pods are named web-0, web-1, web-2
  5. Verify each Pod has its own PVC: html-web-0, html-web-1, html-web-2
  6. Write a unique file to each Pod’s volume and verify it persists after Pod deletion:
    kubectl exec web-0 -n stateful-exercise -- sh -c 'echo "pod-0-data" > /usr/share/nginx/html/index.html'
  7. Delete web-0 and verify the replacement Pod retains the data
  8. Verify stable DNS resolution:
    kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -n stateful-exercise -- nslookup web-0.web-headless

Exercise 3: DaemonSet with Node Selection

Requirements:

  1. Create a namespace daemon-exercise
  2. Label one of your nodes with disk=ssd:
    kubectl label node <node-name> disk=ssd
  3. Create a DaemonSet named node-monitor in daemon-exercise with:
    • Container: busybox:1.36
    • Command: ["sh", "-c", "while true; do echo $(hostname) $(date); sleep 60; done"]
    • nodeSelector: { disk: ssd }
    • Resource requests: 50m CPU, 64Mi memory
  4. Verify the DaemonSet only runs on the labeled node(s)
  5. Label a second node with disk=ssd and verify a new Pod appears automatically
  6. Remove the label from one node and verify the Pod is removed:
    kubectl label node <node-name> disk-

Solutions are provided in the next chapter.