CronJob Scheduling and Failure Handling

CronJobs: A Factory for Jobs

A CronJob does not run containers. It creates Jobs on a schedule. Every time the cron timer fires, the CronJob controller spawns a new Job object, which then creates its own Pods to execute the workload. The relationship is hierarchical:

CronJob → Job → Pod(s)

This layered design means everything you learned about Jobs in the previous section — completions, parallelism, backoff limits, TTL cleanup — applies directly to the Jobs that a CronJob creates. The CronJob adds a scheduling layer on top, controlling when and how often those Jobs are spawned.

Creating a CronJob Imperatively

The imperative command mirrors the Job creation syntax, with a --schedule flag:

kubectl create cronjob daily-report --image=reporter:1.0 --schedule="0 2 * * *"

This creates a CronJob named daily-report that fires every day at 2:00 AM. The schedule follows standard cron syntax.

To add a container command:

kubectl create cronjob log-cleanup --image=busybox --schedule="0 */6 * * *" -- sh -c "echo Cleaning logs && sleep 10"

Generate YAML for further editing:

kubectl create cronjob daily-report --image=reporter:1.0 --schedule="0 2 * * *" --dry-run=client -o yaml

Cron Schedule Syntax

The schedule field uses the standard five-field cron format:

┌───────────── minute (0-59)
│ ┌───────────── hour (0-23)
│ │ ┌───────────── day of month (1-31)
│ │ │ ┌───────────── month (1-12)
│ │ │ │ ┌───────────── day of week (0-6, Sunday=0)
│ │ │ │ │
* * * * *

Each field accepts:

A specific value: 5 (the fifth minute)
A wildcard: * (every value)
A range: 1-5 (in the day-of-week field, Monday through Friday)
A step: */10 (every tenth value)
A list: 1,15 (first and fifteenth)
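
To make these field rules concrete, here is a minimal Python sketch that expands a single field into the values it matches. The function name expand_field and its low/high bounds are illustrative and not part of any Kubernetes API. The sketch covers only the five forms listed above and ignores extensions such as month or weekday names.

```python
def expand_field(expr: str, low: int, high: int) -> list[int]:
    """Expand one cron field into the sorted list of values it matches."""
    values = set()
    for part in expr.split(","):                 # list: "1,15"
        base, _, step_text = part.partition("/")  # step: "*/10"
        step = int(step_text) if step_text else 1
        if base == "*":                           # wildcard: the whole range
            start, end = low, high
        elif "-" in base:                         # range: "1-5"
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:                                     # specific value: "5"
            start = end = int(base)
        if step < 1 or start < low or end > high or start > end:
            raise ValueError(f"invalid cron field: {expr!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("*/10", 0, 59))  # minute field -> [0, 10, 20, 30, 40, 50]
print(expand_field("1-5", 0, 6))    # day-of-week field -> [1, 2, 3, 4, 5]
print(expand_field("1,15", 1, 31))  # day-of-month field -> [1, 15]
```

The same expression means different things depending on which field it sits in, which is why the bounds are passed in explicitly.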

Common Patterns

Schedule	Meaning
`*/5 * * * *`	Every 5 minutes
`0 * * * *`	Every hour, on the hour
`0 */6 * * *`	Every 6 hours (midnight, 6am, noon, 6pm)
`0 2 * * *`	Daily at 2:00 AM
`0 0 * * 1`	Every Monday at midnight
`0 0 1 * *`	First day of every month at midnight
`30 8 * * 1-5`	Weekdays at 8:30 AM

A common exam trap: */5 * * * * fires every 5 minutes, not every 5 hours. The position of the field determines the unit. The first field is minutes, so */5 in the first position means “every five minutes.”

Another trap: forgetting that hours use 24-hour format. 0 14 * * * fires at 2:00 PM, not AM. There is no PM notation in cron.

Concurrency Policy

When a CronJob fires on schedule, the previous Job might still be running. The .spec.concurrencyPolicy field controls what happens:

Allow (Default)

The CronJob creates a new Job regardless of whether previous Jobs are still running. Multiple Jobs can execute concurrently. This is fine for idempotent tasks where parallel runs don’t interfere with each other.

spec:
  concurrencyPolicy: Allow

Forbid

If the previous Job is still running when the next schedule fires, the CronJob skips the new run entirely. No Job is created. This prevents overlapping executions for tasks that must run exclusively — like a database cleanup that takes a lock.

spec:
  concurrencyPolicy: Forbid

Replace

The CronJob terminates the currently running Job and starts a new one. This is aggressive — the old Job’s Pods are killed. Use this when the latest execution should always supersede a stale one, such as a cache refresh where old data is worthless if new data is available.

spec:
  concurrencyPolicy: Replace

Choosing the right policy depends on the workload. For most tasks on the CKAD, Forbid is the safest choice when the question mentions avoiding overlap.
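
The three policies can be summarized as a small decision function. This is a simplified Python model for study purposes, not the controller's actual code. The function name and the action strings are made up for illustration.

```python
def on_schedule_fire(policy: str, previous_job_running: bool) -> list[str]:
    """Return the (hypothetical) actions taken when the schedule fires."""
    if policy not in ("Allow", "Forbid", "Replace"):
        raise ValueError(f"unknown concurrencyPolicy: {policy!r}")
    if not previous_job_running or policy == "Allow":
        return ["create new Job"]            # nothing to conflict with, or overlap is allowed
    if policy == "Forbid":
        return []                            # this run is skipped
    return ["delete running Job", "create new Job"]  # Replace


for policy in ("Allow", "Forbid", "Replace"):
    print(policy, "->", on_schedule_fire(policy, previous_job_running=True))
```

When no previous Job is running, all three policies behave the same way: a new Job is created.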

Starting Deadline and Missed Schedules

The .spec.startingDeadlineSeconds field defines how long after a missed schedule the CronJob controller will still attempt to create the Job. If the controller was down, overloaded, or the schedule was missed for any reason, this field determines whether a late run is acceptable.

spec:
  startingDeadlineSeconds: 200

With this setting, if the CronJob was supposed to fire at 2:00 AM but the controller didn’t process it until 2:03 AM (180 seconds late), the Job is still created because 180 < 200. If the delay exceeds 200 seconds, the run is skipped.
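
The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of the comparison described above. The function name should_start and the concrete date are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def should_start(scheduled: datetime, now: datetime, starting_deadline_seconds: int) -> bool:
    """True if a late run is still within the starting deadline."""
    lateness = (now - scheduled).total_seconds()
    return 0 <= lateness <= starting_deadline_seconds


scheduled = datetime(2024, 1, 1, 2, 0, 0)                  # supposed to fire at 2:00 AM
print(should_start(scheduled, scheduled + timedelta(minutes=3), 200))  # 180 s late -> True
print(should_start(scheduled, scheduled + timedelta(minutes=4), 200))  # 240 s late -> False
```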

When startingDeadlineSeconds is not set, there is no deadline, so a missed run can still be started late. To find missed runs, the controller counts the schedules missed since the last scheduled time. If more than 100 were missed (and no deadline is set), the controller does not start the Job and logs an error. This is a safety mechanism to prevent runaway Job creation.

History Limits

Every fired Job persists in the cluster after completion (unless TTL cleanup is configured on the Job template). Over time, this builds up. CronJob history limits control how many finished Jobs are retained:

spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1

successfulJobsHistoryLimit defaults to 3. The three most recent successful Jobs are kept; older ones are garbage collected.
failedJobsHistoryLimit defaults to 1. Only the most recent failed Job is kept.

Setting a limit to 0 means no Jobs of that kind (successful or failed) are retained at all. This saves cluster resources but removes the ability to inspect past executions:

# See retained Jobs spawned by the CronJob (their names start with the CronJob name)
kubectl get jobs | grep '^daily-report-'

For debugging, keep at least one failed Job to preserve its logs and events.
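
The retention rule can be modeled in a few lines of Python. This is a simplified sketch, assuming each finished Job is described by a name, a timestamp, and a success flag. The dictionary keys and the function name jobs_to_keep are hypothetical, and the real controller may order Jobs by a different timestamp.

```python
def jobs_to_keep(finished, successful_limit=3, failed_limit=1):
    """Return the names of finished Jobs that survive history pruning."""
    keep = []
    for succeeded, limit in ((True, successful_limit), (False, failed_limit)):
        group = [job for job in finished if job["succeeded"] is succeeded]
        group.sort(key=lambda job: job["finished_at"], reverse=True)  # newest first
        keep.extend(job["name"] for job in group[:limit])             # limit 0 keeps none
    return keep


# Six finished runs; runs 2 and 5 failed. finished_at is just an ordering number here.
history = [
    {"name": f"run-{i}", "finished_at": i, "succeeded": i not in (2, 5)}
    for i in range(1, 7)
]
print(jobs_to_keep(history))  # -> ['run-6', 'run-4', 'run-3', 'run-5']
```

With the default limits, the oldest successful run (run-1) and the older failed run (run-2) are the ones removed.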

Suspending a CronJob

Setting .spec.suspend to true pauses the CronJob. No new Jobs are created when the schedule fires, but existing running Jobs are not affected — they continue until completion.

spec:
  suspend: true

Suspend and resume imperatively:

# Pause
kubectl patch cronjob daily-report -p '{"spec":{"suspend":true}}'

# Resume
kubectl patch cronjob daily-report -p '{"spec":{"suspend":false}}'

This is useful during maintenance windows or when a downstream system is unavailable and running the Job would fail anyway.

Complete CronJob YAML Example

Here’s a production-grade CronJob combining the fields discussed:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-report
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 300
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 2
  suspend: false
  jobTemplate:
    spec:
      backoffLimit: 3
      activeDeadlineSeconds: 1800
      ttlSecondsAfterFinished: 86400
      template:
        spec:
          containers:
            - name: reporter
              image: reporter:1.0
              command:
                - sh
                - -c
                - |
                  echo "Starting daily report generation"
                  generate-report --output /tmp/report.csv
                  echo "Report complete"
              resources:
                requests:
                  cpu: "100m"
                  memory: "128Mi"
                limits:
                  cpu: "500m"
                  memory: "512Mi"
          restartPolicy: OnFailure

Notice the nesting depth: CronJob.spec.jobTemplate.spec.template.spec.containers. This three-level nesting (CronJob → Job → Pod) is the most common source of YAML indentation errors on the exam. The structure reads as:

CronJob spec — schedule, concurrency, history limits
Job spec (inside jobTemplate) — backoff, deadline, TTL
Pod spec (inside template) — containers, restart policy, resources

When building a CronJob from scratch on the exam, generate the YAML imperatively and then add fields at the correct nesting level.
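
One way to internalize the three levels is to write the same structure as a nested Python dictionary and look fields up by their full path. This is only an illustration: get_path is a hypothetical helper, and the manifest is trimmed to a few fields from the example above.

```python
cronjob = {
    "apiVersion": "batch/v1",
    "kind": "CronJob",
    "metadata": {"name": "daily-report"},
    "spec": {                                    # CronJob spec
        "schedule": "0 2 * * *",
        "concurrencyPolicy": "Forbid",
        "jobTemplate": {
            "spec": {                            # Job spec
                "backoffLimit": 3,
                "template": {
                    "spec": {                    # Pod spec
                        "restartPolicy": "OnFailure",
                        "containers": [
                            {"name": "reporter", "image": "reporter:1.0"}
                        ],
                    }
                },
            }
        },
    },
}


def get_path(obj, dotted_path):
    """Walk a dotted path such as 'spec.jobTemplate.spec.backoffLimit'."""
    for key in dotted_path.split("."):
        obj = obj[key]          # raises KeyError if a field sits at the wrong level
    return obj


print(get_path(cronjob, "spec.concurrencyPolicy"))                             # Forbid
print(get_path(cronjob, "spec.jobTemplate.spec.backoffLimit"))                 # 3
print(get_path(cronjob, "spec.jobTemplate.spec.template.spec.restartPolicy"))  # OnFailure
```

Looking up "spec.backoffLimit" fails with a KeyError, which mirrors the typical mistake of placing a Job-level field directly under the CronJob spec.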

Deleting a CronJob: Garbage Collection

When you delete a CronJob, Kubernetes garbage collects all Jobs and Pods it created. This is owner-reference based — each Job’s metadata.ownerReferences points to the parent CronJob. When the owner is deleted, the dependent objects are removed.

kubectl delete cronjob daily-report

This deletes:

The CronJob object
All Jobs created by it (running or completed)
All Pods belonging to those Jobs

If you need to delete the CronJob but keep the existing Jobs running, use the --cascade=orphan flag:

kubectl delete cronjob daily-report --cascade=orphan

Orphaned Jobs continue running but won’t be cleaned up automatically — you’ll need to manage them manually.
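
Because ownership is recorded in metadata.ownerReferences, you can work out which Jobs belong to a CronJob from the JSON that kubectl get jobs -o json returns. The Python sketch below assumes that output has already been loaded into a dictionary. The sample data is invented, and jobs_owned_by is a hypothetical helper name.

```python
def jobs_owned_by(job_list: dict, cronjob_name: str) -> list[str]:
    """Return names of Jobs whose ownerReferences point at the given CronJob."""
    owned = []
    for job in job_list.get("items", []):
        refs = job["metadata"].get("ownerReferences", [])
        if any(r.get("kind") == "CronJob" and r.get("name") == cronjob_name for r in refs):
            owned.append(job["metadata"]["name"])
    return owned


# Invented sample in the shape of `kubectl get jobs -o json` (heavily trimmed).
sample = {
    "items": [
        {"metadata": {"name": "daily-report-28489320",
                      "ownerReferences": [{"kind": "CronJob", "name": "daily-report"}]}},
        {"metadata": {"name": "one-off-migration"}},   # created by hand, no owner
        {"metadata": {"name": "log-cleanup-28489200",
                      "ownerReferences": [{"kind": "CronJob", "name": "log-cleanup"}]}},
    ]
}
print(jobs_owned_by(sample, "daily-report"))  # -> ['daily-report-28489320']
```

After a delete with --cascade=orphan, the surviving Jobs no longer carry the owner reference, so a check like this would stop reporting them.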

Debugging CronJobs

When a CronJob doesn’t fire as expected, follow this diagnostic sequence:

1. Check the CronJob Status

kubectl get cronjob daily-report

NAME           SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
daily-report   0 2 * * *   False     0        25h             7d

Key fields:

SUSPEND: If True, the CronJob is paused.
ACTIVE: Number of currently running Jobs. If this stays at 1 and concurrencyPolicy: Forbid is set, new runs are being skipped.
LAST SCHEDULE: When the last Job was created. If this is stale, the schedule might be wrong or the controller is having issues.

2. Check Events

kubectl describe cronjob daily-report

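Look at the Events section. Common event reasons include the following (exact strings can vary between Kubernetes versions):

SuccessfulCreate: a Job was spawned normally
MissSchedule: a scheduled start time was missed, for example because the starting deadline was exceeded
JobAlreadyActive: a run was skipped because the previous Job is still active and concurrencyPolicy is Forbid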

3. Inspect the Latest Job

# Find Jobs created by this CronJob (their names start with the CronJob name)
kubectl get jobs --sort-by=.metadata.creationTimestamp | grep '^daily-report-'

# Describe the most recent Job
kubectl describe job daily-report-28489320

Check the Job's conditions: a Failed condition with reason BackoffLimitExceeded or DeadlineExceeded means the Job itself failed. In that case the CronJob is working, and the problem is in the workload.
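
The numeric suffix in a generated Job name can also help here. As far as I know, current versions of the CronJob controller build it from the scheduled time expressed as minutes since the Unix epoch, but treat that as an assumption about controller behavior rather than a documented guarantee. Under that assumption, a short Python sketch can convert between the two. Both helper names are hypothetical.

```python
from datetime import datetime, timezone


def job_name_for(cronjob_name: str, scheduled: datetime) -> str:
    """Build a Job name from the scheduled time (assumed naming scheme)."""
    minutes_since_epoch = int(scheduled.timestamp()) // 60
    return f"{cronjob_name}-{minutes_since_epoch}"


def scheduled_time_of(job_name: str) -> datetime:
    """Recover the scheduled time (UTC) from a Job name's numeric suffix."""
    suffix = job_name.rsplit("-", 1)[1]
    return datetime.fromtimestamp(int(suffix) * 60, tz=timezone.utc)


name = job_name_for("daily-report", datetime(2024, 3, 2, 2, 0, tzinfo=timezone.utc))
print(name)                                   # daily-report-28489080
print(scheduled_time_of(name).isoformat())    # 2024-03-02T02:00:00+00:00
```

Comparing the decoded time with the Job's creation timestamp shows how late a run actually started.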

4. Check Pod Logs

# Find Pods for a specific Job
kubectl get pods -l job-name=daily-report-28489320

# Check logs
kubectl logs daily-report-28489320-xyz12

Container-level errors (image pull failures, command not found, non-zero exit codes) surface in Pod logs and events.

5. Verify the Schedule

A subtlety that catches people: unless .spec.timeZone is set, the schedule is interpreted in the time zone of the kube-controller-manager, which is typically UTC. If you expect a Job to run at 2:00 AM local time but the cluster is set to UTC, adjust accordingly. The .spec.timeZone field (stable since Kubernetes 1.27) sets an explicit time zone:

spec:
  schedule: "0 2 * * *"
  timeZone: "America/New_York"

On the CKAD exam, the cluster typically uses UTC. If the exam question specifies a time, assume UTC unless stated otherwise.

Key Takeaways

A CronJob creates Jobs on a schedule; it does not run Pods directly.
The cron format has five fields: minute, hour, day-of-month, month, day-of-week.
concurrencyPolicy controls overlap: Allow, Forbid, or Replace.
startingDeadlineSeconds defines how late a missed schedule can still trigger.
History limits (successfulJobsHistoryLimit, failedJobsHistoryLimit) control how many past Jobs are retained.
suspend: true pauses future runs without killing active Jobs.
Deleting a CronJob cascades to its Jobs and Pods by default.
Debug with: kubectl get cronjob → describe → inspect Jobs → check Pod logs.