CronJob Scheduling and Failure Handling
Covers CronJob mechanics: cron schedule syntax with common patterns, imperative creation, concurrencyPolicy (Allow, Forbid, Replace), startingDeadlineSeconds, history limits, suspension, complete YAML examples, garbage collection behavior, and debugging techniques for missed schedules.
CronJobs: A Factory for Jobs
A CronJob does not run containers. It creates Jobs on a schedule. Every time the cron timer fires, the CronJob controller spawns a new Job object, which then creates its own Pods to execute the workload. The relationship is hierarchical:
CronJob → Job → Pod(s)
This layered design means everything you learned about Jobs in the previous section — completions, parallelism, backoff limits, TTL cleanup — applies directly to the Jobs that a CronJob creates. The CronJob adds a scheduling layer on top, controlling when and how often those Jobs are spawned.
Creating a CronJob Imperatively
The imperative command mirrors the Job creation syntax, with a --schedule flag:
kubectl create cronjob daily-report --image=reporter:1.0 --schedule="0 2 * * *"
This creates a CronJob named daily-report that fires every day at 2:00 AM. The schedule follows standard cron syntax.
To add a container command:
kubectl create cronjob log-cleanup --image=busybox --schedule="0 */6 * * *" -- sh -c "echo Cleaning logs && sleep 10"
Generate YAML for further editing:
kubectl create cronjob daily-report --image=reporter:1.0 --schedule="0 2 * * *" --dry-run=client -o yaml
Cron Schedule Syntax
The schedule field uses the standard five-field cron format:
┌───────────── minute (0-59)
│ ┌───────────── hour (0-23)
│ │ ┌───────────── day of month (1-31)
│ │ │ ┌───────────── month (1-12)
│ │ │ │ ┌───────────── day of week (0-6, Sunday=0)
│ │ │ │ │
* * * * *
Each field accepts:
- A specific value: 5 (minute 5 in the minute field)
- A wildcard: * (every value)
- A range: 1-5 (Monday through Friday in the day-of-week field)
- A step: */10 (every tenth value)
- A list: 1,15 (the first and fifteenth)
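To make the five forms listed above concrete, here is a minimal POSIX shell sketch that expands a single cron field into the values it matches. The function name expand_field and its MIN/MAX arguments are hypothetical helpers for illustration; this is not how the CronJob controller parses schedules, and names such as MON or macros such as @daily are not handled.

```shell
# expand_field FIELD MIN MAX
# Print the values one cron field matches, space-separated.
# Illustrative only: covers value, wildcard, range, step, and list.
expand_field() {
  field=$1 min=$2 max=$3 out=""
  old_ifs=$IFS
  set -f                 # keep * from being expanded as a filename glob
  IFS=,
  set -- $field          # split a list such as 1,15 into its items
  IFS=$old_ifs
  set +f
  for item in "$@"; do
    step=1 stepped=no
    case $item in
      */*) step=${item#*/}; item=${item%/*}; stepped=yes ;;
    esac
    case $item in
      '*') lo=$min; hi=$max ;;
      *-*) lo=${item%-*}; hi=${item#*-} ;;
      *)   lo=$item; hi=$item
           # a form such as 5/10 runs from 5 up to the field maximum
           if [ "$stepped" = yes ]; then hi=$max; fi ;;
    esac
    v=$lo
    while [ "$v" -le "$hi" ]; do
      out="$out $v"
      v=$((v + step))
    done
  done
  echo "${out# }"
}

expand_field '*/10' 0 59   # prints: 0 10 20 30 40 50
expand_field '1-5' 0 6     # prints: 1 2 3 4 5
expand_field '1,15' 1 31   # prints: 1 15
```

The MIN and MAX arguments come from the field's position (0 to 59 for minutes, 1 to 31 for day of month, and so on), which is why the same expression means different things in different fields.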
Common Patterns
| Schedule | Meaning |
|---|---|
| */5 * * * * | Every 5 minutes |
| 0 * * * * | Every hour, on the hour |
| 0 */6 * * * | Every 6 hours (midnight, 6am, noon, 6pm) |
| 0 2 * * * | Daily at 2:00 AM |
| 0 0 * * 1 | Every Monday at midnight |
| 0 0 1 * * | First day of every month at midnight |
| 30 8 * * 1-5 | Weekdays at 8:30 AM |
A common exam trap: */5 * * * * fires every 5 minutes, not every 5 hours. The position of the field determines the unit. The first field is minutes, so */5 in the first position means “every five minutes.”
Another trap: forgetting that hours use 24-hour format. 0 14 * * * fires at 2:00 PM, not AM. There is no PM notation in cron.
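One way to avoid both traps is to label each field by its position before reading the values. The sketch below does that in POSIX shell; label_schedule is a hypothetical helper name, and it only checks the field count, not whether the values are valid.

```shell
# label_schedule "SCHEDULE"
# Print each of the five cron fields next to the unit its position implies.
label_schedule() {
  set -f                 # keep * from being expanded as a filename glob
  set -- $1              # split the schedule on whitespace
  set +f
  if [ "$#" -ne 5 ]; then
    echo "expected 5 fields, got $#" >&2
    return 1
  fi
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' "$@"
}

label_schedule '*/5 * * * *'
# prints: minute=*/5 hour=* day-of-month=* month=* day-of-week=*
label_schedule '0 14 * * *'
# prints: minute=0 hour=14 day-of-month=* month=* day-of-week=*
```

Reading the output, */5 sits in the minute position, and hour=14 is 2:00 PM on a 24-hour clock.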
Concurrency Policy
When a CronJob fires on schedule, the previous Job might still be running. The .spec.concurrencyPolicy field controls what happens:
Allow (Default)
The CronJob creates a new Job regardless of whether previous Jobs are still running. Multiple Jobs can execute concurrently. This is fine for idempotent tasks where parallel runs don’t interfere with each other.
spec:
  concurrencyPolicy: Allow
Forbid
If the previous Job is still running when the next schedule fires, the CronJob skips the new run entirely. No Job is created. This prevents overlapping executions for tasks that must run exclusively — like a database cleanup that takes a lock.
spec:
  concurrencyPolicy: Forbid
Replace
The CronJob terminates the currently running Job and starts a new one. This is aggressive — the old Job’s Pods are killed. Use this when the latest execution should always supersede a stale one, such as a cache refresh where old data is worthless if new data is available.
spec:
  concurrencyPolicy: Replace
Choosing the right policy depends on the workload. For most tasks on the CKAD, Forbid is the safest choice when the question mentions avoiding overlap.
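The three policies can be summarized as a decision made at each scheduled tick. The following POSIX shell sketch models that decision as described in this section; the function name decide and its output strings are hypothetical, and it is a simplification for study purposes rather than the controller's actual logic.

```shell
# decide POLICY ACTIVE_JOBS
# Print what happens when the schedule fires, given how many Jobs
# from earlier runs are still active.
decide() {
  policy=$1 active=$2
  if [ "$active" -eq 0 ]; then
    echo "create new Job"      # nothing is running, so every policy agrees
    return
  fi
  case $policy in
    Allow)   echo "create new Job alongside $active running" ;;
    Forbid)  echo "skip this run" ;;
    Replace) echo "delete running Job, then create new Job" ;;
    *)       echo "unknown policy: $policy" >&2; return 1 ;;
  esac
}

decide Allow 1     # prints: create new Job alongside 1 running
decide Forbid 1    # prints: skip this run
decide Replace 1   # prints: delete running Job, then create new Job
decide Forbid 0    # prints: create new Job
```

The policies only differ when a previous Job is still active; with nothing running, all three behave the same.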
Starting Deadline and Missed Schedules
The .spec.startingDeadlineSeconds field defines how long after a missed schedule the CronJob controller will still attempt to create the Job. If the controller was down, overloaded, or the schedule was missed for any reason, this field determines whether a late run is acceptable.
spec:
  startingDeadlineSeconds: 200
With this setting, if the CronJob was supposed to fire at 2:00 AM but the controller didn’t process it until 2:03 AM (180 seconds late), the Job is still created because 180 < 200. If the delay exceeds 200 seconds, the run is skipped.
When startingDeadlineSeconds is not set, there is no deadline, so a missed run can still start late. As a safety mechanism, the controller counts the schedules missed since the last scheduled time; according to the Kubernetes documentation, if more than 100 were missed, it does not start the Job and logs an error. When startingDeadlineSeconds is set, only the schedules missed within that window are counted.
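The deadline check in the 2:00 AM example is plain arithmetic, sketched below in POSIX shell. The helper name late_run_allowed is hypothetical, and the comparison follows the rule stated above (a run is skipped only when the delay exceeds the deadline); treat it as an illustration, not the controller's implementation.

```shell
# late_run_allowed DELAY_SECONDS DEADLINE_SECONDS
# Pass an empty DEADLINE_SECONDS to model an unset startingDeadlineSeconds.
late_run_allowed() {
  delay=$1 deadline=$2
  if [ -z "$deadline" ]; then
    echo "create (no deadline set)"
    return
  fi
  if [ "$delay" -gt "$deadline" ]; then
    echo "skip (${delay}s > ${deadline}s)"
  else
    echo "create (${delay}s <= ${deadline}s)"
  fi
}

late_run_allowed 180 200    # prints: create (180s <= 200s)
late_run_allowed 240 200    # prints: skip (240s > 200s)
late_run_allowed 3600 ""    # prints: create (no deadline set)
```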
History Limits
Every fired Job persists in the cluster after completion (unless TTL cleanup is configured on the Job template). Over time, this builds up. CronJob history limits control how many finished Jobs are retained:
spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
- successfulJobsHistoryLimit defaults to 3. The three most recent successful Jobs are kept; older ones are garbage collected.
- failedJobsHistoryLimit defaults to 1. Only the most recent failed Job is kept.
Setting either to 0 means no completed/failed Jobs are retained at all. This saves cluster resources but removes the ability to inspect past executions:
# See retained Jobs, oldest first; Jobs spawned by a CronJob are named <cronjob-name>-<number>
kubectl get jobs --sort-by=.metadata.creationTimestamp
For debugging, keep at least one failed Job to preserve its logs and events.
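The effect of a history limit can be sketched as "keep the newest N, remove the rest". The POSIX shell function below models that rule for one category of finished Jobs; jobs_to_delete and the example Job names are hypothetical, and the real controller applies the limit to successful and failed Jobs separately.

```shell
# jobs_to_delete LIMIT JOB...
# JOB names must be ordered oldest first. Prints the Jobs that fall
# outside the limit, one per line.
jobs_to_delete() {
  limit=$1
  shift
  excess=$(( $# - limit ))
  while [ "$excess" -gt 0 ]; do
    echo "$1"              # oldest remaining Job is removed first
    shift
    excess=$((excess - 1))
  done
}

# Five successful Jobs with a limit of 3: the two oldest are removed.
jobs_to_delete 3 report-1 report-2 report-3 report-4 report-5
# prints:
# report-1
# report-2
```

With a limit of 0, every finished Job in that category is listed for removal, which matches the note above about losing the ability to inspect past executions.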
Suspending a CronJob
Setting .spec.suspend to true pauses the CronJob. No new Jobs are created when the schedule fires, but existing running Jobs are not affected — they continue until completion.
spec:
  suspend: true
Suspend and resume imperatively:
# Pause
kubectl patch cronjob daily-report -p '{"spec":{"suspend":true}}'
# Resume
kubectl patch cronjob daily-report -p '{"spec":{"suspend":false}}'
This is useful during maintenance windows or when a downstream system is unavailable and running the Job would fail anyway.
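When pausing and resuming repeatedly, it can help to wrap the patch in a small function that also reads the value back. The sketch below assumes a working kubectl context; set_suspend is a hypothetical helper name, while the patch and jsonpath calls are standard kubectl usage.

```shell
# set_suspend NAME true|false
# Patch .spec.suspend on a CronJob, then print the value now stored.
set_suspend() {
  name=$1 value=$2
  case $value in
    true|false) ;;
    *) echo "value must be true or false" >&2; return 1 ;;
  esac
  kubectl patch cronjob "$name" -p "{\"spec\":{\"suspend\":$value}}" &&
    kubectl get cronjob "$name" -o jsonpath='{.spec.suspend}{"\n"}'
}

# Usage (requires a cluster):
#   set_suspend daily-report true     # pause
#   set_suspend daily-report false    # resume
```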
Complete CronJob YAML Example
Here’s a production-grade CronJob combining the fields discussed:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-report
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 300
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 2
  suspend: false
  jobTemplate:
    spec:
      backoffLimit: 3
      activeDeadlineSeconds: 1800
      ttlSecondsAfterFinished: 86400
      template:
        spec:
          containers:
          - name: reporter
            image: reporter:1.0
            command:
            - sh
            - -c
            - |
              echo "Starting daily report generation"
              generate-report --output /tmp/report.csv
              echo "Report complete"
            resources:
              requests:
                cpu: "100m"
                memory: "128Mi"
              limits:
                cpu: "500m"
                memory: "512Mi"
          restartPolicy: OnFailure
Notice the nesting depth: CronJob.spec.jobTemplate.spec.template.spec.containers. This three-level nesting (CronJob → Job → Pod) is the most common source of YAML indentation errors on the exam. The structure reads as:
- CronJob spec: schedule, concurrency, history limits
- Job spec (inside jobTemplate): backoff, deadline, TTL
- Pod spec (inside template): containers, restart policy, resources
When building a CronJob from scratch on the exam, generate the YAML imperatively and then add fields at the correct nesting level.
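If you are unsure which level a field belongs to, kubectl explain can confirm it for each of the three specs. The sketch below wraps those three lookups in one POSIX shell function; explain_cronjob_levels is a hypothetical name, and the field paths are the ones from the nesting described above.

```shell
# explain_cronjob_levels
# Show the documented fields at the CronJob, Job, and Pod spec levels.
explain_cronjob_levels() {
  for path in \
    cronjob.spec \
    cronjob.spec.jobTemplate.spec \
    cronjob.spec.jobTemplate.spec.template.spec
  do
    echo "== $path"
    kubectl explain "$path"
  done
}

# Usage (requires kubectl with access to a cluster):
#   explain_cronjob_levels | less
```

For example, backoffLimit appears under cronjob.spec.jobTemplate.spec, while restartPolicy appears under cronjob.spec.jobTemplate.spec.template.spec.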
Deleting a CronJob: Garbage Collection
When you delete a CronJob, Kubernetes garbage collects all Jobs and Pods it created. This is owner-reference based — each Job’s metadata.ownerReferences points to the parent CronJob. When the owner is deleted, the dependent objects are removed.
kubectl delete cronjob daily-report
This deletes:
- The CronJob object
- All Jobs created by it (running or completed)
- All Pods belonging to those Jobs
If you need to delete the CronJob but keep the existing Jobs running, use the --cascade=orphan flag:
kubectl delete cronjob daily-report --cascade=orphan
Orphaned Jobs continue running but won’t be cleaned up automatically — you’ll need to manage them manually.
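Before deleting, you can check which Jobs would be affected by reading each Job's first owner reference. The following POSIX shell sketch does this with kubectl custom columns and awk; jobs_owned_by is a hypothetical helper name, and it assumes the CronJob is the first entry in metadata.ownerReferences, which is the usual case for Jobs a CronJob creates.

```shell
# jobs_owned_by CRONJOB_NAME
# Print the names of Jobs in the current namespace whose first
# owner reference is the given CronJob.
jobs_owned_by() {
  kubectl get jobs --no-headers \
    -o custom-columns='NAME:.metadata.name,KIND:.metadata.ownerReferences[0].kind,OWNER:.metadata.ownerReferences[0].name' |
    awk -v owner="$1" '$2 == "CronJob" && $3 == owner { print $1 }'
}

# Usage (requires a cluster):
#   jobs_owned_by daily-report
```

Jobs listed here are removed by a default delete. After a delete with --cascade=orphan, the Jobs remain and need manual cleanup.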
Debugging CronJobs
When a CronJob doesn’t fire as expected, follow this diagnostic sequence:
1. Check the CronJob Status
kubectl get cronjob daily-report
NAME           SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
daily-report   0 2 * * *   False     0        25h             7d
Key fields:
- SUSPEND: If True, the CronJob is paused.
- ACTIVE: Number of currently running Jobs. If this stays at 1 and concurrencyPolicy: Forbid is set, new runs are being skipped.
- LAST SCHEDULE: When the last Job was created. If this is stale, the schedule might be wrong or the controller is having issues.
2. Check Events
kubectl describe cronjob daily-report
Look at the Events section. Common messages include:
- SuccessfulCreate: a Job was spawned normally
- MissSchedule: a scheduled run was missed, for example because the starting deadline was exceeded
- JobAlreadyActive: a run was skipped because the previous Job is still active and concurrencyPolicy is Forbid
3. Inspect the Latest Job
# List Jobs, oldest first; Jobs created by this CronJob are named daily-report-<number>
kubectl get jobs --sort-by=.metadata.creationTimestamp
# Describe the most recent Job
kubectl describe job daily-report-28489320
Check the Job’s conditions for BackoffLimitExceeded or DeadlineExceeded. If the Job itself failed, the CronJob is working — the problem is in the workload.
4. Check Pod Logs
# Find Pods for a specific Job
kubectl get pods -l job-name=daily-report-28489320
# Check logs
kubectl logs daily-report-28489320-xyz12
Container-level errors (image pull failures, command not found, non-zero exit codes) surface in Pod logs and events.
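Steps 3 and 4 can be combined into a short helper that finds the newest Job for a CronJob and prints its logs. This is a sketch in POSIX shell under two assumptions: Jobs follow the <cronjob-name>-<number> naming seen above, and a kubectl context is available. The function names latest_job and latest_job_logs are hypothetical.

```shell
# latest_job CRONJOB_NAME
# Print the most recently created Job whose name is CRONJOB_NAME-<digits>.
latest_job() {
  kubectl get jobs --sort-by=.metadata.creationTimestamp --no-headers \
    -o custom-columns=NAME:.metadata.name |
    grep "^$1-[0-9][0-9]*\$" |
    tail -n 1
}

# latest_job_logs CRONJOB_NAME
# Print logs from a Pod of the newest Job, or fail if none exists.
latest_job_logs() {
  job=$(latest_job "$1")
  if [ -z "$job" ]; then
    echo "no Jobs found for CronJob $1" >&2
    return 1
  fi
  kubectl logs "job/$job"
}

# Usage (requires a cluster):
#   latest_job daily-report
#   latest_job_logs daily-report
```

If the history limits have already removed the Job you are interested in, there is nothing left to inspect, which is another reason to retain at least one failed Job.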
5. Verify the Schedule
A subtlety that catches people: the CronJob controller uses the kube-controller-manager’s timezone, which defaults to UTC. If you expect a Job to run at 2:00 AM local time but the cluster is set to UTC, adjust accordingly. Kubernetes 1.27+ supports .spec.timeZone to set an explicit timezone:
spec:
  schedule: "0 2 * * *"
  timeZone: "America/New_York"
On the CKAD exam, the cluster typically uses UTC. If the exam question specifies a time, assume UTC unless stated otherwise.
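When no timeZone is set and the cluster runs on UTC, the hour field has to be shifted by hand. The POSIX shell sketch below shows that arithmetic for whole-hour offsets; utc_hour is a hypothetical helper, the offsets are supplied by you rather than looked up from a time zone database, and any change of calendar day is ignored.

```shell
# utc_hour LOCAL_HOUR UTC_OFFSET
# Convert a local hour (0-23) to the UTC hour, given the zone's
# offset from UTC in whole hours (for example -5 for UTC-5).
utc_hour() {
  local_hour=$1 offset=$2
  echo $(( ( (local_hour - offset) % 24 + 24 ) % 24 ))
}

utc_hour 2 -5   # prints: 7   (2:00 AM at UTC-5 is 07:00 UTC)
utc_hour 2 -4   # prints: 6   (the same local time at UTC-4)
utc_hour 2 9    # prints: 17  (2:00 AM at UTC+9 is 17:00 UTC the previous day)
```

The first two results differ by an hour for the same local time, which is the drift a fixed UTC schedule shows when a zone switches between standard and daylight time. Setting timeZone lets the controller handle that shift.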
Key Takeaways
- A CronJob creates Jobs on a schedule; it does not run Pods directly.
- The cron format has five fields: minute, hour, day-of-month, month, day-of-week.
- concurrencyPolicy controls overlap: Allow, Forbid, or Replace.
- startingDeadlineSeconds defines how late a missed schedule can still trigger.
- History limits (successfulJobsHistoryLimit, failedJobsHistoryLimit) control how many past Jobs are retained.
- suspend: true pauses future runs without killing active Jobs.
- Deleting a CronJob cascades to its Jobs and Pods by default.
- Debug with: kubectl get cronjob → describe → inspect Jobs → check Pod logs.