K8s-native CronJobs are quite convenient for running regularly scheduled tasks. But the K8s CronJob and Job specs do not provide a straightforward way (at least not one that I could find) to specify an execution timeout. So when execution hangs, for whatever reason, the container keeps running. Best case, it runs until the next execution, if concurrencyPolicy: Replace is used.
If your task's code has its own timeout capability, life is good. When it does not, here's what you can do.
When a hung run shouldn't be left waiting until the next scheduled try, and/or job history needs to be retained via concurrencyPolicy: Forbid, a livenessProbe can be used to compare the time elapsed since the start of the task against a timeout value. When that probe fails, the container is restarted thanks to restartPolicy: OnFailure.
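Here's a sketch of just the probe logic, assuming the task's entrypoint records its start time in a marker file. The 300-second timeout and the /tmp/started path are placeholders of mine:

```yaml
# The container's command records the start time before running the real task, e.g.:
#   date +%s > /tmp/started && ./run-task.sh
livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    # Fail once more than 300 seconds have elapsed since the recorded start.
    - test $(( $(date +%s) - $(cat /tmp/started) )) -lt 300
  periodSeconds: 30
```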
If job history does not need to be retained, one could use concurrencyPolicy: Replace. However, that makes successfulJobsHistoryLimit and failedJobsHistoryLimit meaningless, as jobs will be replaced each time the CronJob schedule kicks off another one.
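For comparison, a sketch of the Replace-based variant; the name, schedule, image, and run-task.sh script are placeholders:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: task-replace
spec:
  schedule: "*/15 * * * *"
  concurrencyPolicy: Replace      # a still-running job is killed and replaced at the next tick
  successfulJobsHistoryLimit: 3   # see the caveat above about history once runs get replaced
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: task
            image: busybox        # placeholder image
            command: ["/bin/sh", "-c", "./run-task.sh"]   # stand-in for the real task
```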
Perhaps the Downward API could be used to get the container start time, but I haven't found the right reference for that yet.
I like to be able to see what went wrong in failed job runs. Counterintuitively, using restartPolicy: Never keeps failed pods around, available to examine.
CronJob with timeout via livenessProbe example
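Below is a minimal sketch of such a CronJob, assuming Kubernetes 1.21+ (batch/v1). The schedule, image, 300-second timeout, /tmp/started marker file, and run-task.sh script are placeholder values of mine; substitute your own.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: task-with-timeout
spec:
  schedule: "*/15 * * * *"           # placeholder schedule
  concurrencyPolicy: Forbid          # retain job history; don't start overlapping runs
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure   # restart the container when the liveness probe fails
          containers:
          - name: task
            image: busybox           # placeholder image
            command:
            - /bin/sh
            - -c
            # Record the start time, then run the actual task (run-task.sh is a stand-in).
            - date +%s > /tmp/started && ./run-task.sh
            livenessProbe:
              exec:
                command:
                - /bin/sh
                - -c
                # Fail the probe once more than 300 seconds have elapsed since start.
                - test $(( $(date +%s) - $(cat /tmp/started) )) -lt 300
              initialDelaySeconds: 10
              periodSeconds: 30
```

As noted above, swapping restartPolicy: OnFailure for Never would keep failed pods around for inspection instead of restarting the container in place.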