Kubernetes Jobs and Cron Jobs

Most applications that run on a distributed system like Kubernetes, such as web servers, databases, or API servers, are always live. But there is a separate class of workloads that are meant to run once, or to wake up every once in a while and run their course. Periodic tasks such as TLS certificate renewals with agents like Certbot are a classic example of such jobs on traditional servers, where they are scheduled using the Cron utility in Unix systems.

Kubernetes has analogous mechanisms: Jobs for one-time processes and Cron Jobs for periodic ones.

We will start with what Jobs are and walk through a standard example from the official docs. From this example it will be easy to understand what it means for a Job to run successfully in the Kubernetes context.

To follow along, I recommend using the Katacoda Playground for Kubernetes, which provides an out-of-the-box Kubernetes cluster without you having to configure one manually or risk a production cluster for experiments.

Kubernetes Jobs

Jobs are higher-level Kubernetes abstractions, similar to ReplicaSets and Deployments. But unlike pods managed by Deployments and ReplicaSets, pods carrying out a Job complete their work and exit.

When a specified number of pods run to completion, the Job is said to have completed successfully. The criteria for what counts as a successful termination of a pod are defined in the Job’s YAML file. The Job controller then ensures that the required number of pods terminate successfully, at which point the Job is complete.
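The completion rule described above can be sketched as a toy model in Python. This is only an illustration, not the Job controller's actual code: the helper name job_is_complete is hypothetical, the parameter completions is named after the Job spec field of the same name, and the sketch assumes that a pod counts as successful when its container exits with code 0.

```python
def job_is_complete(pod_exit_codes, completions=1):
    """Toy model: a Job is complete once `completions` pods exited with code 0."""
    succeeded = sum(1 for code in pod_exit_codes if code == 0)
    return succeeded >= completions


# One pod failed (exit code 1) and a replacement pod then succeeded.
print(job_is_complete([1, 0]))                 # True
print(job_is_complete([1, 0], completions=2))  # False
```

The failed pod does not count toward the target; only successful terminations do.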

Let’s create a Job that computes pi to 2000 places and prints it to its logs, which we will then examine. Create a file called my-job.yaml and save the following contents in it:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

Create the job, using this file:

$ kubectl create -f ./my-job.yaml

You will notice that the Job takes anywhere from a few seconds to a couple of minutes to run. Once it is done, list all the pods using:

$ kubectl get pods

NAME       READY     STATUS      RESTARTS   AGE
pi-wg6zp   0/1       Completed   0          50s

You will see that the status of the pi pod is Completed, not Running or Terminated. Copy the name of the pod so that we can verify that pi has indeed been calculated to 2000 digits. The specific name of the pod will differ in your case.

$ kubectl logs pi-wg6zp

Interestingly enough, the pod has not been deleted. Its container has exited and nothing is running inside it anymore, but Kubernetes keeps the completed pod object around so that its logs and status can still be inspected. If the pod had been deleted, we would not have been able to pull the logs from it in the first place.

To clean up the job and all the pods that were created, run the command:

$ kubectl delete -f my-job.yaml

You can learn more about the Job specifications and how to write your specification in the official documentation.

Cron Jobs

Cron Jobs are similar to the Cron utility in Unix: they run a Job periodically, according to a schedule that we define. At the time of this writing they are not a fully stable feature of Kubernetes, so you might want to be careful when using them. To quote the official docs:

“A cron job creates a job object about once per execution time of its schedule. We say “about” because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.”

The term idempotent means that the Cron Job has the same effect on the system whether it is performed once, twice, or any number of times. Operations such as checking for updates or monitoring can be considered idempotent. Operations that append or modify data on every run, such as inserting new rows into a database, generally are not.
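To make the distinction concrete, here is a minimal Python sketch that contrasts an idempotent operation with a non-idempotent one. The function names and the dictionary used as "system state" are made up for illustration and have nothing to do with the Kubernetes API.

```python
def set_last_checked(state, timestamp):
    """Idempotent: repeating the call with the same input leaves the same state."""
    state["last_checked"] = timestamp
    return state


def append_entry(state, entry):
    """Not idempotent: every extra run adds another copy of the entry."""
    state.setdefault("entries", []).append(entry)
    return state


stamp = "2018-08-16T22:05:00Z"
once = set_last_checked({}, stamp)
twice = set_last_checked(set_last_checked({}, stamp), stamp)
print(once == twice)  # True

once = append_entry({}, "report")
twice = append_entry(append_entry({}, "report"), "report")
print(once == twice)  # False
```

If the scheduler happens to start the same run twice, the first kind of task does no harm, while the second leaves a duplicate behind.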

Let’s write a cron job that writes a “Hello” message to its logs, along with a timestamp of when that message was written. Create a file called my-cronjob.yaml and write the following contents to it:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-cronjob
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

The schedule part of the job is the most crucial one. It follows the standard Cron convention: five fields separated by spaces. In order, the five fields represent:

  1. Minute (0-59)
  2. Hour (0-23)
  3. Day of the Month (1-31)
  4. Month (1-12)
  5. Day of the week (0-6) starting from Sunday

Using an asterisk (*) for a field means any available value of that field (like a wildcard). The first entry in our schedule, “*/1 * * * *”, indicates that the job must run every minute, regardless of the hour, day, or month of the year. Using */5 instead would print the message every 5 minutes.
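The way the five fields are read can be sketched with a small Python matcher. It is a simplified illustration rather than a full Cron parser: it understands only *, */n, and plain numbers (no lists or ranges), and it requires every field to match, which differs from real Cron when both day fields are restricted. The function names are hypothetical.

```python
from datetime import datetime

# Lowest allowed value of each field: minute, hour, day of month, month, day of week.
FIELD_LOWS = [0, 0, 1, 1, 0]


def field_matches(expr, value, low):
    """Match one field written as '*', '*/n' or a plain number."""
    if expr == "*":
        return True
    if expr.startswith("*/"):
        step = int(expr[2:])
        return (value - low) % step == 0
    return int(expr) == value


def schedule_matches(schedule, when):
    """Return True if the five-field schedule fires at the minute given by `when`."""
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    values = [
        when.minute,
        when.hour,
        when.day,
        when.month,
        (when.weekday() + 1) % 7,  # Python counts Monday as 0, Cron counts Sunday as 0
    ]
    return all(
        field_matches(expr, value, low)
        for expr, value, low in zip(fields, values, FIELD_LOWS)
    )


print(schedule_matches("*/1 * * * *", datetime(2018, 8, 16, 22, 5)))  # True
print(schedule_matches("*/5 * * * *", datetime(2018, 8, 16, 22, 5)))  # True
print(schedule_matches("*/5 * * * *", datetime(2018, 8, 16, 22, 6)))  # False
```

With */1 every minute matches, while */5 matches only minutes 0, 5, 10, and so on.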

You can learn more about the CronJob YAML specification in the official docs. Create the cron job with kubectl create -f my-cronjob.yaml, wait a few minutes, and then list the pods created for the job, which we named my-cronjob.

$ kubectl get pods
NAME                          READY     STATUS      RESTARTS   AGE
my-cronjob-1534457100-hfhzf   0/1       Completed   0          2m
my-cronjob-1534457160-gk85l   0/1       Completed   0          1m
my-cronjob-1534457220-bj22x   0/1       Completed   0          57s

Digging into the logs of each pod reveals a single message with a timestamp. Since the pods were all created at different times, they all have different timestamps.

$ kubectl logs my-cronjob-1534457100-hfhzf
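The numeric part of the pod names listed earlier also looks like a timestamp. Assuming it is the scheduled run time in Unix epoch seconds (an assumption based on the names in this output; other Kubernetes versions may name pods differently), a short Python sketch can decode it. The helper name scheduled_time is hypothetical.

```python
from datetime import datetime, timezone


def scheduled_time(pod_name):
    """Read the second-to-last dash-separated part of the name as epoch seconds."""
    epoch_seconds = int(pod_name.split("-")[-2])
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


for name in [
    "my-cronjob-1534457100-hfhzf",
    "my-cronjob-1534457160-gk85l",
    "my-cronjob-1534457220-bj22x",
]:
    print(name, scheduled_time(name).isoformat())
# my-cronjob-1534457100-hfhzf 2018-08-16T22:05:00+00:00
# my-cronjob-1534457160-gk85l 2018-08-16T22:06:00+00:00
# my-cronjob-1534457220-bj22x 2018-08-16T22:07:00+00:00
```

Under that assumption, the three pods are exactly one minute apart, which matches the */1 schedule.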

To delete the cronjob simply run:

$ kubectl delete -f my-cronjob.yaml

This will also delete any pods that were created in the process.

References

You can learn more about Kubernetes Jobs and Cron Jobs in the corresponding sections of the well-structured official Kubernetes documentation.
