Kubernetes overview and core concepts
While it is feasible to deploy and manage the life cycle of a small number of containers and containerized applications directly in a compute environment, it becomes very challenging to manage and orchestrate a large number of containers across a large number of servers. This is where Kubernetes comes in. Initially released in 2014, Kubernetes (K8s) is an open source system for managing containers at scale on clusters of servers (the abbreviation K8s is derived by replacing the eight letters 'ubernete' with the digit 8).
Architecturally, Kubernetes operates a master node and one or more worker nodes in a cluster of servers. The master node, also known as the control plane, is responsible for the overall management of the cluster, and it has four key components:
- API server
- Scheduler
- Controller manager
- etcd
The master node exposes an API server layer that allows programmatic control of the cluster. An example of an API call could be the deployment of a web application on the cluster. The control plane tracks and manages all cluster configuration data in a key-value store called etcd, which stores data such as the desired number of containers to run, the compute resource specifications, and the size of the storage volume for a web application running on the cluster.

Kubernetes uses an object type called a controller to monitor the current state of Kubernetes resources and take the necessary actions (for example, requesting a change via the API server) to move the current state toward the desired state whenever the two differ (for example, in the number of running containers). The controller manager in the master node is responsible for managing all the Kubernetes controllers. Kubernetes comes with a set of built-in controllers, such as the scheduler, which is responsible for scheduling Pods (units of deployment that we will discuss in more detail later) to run on worker nodes when there is a change request. Other examples include the Job controller, which is responsible for running and stopping one or more Pods for a task, and the Deployment controller, which is responsible for deploying Pods based on a deployment manifest, such as one for a web application. The following figure (Figure 6.2) shows the core architecture components of a Kubernetes cluster:
To interact with a Kubernetes cluster control plane, you can use the `kubectl` command-line utility or the Kubernetes Python client (https://github.com/kubernetes-client/python), or access the RESTful API directly. You can find a list of supported `kubectl` commands at https://kubernetes.io/docs/reference/kubectl/cheatsheet/.
A number of technical concepts are core to the Kubernetes architecture. The following are some of the main ones:
- Namespaces: Namespaces organize clusters of worker machines into virtual sub-clusters. They provide logical separation between resources owned by different teams and projects while still allowing different namespaces to communicate. A namespace can span multiple worker nodes, and it can be used to group a list of permissions under a single name so that authorized users can access the resources in it. Resource usage controls, such as quotas for CPU and memory, can be enforced on a namespace (a minimal manifest sketch follows Figure 6.3 below). Namespaces also make it possible for resources to share identical names, as long as they reside in different namespaces, thereby avoiding naming conflicts. By default, Kubernetes has a namespace named `default`, which is used when no namespace is specified; you can create additional namespaces as needed.
- Pods: Kubernetes deploys computing in a logical unit called a Pod. All Pods must belong to a Kubernetes namespace (either the `default` namespace or a specified one). One or more containers can be grouped into a Pod, and all containers in the Pod are deployed and scaled together as a single unit and share the same context, such as Linux namespaces and filesystems. Each Pod has a unique IP address that is shared by all the containers in the Pod. A Pod is normally created through a workload resource, such as a Kubernetes Deployment or a Kubernetes Job.
The preceding figure (Figure 6.3) shows the relationship between namespaces, Pods, and containers in a Kubernetes cluster. Each namespace contains its own set of Pods, and each Pod can run one or more containers.
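To make the namespace controls concrete, the following is a minimal sketch of a namespace manifest paired with a CPU and memory quota. The names (`team-ml`, `team-ml-quota`) and the quota values are illustrative assumptions, not values from any real cluster:

```yaml
# A namespace with a resource quota; apply with: kubectl apply -f <filename>
apiVersion: v1
kind: Namespace # creates a virtual sub-cluster for a team or project
metadata:
  name: team-ml # illustrative name
---
apiVersion: v1
kind: ResourceQuota # caps the total resources Pods in the namespace can request
metadata:
  name: team-ml-quota
  namespace: team-ml # the quota applies only to this namespace
spec:
  hard:
    requests.cpu: "4" # total CPU all Pods in the namespace may request
    requests.memory: 8Gi # total memory all Pods in the namespace may request
```

With this in place, Pods created in `team-ml` count against the quota, and requests that would push usage past the limits are rejected by the API server.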
- Deployment: A Deployment is used by Kubernetes to create or modify Pods that run containerized applications. For example, to deploy a containerized application, you create a configuration manifest file (usually in YAML format) that specifies details such as the deployment name, namespace, container image URI, number of Pod replicas, and the communication port for the application. After the deployment is applied using a Kubernetes client utility (`kubectl`), the corresponding Pods running the specified container images are created on the worker nodes. The following example creates a deployment of Pods for an Nginx server with the desired specification:

```yaml
apiVersion: apps/v1 # k8s API version used for creating this deployment
kind: Deployment # the type of object. In this case, it is Deployment
metadata:
  name: nginx-deployment # name of the deployment
spec:
  selector:
    matchLabels:
      app: nginx # an app label for the deployment. This can be used to look up/select Pods
  replicas: 2 # tells the deployment to run 2 Pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2 # Docker container image used for the deployment
        ports:
        - containerPort: 80 # the networking port to communicate with the containers
```
The following figure shows the flow of applying the preceding deployment manifest file to a Kubernetes cluster, which creates two Pods to host two copies of the Nginx container:
After the deployment, the Deployment controller monitors the deployed container instances. If an instance goes down, the controller replaces it with another instance on a worker node.
- Kubernetes Job: A Kubernetes Job is a controller that creates one or more Pods to run a task and ensures that the job completes successfully. If Pods fail due to node failure or other system issues, the Job recreates them to complete the task. A Kubernetes Job can be used to run batch-oriented tasks, such as batch data processing scripts, ML model training scripts, or ML batch inference scripts over a large number of inference requests. After a job completes, the Pods are not terminated, so you can still access the job logs and inspect the job's detailed status. The following is an example template for running a training job:
```yaml
apiVersion: batch/v1
kind: Job # indicates that this is a Kubernetes Job resource
metadata:
  name: train-job
spec:
  template:
    spec:
      containers:
      - name: train-container
        imagePullPolicy: Always # tells the job to always pull a new container image when it is started
        image: <uri to Docker image containing training script>
        command: ["python3", "train.py"] # tells the container to run this command after it is started
      restartPolicy: Never
  backoffLimit: 0
```
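As a usage sketch, you could submit this job by saving the manifest to a file and running `kubectl apply -f` on it; because the completed Pods are retained, you could then retrieve the training output with `kubectl logs job/train-job` once the job finishes.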
- Kubernetes custom resources (CRs) and operators: Kubernetes provides a list of built-in resources, such as Pods or Deployments, for different needs. It also allows you to create CRs and manage them just like the built-in resources, using the same tools (such as `kubectl`). When you create a CR in Kubernetes, Kubernetes creates a new API (for example, `<custom resource name>/<version>`) for each version of the resource. This is also known as extending the Kubernetes APIs. To create a CR, you first write a custom resource definition (CRD) YAML file. To register the CRD in Kubernetes, you simply run `kubectl apply -f <name of the CRD yaml file>` to apply the file, and after that, you can use the CR just like any other Kubernetes resource. For example, to manage a custom model training job on Kubernetes, you can define a CRD with specifications such as algorithm name, data encryption setting, training image, input data sources, number of job failure retries, number of replicas, and job liveness probe frequency, as sketched below.
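The following is a minimal sketch of what such a CRD might look like. The API group (`mlplatform.example.com`), the `TrainingJob` kind, and the field names are illustrative assumptions modeled on the specification list above, not a real published CRD (a production definition would also cover the remaining fields, such as data sources and encryption settings):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: trainingjobs.mlplatform.example.com # must be <plural>.<group>
spec:
  group: mlplatform.example.com # illustrative API group
  scope: Namespaced # CR instances live inside a namespace
  names:
    plural: trainingjobs
    singular: trainingjob
    kind: TrainingJob # the kind users put in their manifests
  versions:
  - name: v1
    served: true # this version is exposed by the API server
    storage: true # this version is persisted in etcd
    schema:
      openAPIV3Schema: # schema validating the custom fields
        type: object
        properties:
          spec:
            type: object
            properties:
              algorithmName: { type: string }
              trainingImage: { type: string }
              maxRetries: { type: integer }
              replicas: { type: integer }
```

Registering this CRD with `kubectl apply -f` makes the new `mlplatform.example.com/v1` API available, and `kubectl get trainingjobs` would then work like any built-in resource query.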
A Kubernetes operator is a controller that operates on a CR. The operator watches the CR types and takes specific actions to make the current state match the desired state, just like a built-in controller does. For example, to run training jobs for the training job CRD mentioned previously, you create an operator that monitors training job requests and performs the application-specific actions to start the Pods and run the training job throughout its life cycle. The following figure (Figure 6.5) shows the components involved in an operator deployment:
The most common way to deploy an operator is to deploy a CR definition and the associated controller. The controller runs outside of the Kubernetes control plane, similar to running a containerized application in a Pod.
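Continuing the illustrative `TrainingJob` example (the kind, API group, and field names remain assumptions, not a published operator's schema), a user would submit a CR instance like the following, and the operator's controller would react by creating the Pods that run the training job:

```yaml
apiVersion: mlplatform.example.com/v1 # the API created when the CRD was registered
kind: TrainingJob
metadata:
  name: fraud-model-training # illustrative job name
  namespace: team-ml # illustrative namespace from the earlier sketch
spec:
  algorithmName: xgboost # illustrative values matching the CRD schema above
  trainingImage: <uri to Docker image containing training script>
  maxRetries: 3
  replicas: 1
```

The operator watches for `TrainingJob` objects via the API server and reconciles them, in the same way the built-in Deployment controller reconciles Deployment objects.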