Using Kubernetes to build the Hue platform
In this section, we will look at various Kubernetes resources and how they can help us build Hue. First, we’ll get to know the versatile kubectl a little better, then we will look at how to run long-running processes in Kubernetes, exposing services internally and externally, using namespaces to limit access, launching ad hoc jobs, and mixing in non-cluster components. Obviously, Hue is a huge project, so we will demonstrate the ideas on a local cluster and not actually build a real Hue Kubernetes cluster. Consider it primarily a thought experiment. If you wish to explore building a real microservice-based distributed system on Kubernetes, check out Hands-On Microservices with Kubernetes: https://www.packtpub.com/product/hands-on-microservices-with-kubernetes/9781789805468.
Using kubectl effectively
kubectl is your Swiss Army knife. It can do pretty much anything around a cluster. Under the hood, kubectl connects to your cluster via the API. By default, it reads your ~/.kube/config file (this can be overridden with the KUBECONFIG environment variable or the --kubeconfig command-line argument), which contains the information necessary to connect to your cluster or clusters.
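For example, both of the following point kubectl at an alternative kubeconfig file (~/clusters/staging.yaml is a hypothetical path):
$ KUBECONFIG=~/clusters/staging.yaml kubectl get pods
$ kubectl --kubeconfig ~/clusters/staging.yaml get pods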
The commands are divided into multiple categories:
- Generic commands: Deal with resources in a generic way: create, get, delete, run, apply, patch, replace, and so on
- Cluster management commands: Deal with nodes and the cluster at large: cluster-info, certificate, drain, and so on
- Troubleshooting commands: describe, logs, attach, exec, and so on
- Deployment commands: Deal with deployment and scaling: rollout, scale, autoscale, and so on
- Settings commands: Deal with labels and annotations: label, annotate, and so on
- Misc commands: help, config, and version
- Customization commands: Integrate the kustomize.io capabilities into kubectl
- Configuration commands: Deal with contexts, switch between clusters and namespaces, set the current context and namespace, and so on
You can view the configuration with the kubectl config view command.
Here is the configuration for my local KinD cluster:
$ k config view
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://127.0.0.1:50615
  name: kind-kind
contexts:
- context:
    cluster: kind-kind
    user: kind-kind
  name: kind-kind
current-context: kind-kind
kind: Config
preferences: {}
users:
- name: kind-kind
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
Your kubeconfig file may or may not be similar to the code sample above, but as long as it points to a running Kubernetes cluster, you will be able to follow along. Let’s take an in-depth look into kubectl manifest files.
Understanding kubectl manifest files
Many kubectl operations, such as create, require a complicated hierarchical structure (since the API requires this structure). kubectl uses YAML or JSON manifest files. YAML is more concise and human-readable, so we will mostly use YAML. Here is a YAML manifest file for creating a pod:
apiVersion: v1
kind: Pod
metadata:
  name: ""
  labels:
    name: ""
  namespace: ""
  annotations: {}
  generateName: ""
spec:
  ...
Let’s examine the various fields of the manifest.
apiVersion
The Kubernetes API keeps evolving and can support different versions of the same resource via different API versions. For the core Pod resource, this is v1.
kind
kind tells Kubernetes what type of resource it is dealing with; in this case, Pod. This is always required.
metadata
metadata contains a lot of information that describes the pod and where it operates:
- name: Identifies the pod uniquely within its namespace
- labels: Multiple labels can be applied
- namespace: The namespace the pod belongs to
- annotations: A map of annotations available for query
spec
spec is a pod template that contains all the information necessary to launch a pod. It can be quite elaborate, so we’ll explore it in multiple parts:
spec:
  containers:
  - ...
  restartPolicy: ""
  volumes: []
Container spec
The pod spec’s containers section is a list of container specs. Each container spec has the following structure:
name: ""
image: ""
command: [""]
args: [""]
env:
- name: ""
  value: ""
imagePullPolicy: ""
ports:
- containerPort: 0
  name: ""
  protocol: ""
resources:
  requests:
    cpu: ""
    memory: ""
  limits:
    cpu: ""
    memory: ""
Each container has an image and an optional command that, if specified, replaces the default command of the Docker image. It also has arguments and environment variables. Then, there are, of course, the image pull policy, ports, and resource limits. We covered those in earlier chapters.
If you want to explore the pod resource, or other Kubernetes resources, further, then the kubectl explain command can be very useful. It can explore resources as well as specific sub-resources and fields.
Try the following commands:
kubectl explain pod
kubectl explain pod.spec
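The output looks something like this (abridged; the exact layout and wording depend on your kubectl version):
$ kubectl explain pod
KIND:     Pod
VERSION:  v1

DESCRIPTION:
     Pod is a collection of containers that can run on a host.

FIELDS:
   apiVersion   <string>
   kind         <string>
   metadata     <Object>
   spec         <Object>
   status       <Object>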
Deploying long-running microservices in pods
Long-running microservices should run in pods and be stateless. Let’s look at how to create pods for one of Hue’s microservices – the Hue learner – which is responsible for learning the user’s preferences across different domains. Later, we will raise the level of abstraction and use a deployment.
Creating pods
Let’s start with a regular pod configuration file for creating a Hue learner internal service. This service doesn’t need to be exposed as a public service; it will listen to a queue for notifications and store its insights in some persistent storage.
We need a simple container that will run in the pod. Here is possibly the simplest Docker file ever, which will simulate the Hue learner:
FROM busybox
CMD ash -c "echo 'Started...'; while true ; do sleep 10 ; done"
It uses the busybox base image, prints Started... to standard output, and then goes into an infinite loop, which is, by all accounts, long-running.
I have built two Docker images tagged as g1g1/hue-learn:0.3 and g1g1/hue-learn:0.4 and pushed them to the Docker Hub registry (g1g1 is my username):
$ docker build . -t g1g1/hue-learn:0.3
$ docker build . -t g1g1/hue-learn:0.4
$ docker push g1g1/hue-learn:0.3
$ docker push g1g1/hue-learn:0.4
Now, these images are available to be pulled into containers inside Hue’s pods.
We’ll use YAML here because it’s more concise and human-readable. Here are the boilerplate and metadata labels:
apiVersion: v1
kind: Pod
metadata:
  name: hue-learner
  labels:
    app: hue
    service: learner
    runtime-environment: production
    tier: internal-service
Next comes the important containers spec, which defines for each container the mandatory name and image:
spec:
  containers:
  - name: hue-learner
    image: g1g1/hue-learn:0.3
The resources section tells Kubernetes the resource requirements of the container, which allows for more efficient and compact scheduling and allocations. Here, the container requests 200 milli-cpu units (0.2 core) and 256 MiB (2 to the power of 28 bytes):
resources:
  requests:
    cpu: 200m
    memory: 256Mi
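If you also want to cap the container’s consumption, you can add a limits section next to requests. Here is a sketch with hypothetical limit values; the scheduler uses requests for placement, while limits are enforced at runtime:
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi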
The environment section allows the cluster administrator to provide environment variables that will be available to the container. Here, it tells the container to discover the queue and the store via DNS. In a testing environment, it may use a different discovery method:
env:
- name: DISCOVER_QUEUE
  value: dns
- name: DISCOVER_STORE
  value: dns
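If you stitch the spec fragments above together under the container entry and save them to a single file (hue-learner-pod.yaml is my name for it; any name works), you can create the pod and verify it is running (your output will vary slightly):
$ k create -f hue-learner-pod.yaml
pod/hue-learner created
$ k get po hue-learner
NAME          READY   STATUS    RESTARTS   AGE
hue-learner   1/1     Running   0          25s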
Decorating pods with labels
Labeling pods wisely is key for flexible operations. It lets you evolve your cluster live, organize your microservices into groups you can operate on uniformly, and drill down on the fly to observe different subsets.
For example, our Hue learner pod has the following labels (and a few others):
runtime-environment: production
tier: internal-service
The runtime-environment label allows performing global operations on all pods that belong to a certain environment. The tier label can be used to query all pods that belong to a particular tier. These are just examples; your imagination is the limit here.
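For instance, assuming the hue-learner pod from earlier is running, these hypothetical commands select pods by those labels, alone or combined:
$ k get po -l tier=internal-service
$ k get po -l runtime-environment=production,tier=internal-service
$ k delete po -l runtime-environment=staging
The last command would tear down every pod of a hypothetical staging environment in one shot.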
Here is how to list the labels with the get pods command:
$ k get po -n kube-system --show-labels
NAME READY STATUS RESTARTS AGE LABELS
coredns-64897985d-gzrm4 1/1 Running 0 2d2h k8s-app=kube-dns,pod-template-hash=64897985d
coredns-64897985d-m8nm9 1/1 Running 0 2d2h k8s-app=kube-dns,pod-template-hash=64897985d
etcd-kind-control-plane 1/1 Running 0 2d2h component=etcd,tier=control-plane
kindnet-wx7kl 1/1 Running 0 2d2h app=kindnet,controller-revision-hash=9d779cb4d,k8s-app=kindnet,pod-template-generation=1,tier=node
kube-apiserver-kind-control-plane 1/1 Running 0 2d2h component=kube-apiserver,tier=control-plane
kube-controller-manager-kind-control-plane 1/1 Running 0 2d2h component=kube-controller-manager,tier=control-plane
kube-proxy-bgcrq 1/1 Running 0 2d2h controller-revision-hash=664d4bb79f,k8s-app=kube-proxy,pod-template-generation=1
kube-scheduler-kind-control-plane 1/1 Running 0 2d2h component=kube-scheduler,tier=control-plane
Now, if you want to filter and list only the kube-dns pods, type the following:
$ k get po -n kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-64897985d-gzrm4 1/1 Running 0 2d2h
coredns-64897985d-m8nm9 1/1 Running 0 2d2h
Deploying long-running processes with deployments
In a large-scale system, pods should never be just created and let loose. If a pod dies unexpectedly for whatever reason, you want another one to replace it to maintain overall capacity. You can create replication controllers or replica sets yourself, but that leaves the door open to mistakes, as well as the possibility of partial failure. It makes much more sense to specify how many replicas you want when you launch your pods in a declarative manner. This is what Kubernetes deployments are for.
Let’s deploy three instances of our Hue learner microservice with a Kubernetes deployment resource:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hue-learn
  labels:
    app: hue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hue
  template:
    metadata:
      labels:
        app: hue
    spec:
      containers:
      - name: hue-learner
        image: g1g1/hue-learn:0.3
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
        env:
        - name: DISCOVER_QUEUE
          value: dns
        - name: DISCOVER_STORE
          value: dns
The pod spec is identical to the spec section from the pod configuration file we created previously.
Let’s create the deployment and check its status:
$ k create -f hue-learn-deployment.yaml
deployment.apps/hue-learn created
$ k get deployment hue-learn
NAME READY UP-TO-DATE AVAILABLE AGE
hue-learn 3/3 3 3 25s
$ k get pods -l app=hue
NAME READY STATUS RESTARTS AGE
hue-learn-67d4649b58-qhc88 1/1 Running 0 45s
hue-learn-67d4649b58-qpm2q 1/1 Running 0 45s
hue-learn-67d4649b58-tzzq7 1/1 Running 0 45s
You can get a lot more information about the deployment using the kubectl describe command:
$ k describe deployment hue-learn
Name: hue-learn
Namespace: default
CreationTimestamp: Tue, 21 Jun 2022 21:11:50 -0700
Labels: app=hue
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=hue
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=hue
Containers:
hue-learner:
Image: g1g1/hue-learn:0.3
Port: <none>
Host Port: <none>
Requests:
cpu: 200m
memory: 256Mi
Environment:
DISCOVER_QUEUE: dns
DISCOVER_STORE: dns
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: hue-learn-67d4649b58 (3/3 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 106s deployment-controller Scaled up replica set hue-learn-67d4649b58 to 3
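Deployments also make it trivial to change capacity. For example, you could scale from 3 to 5 replicas imperatively (declaratively, you would bump replicas in the YAML and re-apply):
$ k scale deployment hue-learn --replicas=5
deployment.apps/hue-learn scaled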
Updating a deployment
The Hue platform is a large and ever-evolving system. You need to upgrade constantly. Deployments can be updated to roll out changes in a painless manner. You change the pod template, which triggers a rolling update fully managed by Kubernetes. Currently, all the pods are running with version 0.3:
$ k get pods -o jsonpath='{.items[*].spec.containers[0].image}' -l app=hue | xargs printf "%s\n"
g1g1/hue-learn:0.3
g1g1/hue-learn:0.3
g1g1/hue-learn:0.3
Let’s update the deployment to upgrade to version 0.4. Modify the image version in the deployment file (don’t modify the labels; that will cause an error) and save it as hue-learn-deployment-0.4.yaml. Then we can use the kubectl apply command to upgrade the version and verify that the pods now run 0.4:
$ k apply -f hue-learn-deployment-0.4.yaml
Warning: resource deployments/hue-learn is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
deployment.apps/hue-learn configured
$ k get pods -o jsonpath='{.items[*].spec.containers[0].image}' -l app=hue | xargs printf "%s\n"
g1g1/hue-learn:0.4
g1g1/hue-learn:0.4
g1g1/hue-learn:0.4
Note that new pods are created and the original 0.3 pods are terminated in a rolling update manner:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hue-learn-67d4649b58-fgt7m 1/1 Terminating 0 99s
hue-learn-67d4649b58-klhz5 1/1 Terminating 0 100s
hue-learn-67d4649b58-lgpl9 1/1 Terminating 0 101s
hue-learn-68d74fd4b7-bxxnm 1/1 Running 0 4s
hue-learn-68d74fd4b7-fh55c 1/1 Running 0 3s
hue-learn-68d74fd4b7-rnsj4 1/1 Running 0 2s
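The kubectl rollout family of commands is a handy companion here. For example, you can watch an update until it completes and undo it if something goes wrong (a quick sketch):
$ k rollout status deployment hue-learn
deployment "hue-learn" successfully rolled out
$ k rollout undo deployment hue-learn
deployment.apps/hue-learn rolled back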
We’ve covered how kubectl manifest files are structured and how they can be applied to deploy and update workloads on our cluster. Let’s see how these workloads can discover and call each other via internal services as well as be called from outside the cluster via externally exposed services.