The Pod: Not Only A Group Of Whales

In the whale world, a pod is a group of whales, and carrying on Docker’s whale theme, Kubernetes calls a group of containers a Pod. Pods are the smallest artifacts that can be deployed to a Kubernetes cluster, and in this blog post, we’ll get to know their fundamentals by deploying our first workload to a hosted Kubernetes cluster.

In Kubernetes, a Pod is the smallest deployable unit for running workloads. A Pod can contain multiple containers, but it cannot be divided further into its individual containers. A Pod is thus like a bounded execution context for a set of containers: within it, the containers share certain resources, such as the network namespace, while remaining isolated from other execution contexts. It follows that a Pod’s containers are always located on the same machine.

Pods vs. Containers

The fact that Pods are the smallest deployable unit might seem surprising at first, since we’ve previously established that containers (not Pods!) provide the level of encapsulation we need to handle the complexity of a large, monolith-based application by decomposing it into smaller, more manageable pieces of functionality called microservices. That’s true as far as decoupling cohesive sets of responsibilities is concerned. But consider a case where an application does not represent a piece of functionality in the business domain, but instead supports one that does: think of a main application that serves some website to clients, plus a supporting application that synchronizes the site’s sources from a version control system. Because the synchronizer has to write to the very files the web server reads, running the two applications on different machines would be problematic. This is the issue that Pods address by being able to run multiple containers.

Interestingly, running multiple containers in one Pod was long considered bad practice, but over time, running a sidecar container that provides supporting functionality alongside a main container has become much more common. An important Pod feature enabling this sidecar pattern is that each container can still specify its own resource requirements: in the sidecar pattern, for example, one would typically make sure the main container has enough CPU and memory to satisfy requests, while the sidecar container would be assigned far fewer resources. We’ll see how to specify such resources a bit further down the line, and the sketch below shows what such a two-container Pod could look like.
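To make the sidecar idea more concrete, here is a minimal sketch of a two-container Pod manifest. It is illustrative only: the nginx image, the hypothetical example/git-sync image, and the shared volume paths are assumptions for the sake of the example, not part of our sample application:

apiVersion: v1
kind: Pod
metadata:
  name: website-with-sync
spec:
  # An emptyDir volume both containers can mount; sharing files like this
  # works precisely because a Pod's containers always run on the same node
  volumes:
    - name: site-sources
      emptyDir: {}
  containers:
  # Main container: serves the website to clients
  - name: web
    image: nginx:1.25
    ports:
      - containerPort: 80
    volumeMounts:
      - name: site-sources
        mountPath: /usr/share/nginx/html
  # Sidecar container: synchronizes the site's sources from version control
  - name: source-sync
    image: example/git-sync:1.0 # hypothetical image
    volumeMounts:
      - name: site-sources
        mountPath: /sources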

Running A Pod Manually

In a previous blog post, we ran a simple application inside a Docker container, and running the same application in a Kubernetes cluster is as easy as issuing the following kubectl command:

$ kubectl run hello-pod --image=antsinmyey3sjohnson/hello-container-service:1.0
pod/hello-pod created

This will launch a Pod for us in Kubernetes’ default namespace, and we can use the following commands to investigate it:

$ kubectl get pod
NAME        READY   STATUS    RESTARTS   AGE
hello-pod   1/1     Running   0          6s

$ kubectl logs hello-pod
Starting to serve on port 8081

“But wait”, you’ll object after the previous blog post, “this is not a declarative description of desired state!”. And you’re right! The above is a simple means to launch a Pod (convenient for spinning up a debug Pod, for example) and it gets the job done, but it also represents a (small) set of imperative updates we as human operators have made to move from a current state to a desired state. Kubernetes’ declarative configuration approach exists precisely to avoid that, and because this approach is what makes Kubernetes so powerful, we should make use of it by specifying a Pod manifest.
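As an aside, kubectl run is handy for exactly that kind of throwaway debug Pod. One common pattern (the busybox image is just a popular choice for this, not something we’ve used in this series) is to start an interactive shell and have the Pod cleaned up automatically once the shell exits:

$ kubectl run debug-pod --rm -it --image=busybox -- sh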

The Pod Manifest

The Pod manifest is the piece of declarative configuration that describes the desired state for a Pod. The most basic manifest to run the hello-container-service application inside a Pod looks like so:

apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    name: hello-pod
spec:
  containers:
  - name: hello-container-service
    image: antsinmyey3sjohnson/hello-container-service:1.0
    ports:
      - containerPort: 8081

Here, the metadata section describes the Pod’s name and its labels, and the spec.containers section provides information about which container image to run, what to name the container, and which port the container exposes. This manifest embodies the declarative description of desired state for a Pod called hello-pod, and we can hand it to the Kubernetes API server using the kubectl apply command like so (assuming the manifest is called hello_pod.yml and resides in the current working directory):

$ kubectl apply -f hello_pod.yml 
pod/hello-pod created

$ kubectl get pod
NAME        READY   STATUS    RESTARTS   AGE
hello-pod   1/1     Running   0          66s

The Kubernetes API server will parse the YAML structure, validate it, and persist the resulting object; the scheduler will then pick up the new Pod and assign it to a healthy node in the cluster.
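To verify where the Pod ended up and what state Kubernetes recorded for it, we can inspect the stored object with either of the following commands; the exact output depends on your cluster, so it isn’t reproduced here:

$ kubectl describe pod hello-pod

$ kubectl get pod hello-pod -o yaml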

Specifying Resources

I’ve mentioned earlier in this article that one of the features allowing users to put multiple containers in a Pod is the ability to specify resource requirements individually for each container. This feature is important for two reasons:

  • Different containers in a Pod will often have varying resource requirements, so it’s necessary to be able to scale their resource allocations independently.
  • Resource specifications are at the very foundation of Kubernetes’ ability to distribute workloads onto the various nodes of a cluster and thus keep the utilization of each node as high as possible. Without knowing which resources a workload requires, Kubernetes can’t make reasonable scheduling decisions.

The most common kinds of resources specified on containers are CPU and memory, and a container can carry two different resource specifications: resource requests and resource limits.

Resource Requests

Resource requests allow users to configure the minimum amount of resources a workload requires, and the Kubernetes scheduler will only place a Pod on a node that can satisfy the sum of the resource requests of all containers in the Pod. If the combined resource requests of all Pods already running on a node plus those of a yet-to-be-scheduled Pod exceed the node’s total capacity, Kubernetes will not schedule the Pod onto that node. For example, if a node has 2000 millicores of allocatable CPU and its Pods already request 1900 millicores in total, a new Pod requesting 200 millicores won’t fit there, regardless of how much CPU is actually in use at that moment.

To let Kubernetes know about the minimum amount of resources required to run the hello-container-service container, we can extend the container’s specification like so:

  # More stuff above... truncated
  containers:
  - name: hello-container-service
    image: antsinmyey3sjohnson/hello-container-service:1.0
    ports:
      - containerPort: 8081
    resources:
        requests:
          cpu: "100m"
          memory: "64Mi"

This tells Kubernetes the container requires at least one tenth of a CPU core (100m stands for 100 millicores) and 64 mebibytes of memory to run, and because it’s the only container in our sample Pod, the Pod will request exactly that amount of CPU and memory from the Kubernetes scheduler. The container may still consume far more CPU and memory, though, if it’s scheduled to a node with spare capacity. To cap resource usage, we need to use resource limits.

Resource Limits

Limits allow users to configure the maximum amount of resources a workload is allowed to consume. Just like with requests, a Pod’s total limit is the sum across all limits of its containers. We can add the following to the container specification in the Pod manifest to specify limits for the container to go along with the requests:

  # Truncated
  containers:
  - name: hello-container-service
    image: antsinmyey3sjohnson/hello-container-service:1.0
    ports:
      - containerPort: 8081
    resources:
        requests:
          cpu: "100m"
          memory: "64Mi"
        limits:
          cpu: "200m"
          memory: "128Mi"

If the Pod’s only container exceeded its configured memory limit of 128 mebibytes, the kernel would terminate it with an out-of-memory error and Kubernetes would restart it (this restart behavior can be tuned by means of the restartPolicy field in the Pod spec, which we haven’t covered here). The CPU limit, on the other hand, is enforced by the kernel of the underlying node OS through throttling: rather than being killed, the container simply never gets more CPU time than its limit allows.

Both requests and limits serve important purposes: the former allow Kubernetes to make reasonable scheduling decisions, whereas the latter ensure a single Pod can never eat up all the capacity available on the node it’s running on. Therefore, a Pod’s container specifications should always configure both (in fact, if a container specifies one but not the other, some IDEs will display a warning message).
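Since most fields of a running Pod are immutable, the updated resources section won’t take effect on the existing Pod; one simple way to roll it out is to delete and recreate the Pod, after which kubectl describe shows the configured values under each container’s Requests and Limits sections:

$ kubectl delete pod hello-pod
pod "hello-pod" deleted

$ kubectl apply -f hello_pod.yml
pod/hello-pod created

$ kubectl describe pod hello-pod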

Accessing A Workload

At some point, you’ll probably want to access a workload running in a Pod. For now, we’ll use a very simple approach to reach our previously deployed workload: forwarding a port from localhost into the cluster.

The piece of the Pod manifest that allows us to do that is the containerPort configuration, which is specified on a per-container basis. Here, we’ve configured a container port of 8081, so from within the cluster, the container’s application is available on the Pod’s port 8081. The following lets us forward that port to localhost:

$ kubectl port-forward hello-pod 8081:8081
Forwarding from 127.0.0.1:8081 -> 8081
Forwarding from [::1]:8081 -> 8081

This command forwards the local port 8081 directly to port 8081 on the Pod running in the cluster. With that command still running, we can query the endpoint exposed by the container’s application:

$ curl localhost:8081
{"message": "Hello, container!"}

# Meanwhile, in the other command's output:
$ kubectl port-forward hello-pod 8081:8081
Forwarding from 127.0.0.1:8081 -> 8081
Forwarding from [::1]:8081 -> 8081
Handling connection for 8081

This simple approach is often sufficient for testing purposes, but it’s easy to see its limitations. Kubernetes offers dedicated API objects that enable far more sophisticated access patterns, such as load-balancing requests across a set of Pods running the same application; these will be covered in future content.

Wrap-Up

Pods are the atomic unit of work in Kubernetes, and the piece of declarative configuration used to let Kubernetes know about a Pod’s desired state is called the Pod manifest. A Pod is like a bounded execution context for one or more containers, and the containers in one Pod will always be scheduled onto the same node. To allow the Kubernetes scheduler to make reasonable decisions about which node to place a Pod on, each of a Pod’s containers should specify the minimum amount of CPU and memory it needs to run, and such resource requests should always be complemented by resource limits to prevent Pods from consuming more than their allotted share of a node’s capacity. A simple means to access a Pod’s workload is the kubectl port-forward command, which forwards a Pod’s port to localhost.

Before we can talk about Kubernetes Services (the API objects enabling more elegant and sophisticated ways to access the Pods running on a Kubernetes cluster), we need to know one more basic building block: Labels. The next blog post will therefore introduce you to Labels and what you can do with them.