This blog post will introduce you to the Service object, Kubernetes’ way of implementing service discovery and load balancing that is both reliable and easy to use for clients.
- Service Discovery In Kubernetes
- Services 101
- The Endpoints Object
- Cleaning Up
Service Discovery In Kubernetes
We have seen previously that Pods are the basic unit of work in Kubernetes. Each Pod receives its own IP internal to the Kubernetes cluster, but using it to refer to a Pod wouldn’t be a good idea since Kubernetes treats Pods as disposable, ephemeral units (a paradigm that doesn’t work so well for stateful applications, but that’s another story for another day). Thus, Pods can be spawned and destroyed very quickly – take, for example, Horizontal Pod Autoscaling, a feature that increases or decreases the number of replicas based on load. Plus, many applications are used to invoking a remote address by a resolvable name rather than an IP.
The Service object addresses both issues: It is an abstraction layer providing resolvable service names to clients, and it is designed to handle the challenge imposed by the dynamism of a Kubernetes environment. To illustrate this, consider a frontend application that calls a backend application for extracting text from images via a well-known service name. Since the backend application is stateless (it merely performs the work and hands back the result without holding or modifying any local state), the frontend neither cares which replica of the backend application it talks to, nor whether there are multiple replicas to talk to in the first place. The frontend application also shouldn’t be affected if the set of replicas comprising the backend application changes for some reason (e.g. an increase or decrease in the number of replicas) – as long as there is at least one replica, the frontend must be able to consume it. Service objects in Kubernetes enable this kind of decoupling.
Services 101
Virtual IP And DNS Name
Just like each Pod receives a cluster-internal IP, each Service (with the exception of headless Services) receives a cluster-internal virtual IP (depending on the Service type, it can also be assigned an external IP in addition to its internal IP). Because this virtual IP is stable, it can easily be given a DNS name – a Service for the previous example’s backend application might be called image-text-extraction, for example, and that name would resolve to the Service’s virtual, stable IP.
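To make the naming scheme a bit more tangible, here is a small Python sketch of the DNS names under which a Service is conventionally reachable from inside the cluster. The helper name service_dns_names and the namespace image-processing are made up for illustration, and cluster.local is only assumed as the cluster domain (it is the usual default, but it is configurable).

```python
def service_dns_names(service, namespace, cluster_domain="cluster.local"):
    """Return the conventional in-cluster DNS names of a Service, shortest first.

    Assumption: the cluster domain is 'cluster.local' unless stated otherwise.
    """
    return [
        service,                                          # resolvable from the same namespace
        f"{service}.{namespace}",                         # resolvable from other namespaces
        f"{service}.{namespace}.svc",
        f"{service}.{namespace}.svc.{cluster_domain}",    # fully qualified name
    ]


# 'image-processing' is a hypothetical namespace used only for this example.
for name in service_dns_names("image-text-extraction", "image-processing"):
    print(name)
```

A client in the same namespace would typically just use the short name, which is what makes the Service so convenient to consume.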
The behavior described above is made possible by a component called kube-proxy. This component runs on each node in the cluster, and it is responsible for providing virtual IPs for all Service objects (type ExternalName and headless Services excluded). It does this by monitoring the Kubernetes API server for new Service objects and programming a set of iptables rules in each node’s kernel once a Service has been created. Those rules are responsible for actually forwarding traffic to the endpoints of a Service.
Thus, virtual IPs solve the name problem, but what about identifying the Pods providing the functionality a client wishes to consume?
Defining A Service
To introduce you to the Service object, we’ll use a little example. We’ve already seen elsewhere that Kubernetes relies heavily on the idea that desired state should be described declaratively, so it’s no surprise we’ll set up this example using a description of desired state. I’ve prepared a little YAML manifest containing everything we need: a Deployment spawning three Pods along with a Service object to query them and a Namespace encapsulating everything. The manifest is available to you over on my GitHub.
Let’s take a look at the Service specification contained there:
apiVersion: v1
kind: Service
metadata:
  name: hello-service
  namespace: workload-reachability-example
spec:
  type: LoadBalancer
  selector:
    app: hello-app
  ports:
    - name: greeting
      port: 8080
As usual, the manifest defines the apiVersion and kind properties along with some metadata providing a name for the Service and defining the namespace it will live in. More interesting here is the spec:
- type: The default is ClusterIP, which allocates an internal IP, but for you to be able to more conveniently invoke the Service, I’ve given it type LoadBalancer, which also assigns an external IP – you’ll experience this in action in just a minute.
- selector: Used to identify the set of target Pods this Service should send traffic to – more on this in the next section.
- ports: The list of ports exposed by this Service. In case this list contains only one element (like here), the port name could be omitted, but it’s good practice to provide it anyway since other Kubernetes objects building on the Service object (such as Ingress) will be able to reference the port by its name rather than the port number, which means you can change the port number without breaking things. It’s important to point out that the target port is specified by means of the targetPort property, but if left unspecified – which is often done for convenience if port and targetPort are equal, as is the case above – it will assume the value of port. One way or the other, though, the effective target port has to be the port number exposed on the container in the Pod(s) that are supposed to handle the traffic sent to the Service on that port.
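The defaulting rule for the target port can be written down in a couple of lines. The following Python snippet is only a sketch of that rule, not how Kubernetes implements it; the function name effective_target_port and the second port entry (named web) are made up for illustration.

```python
def effective_target_port(port_spec):
    """Return the port traffic is forwarded to on the Pod.

    If 'targetPort' is not set, it falls back to the value of 'port'.
    """
    return port_spec.get("targetPort", port_spec["port"])


# The port entry from the manifest above: no targetPort, so 'port' is used.
print(effective_target_port({"name": "greeting", "port": 8080}))

# A hypothetical entry where the Service port and the container port differ.
print(effective_target_port({"name": "web", "port": 80, "targetPort": 8080}))
```

In both cases the result is 8080, which is the port the container in the Pod would have to listen on.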
There are many more properties available in the Service specification, but the ones listed above will suffice to get us going nicely.
Working With A Service
If you have your own Kubernetes cluster up and running, you can apply the given manifest using the following command:
$ alias k=kubectl
$ k apply -f https://raw.githubusercontent.com/AntsInMyEy3sJohnson/blog-examples/master/kubernetes/workload-reachability/simple-deployment-with-service.yaml
(In case you don’t have your own Kubernetes cluster yet, please refer to this section in one of my earlier blog posts to help you get started.)
This will create the Namespace, Deployment, and Service for you (notice the external IP, which is the result of the LoadBalancer type briefly mentioned above):
$ k -n workload-reachability-example get svc
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
hello-service   LoadBalancer   10.43.122.134   10.211.55.6   8080:30798/TCP   3m10s
$ k -n workload-reachability-example get deployment
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
hello-app   3/3     3            3           3m36s
$ k -n workload-reachability-example get pod
NAME                        READY   STATUS    RESTARTS   AGE
hello-app-9c76bdcbf-wr5v7   1/1     Running   0          3m55s
hello-app-9c76bdcbf-vdntn   1/1     Running   0          3m55s
hello-app-9c76bdcbf-xzmd4   1/1     Running   0          3m55s
Caution: If you use a cloud-based offering to run your cluster (such as Google’s GKE or Amazon’s AWS), it might take a couple of minutes to allocate that external IP for you. As long as it hasn’t been provisioned yet, the column in question will say <pending>.
Once you’ve applied the manifest and your external IP has been provisioned, you can check that everything works as desired by running the following command from your local machine (you don’t need to run it inside a Pod running in the cluster, just make sure to adjust the IP):
$ while true; do curl 10.211.55.6:8080; sleep 2; done
Hello from Pod 10.42.0.53!
Hello from Pod 10.42.0.51!
Hello from Pod 10.42.0.53!
Hello from Pod 10.42.0.52!
Hello from Pod 10.42.0.51!
What you can see here is that the requests are balanced nicely across the available Pods in a random fashion. The load balancing strategy depends on the mode your cluster’s kube-proxy daemons are running in, but this, too, would exceed the scope of this blog post – for now, it’s sufficient to know the mode enabling random Pod selection is the default.
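If you’d like to get a feeling for what random selection means for the distribution of requests, the following Python sketch simulates it. This is only a user-space illustration under the assumption of uniform random selection; kube-proxy does its work with packet-forwarding rules, not with application code like this. The function name pick_endpoint and the fixed seed are my own choices.

```python
import random
from collections import Counter


def pick_endpoint(endpoints, rng):
    """Pick one endpoint uniformly at random (illustrative only)."""
    return rng.choice(endpoints)


endpoints = ["10.42.0.51:8080", "10.42.0.52:8080", "10.42.0.53:8080"]
rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Simulate 3000 requests and count how many each Pod would have received.
hits = Counter(pick_endpoint(endpoints, rng) for _ in range(3000))
for endpoint in endpoints:
    print(endpoint, hits[endpoint])
```

Over a few requests the distribution looks uneven (just like in the curl output above), but over many requests each Pod ends up with roughly a third of the traffic.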
This is also a good opportunity to point out that only ready Pods will receive traffic, i.e., Pods that Kubernetes considers ready in terms of their readiness checks (that’s the default behavior, anyway, which you can override with the publishNotReadyAddresses property in the Service spec). Since non-ready Pods will not be subject to traffic forwarding, it is important to specify your readiness checks correctly.
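The effect of readiness on traffic forwarding can be sketched as a simple filter. The Pod class and the routable_ips function below are hypothetical names for this illustration, and the readiness values are invented; only the publishNotReadyAddresses behavior is taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    ip: str
    ready: bool


def routable_ips(pods, publish_not_ready_addresses=False):
    """Return the IPs of Pods that may receive traffic.

    By default only ready Pods qualify; with publish_not_ready_addresses
    set, readiness is ignored.
    """
    return [pod.ip for pod in pods if pod.ready or publish_not_ready_addresses]


# Invented example state: the second Pod is failing its readiness check.
pods = [Pod("10.42.0.51", True), Pod("10.42.0.52", False), Pod("10.42.0.53", True)]

print(routable_ips(pods))
print(routable_ips(pods, publish_not_ready_addresses=True))
```

The first call leaves out the Pod that is not ready, the second one includes it.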
Identifying Target Pods
With the Pods running smoothly and a Service available to make queries against them, let’s see how a Service finds its target Pods.
In the previous blog post on labels, we learned that labels are used to identify and find things in Kubernetes while maintaining loose coupling between the components that need to relate to each other, and we briefly touched on the spec.selector property above. Let’s see how the Service’s label selector and the Pods’ labels play together.
What if you wanted to find out the labels of the sample Pods launched earlier?
$ k -n workload-reachability-example get pod --show-labels
NAME                        READY   STATUS    RESTARTS   AGE   LABELS
hello-app-9c76bdcbf-wr5v7   1/1     Running   0          15m   app=hello-app,pod-template-hash=9c76bdcbf
hello-app-9c76bdcbf-vdntn   1/1     Running   0          15m   app=hello-app,pod-template-hash=9c76bdcbf
hello-app-9c76bdcbf-xzmd4   1/1     Running   0          15m   app=hello-app,pod-template-hash=9c76bdcbf
The only user-defined label those Pods carry is app=hello-app (the pod-template-hash label is automatically applied by the Deployment controller). With this information in mind, what if you wanted to query each Pod’s name along with its IP?
$ k -n workload-reachability-example get pod \
    --selector "app=hello-app" \
    --output custom-columns=NAME:.metadata.name,IP:.status.podIP
NAME                        IP
hello-app-9c76bdcbf-wr5v7   10.42.0.53
hello-app-9c76bdcbf-vdntn   10.42.0.51
hello-app-9c76bdcbf-xzmd4   10.42.0.52
You now know all IPs of Pods running the desired workload and could, for example, implement some load-balancing across those IPs.
Although very simple, the above example illustrates the basic principle of how a Service identifies all Pods it should forward traffic to, namely, by means of the aforementioned label selector. A Service’s label selector identifies the logical set of Pods this Service should send traffic to (one could say, then, that a Service constitutes a named label selector).
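The matching rule itself is easy to express in code. Here is a minimal Python sketch of equality-based selector matching as a Service uses it; the function name selector_matches is made up, and the snippet is an illustration of the rule rather than the actual Kubernetes implementation.

```python
def selector_matches(selector, labels):
    """Return True if the Pod's labels contain every key-value pair of the selector.

    Extra labels on the Pod are ignored. A Service without a selector
    selects no Pods, so an empty selector never matches here.
    """
    if not selector:
        return False
    return all(labels.get(key) == value for key, value in selector.items())


selector = {"app": "hello-app"}

# Labels as shown in the --show-labels output above.
pod_labels = {"app": "hello-app", "pod-template-hash": "9c76bdcbf"}
print(selector_matches(selector, pod_labels))

# A Pod whose app label has a different value is not selected.
print(selector_matches(selector, {"app": "something-else"}))
```

Note that the additional pod-template-hash label doesn’t prevent the match: the selector only has to be a subset of the Pod’s labels.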
The Endpoints Object
A Buddy For Each Service, Unless…
Labels are used to identify the Pods the Service object is supposed to send traffic to, but there also needs to be a mechanism to keep track of the Pods’ IPs – in the dynamic world of a Kubernetes cluster, Pods can come and go quickly and frequently, and if that happens, the Service needs to be aware of this changed set of IPs. This is accomplished by means of the so-called Endpoints object.
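Conceptually, the list of addresses in an Endpoints object is recomputed whenever the set of matching, ready Pods changes. The following Python sketch mimics that recomputation under simplified assumptions; build_endpoints is a hypothetical helper, and the replacement Pod IP 10.42.0.54 is invented for this example.

```python
def build_endpoints(selector, port, pods):
    """Return the sorted 'ip:port' addresses of all ready Pods matching the selector."""
    return sorted(
        f"{pod['ip']}:{port}"
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(k) == v for k, v in selector.items())
    )


selector = {"app": "hello-app"}
pods = [
    {"ip": "10.42.0.51", "ready": True, "labels": {"app": "hello-app"}},
    {"ip": "10.42.0.52", "ready": True, "labels": {"app": "hello-app"}},
    {"ip": "10.42.0.53", "ready": True, "labels": {"app": "hello-app"}},
]
before = build_endpoints(selector, 8080, pods)

# One Pod goes away and a replacement with a new (invented) IP comes up.
pods = [pod for pod in pods if pod["ip"] != "10.42.0.52"]
pods.append({"ip": "10.42.0.54", "ready": True, "labels": {"app": "hello-app"}})
after = build_endpoints(selector, 8080, pods)

print(before)
print(after)
```

Clients of the Service never notice any of this, since they keep talking to the same stable name and virtual IP.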
Let’s query all Endpoints in the workload-reachability-example namespace:
$ k -n workload-reachability-example get endpoints
NAME            ENDPOINTS                                         AGE
hello-service   10.42.0.51:8080,10.42.0.52:8080,10.42.0.53:8080   32m
Huh, that’s odd! There is no definition of an Endpoints object in the YAML manifest, so why is there one present in the namespace? Turns out Kubernetes itself creates an accompanying Endpoints object for every Service. Like all good rules, this one, too, has exceptions, though:
- The Service does not define a label selector (in this case there are no Pods that could be automatically selected, thus nothing to be put into the Endpoints object)
- The Service is of type ExternalName (as you’ll see here, the ExternalName type simply maps the Service name to the given DNS name, so again, there are no Pods to be selected and hence nothing to be put into the Endpoints object)
If either of the above applies, no Endpoints object will be created.
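The two exceptions boil down to a small decision rule, sketched here in Python. The function name gets_managed_endpoints and the DNS name backend.example.com are made up for illustration; the rule itself is just the one described above.

```python
def gets_managed_endpoints(service_spec):
    """Return True if an Endpoints object is created automatically for this Service spec."""
    # 'ClusterIP' is the default Service type when none is given.
    if service_spec.get("type", "ClusterIP") == "ExternalName":
        return False
    # Without a label selector there are no Pods that could be selected.
    return bool(service_spec.get("selector"))


# The Service from our example manifest.
print(gets_managed_endpoints({"type": "LoadBalancer", "selector": {"app": "hello-app"}}))

# A Service without a label selector.
print(gets_managed_endpoints({"type": "ClusterIP"}))

# An ExternalName Service pointing to a hypothetical DNS name.
print(gets_managed_endpoints({"type": "ExternalName", "externalName": "backend.example.com"}))
```

Only the first of the three specs results in an automatically created Endpoints object.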
An Endpoints Object Live
In the case of our example, the Endpoints object was automatically created, and it holds three IPs, which makes sense since the Service’s label selector, app: hello-app, matches all three Pods spawned by the Deployment. Let’s play around a bit and see what happens to the list of IPs the Endpoints object holds.
In one terminal, issue a command to watch the hello-service Endpoints object and let it run. It will show all three IPs at first, so no surprises there:
$ k -n workload-reachability-example get endpoints hello-service --watch
NAME            ENDPOINTS                                         AGE
hello-service   10.42.0.60:8080,10.42.0.61:8080,10.42.0.63:8080   4m31s
In another terminal, let’s employ our little endless curl command to view the Pods’ responses when the Service is queried:
$ while true; do curl 10.211.55.6:8080; sleep 2; done
Hello from Pod 10.42.0.63!
Hello from Pod 10.42.0.61!
Hello from Pod 10.42.0.63!
...
Now, go ahead and change the label on one of the Pods from app: hello-app to app: bye-service (for example with k -n workload-reachability-example label pod <pod-name> app=bye-service --overwrite, where <pod-name> is the name of one of your Pods). In your first terminal, you’ll see how the Endpoints object’s list of IPs changes:
$ k -n workload-reachability-example get endpoints hello-service --watch
NAME            ENDPOINTS                                         AGE
hello-service   10.42.0.60:8080,10.42.0.61:8080,10.42.0.63:8080   4m31s
hello-service   10.42.0.61:8080,10.42.0.63:8080                   5m2s
hello-service   10.42.0.61:8080,10.42.0.63:8080,10.42.0.64:8080   5m5s
The first change – the number of IPs decreasing to two – is because the Pod whose labels you just edited does not match the Service’s label selector anymore, so it’s removed from the list of IPs. The second change is something I’d like to leave to you as a little exercise, but here’s a hot hint: The Deployment object finds the Pods it is responsible for via labels, too! (Strictly speaking, it’s a ReplicaSet identifying the Pods and the Deployment identifying the ReplicaSet, but we’ll talk about Deployments and ReplicaSets in a future blog post.)
Once the number of IPs is back up to three, the output in the terminal running the curl command should show responses from the new Pod:
Hello from Pod 10.42.0.61!
Hello from Pod 10.42.0.61!
Hello from Pod 10.42.0.64!
Hello from Pod 10.42.0.63!
Hello from Pod 10.42.0.61!
Hello from Pod 10.42.0.63!
Hello from Pod 10.42.0.61!
Hello from Pod 10.42.0.61!
Cleaning Up
You can delete all example objects used in the scope of this blog post by issuing the following command:
$ k delete -f https://raw.githubusercontent.com/AntsInMyEy3sJohnson/blog-examples/master/kubernetes/workload-reachability/simple-deployment-with-service.yaml
Services provide powerful decoupling between a set of Pods constituting a workload and those wishing to consume it. This is possible because a Service receives a virtual, thus stable IP, the cluster IP, along with a DNS name (the Service name itself) pointing to it (headless Services excluded). Thus, clients can reach workload Pods by using the Service name, and the Service will then load-balance traffic across all Pods. The availability of a stable IP and a name to refer to it is tremendously important in Kubernetes – as a dynamic system treating (stateless) Pods as ephemeral units, a Pod’s IP should not be relied upon for communication.
Services identify the Pods they should send traffic to by means of a label selector. For a Pod to be included in the Service’s load balancing, its labels must include every key-value pair of the Service’s selector (additional labels on the Pod, such as pod-template-hash, don’t matter). The Service will load-balance traffic to all matching Pods that are ready, i.e. that pass their readiness checks. Pods that are not ready will not receive traffic.
The way Kubernetes keeps track of Pod IPs is by means of an Endpoints object it automatically creates for each Service (unless there are no Pods whose IPs could be put into the object, which can be the case for Services without label selectors and Services of type ExternalName). The Endpoints object stores the IPs of all (ready) Pods matching the Service’s label selector, and traffic will be sent to those IPs.
There are different types of Services for different use cases, and a separate blog post introduces you to those different types of Services and when to use which.