🌐 Basic Networking
Pods are both ephemeral and "mortal"; they should be considered effectively transient. Kubernetes can terminate and reschedule pods for a whole range of reasons, including rolling updates, hitting resource limits, scaling up and down, and other cluster operations. Because Pods are transient, you cannot build a reliable architecture by addressing Pods directly (e.g. by name or IP address).
Kubernetes solves this with Services, which act as a network abstraction over a group of pods, and have their own independent and more stable life cycle. We can use them to greatly improve what we've deployed.
🧩 Deploy PostgreSQL Service
Now let's put a Service in front of the PostgreSQL pod. If you want to create the service YAML yourself, you can refer to the Kubernetes docs:
- The type of Service should be ClusterIP, which means it's internal to the cluster only.
- The service port should be 5432.
- The target port should be 5432.
- Selector decides what pods are behind the service, in this case use the label app and the value postgres. It is crucial that this matches the label on the pods, otherwise the service will not be able to route traffic to them.
Labels are optional metadata that can be added to any object in Kubernetes; they are simply key-value pairs. Labels can be used to organize and to select subsets of objects. The label "app" is commonly used, but it has no special meaning and Kubernetes does not treat it specially in any way.
Save your YAML into a file postgres-service.yaml or use the YAML manifest below for the service:
kind: Service
apiVersion: v1
metadata:
  # We purposefully pick a different name for the service from the deployment
  name: database
spec:
  type: ClusterIP
  selector:
    app: postgres
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432
Apply it to the cluster as before:
kubectl apply -f postgres-service.yaml
You can use kubectl to examine the status of the Service just like you can with Pods and Deployments:
# Get all services
kubectl get svc
# Get details of a single service
kubectl describe svc {service-name}
You might spot a service called 'kubernetes', which exists in the default namespace and is placed there automatically. Despite sounding very important, you can completely ignore it.
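If a Service does not seem to route traffic, a quick check is whether its selector actually matches any pods. The sketch below wraps two standard kubectl queries in a small shell function. The function name show_service_backends is a hypothetical helper, not part of kubectl, and the usage line assumes the database Service and the app=postgres label from the manifest above.

```shell
# Hypothetical helper: show which pods sit behind a Service.
#   $1 = Service name, $2 = label selector (key=value)
show_service_backends() {
  # Pods whose labels match the selector, with their labels and IPs
  kubectl get pods -l "$2" -o wide --show-labels
  # Pod IPs and ports the Service currently forwards to;
  # <none> under ENDPOINTS means the selector matched no ready pods
  kubectl get endpoints "$1"
}
```

For the PostgreSQL Service this would be called as: show_service_backends database app=postgres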
🛑 EXTRA NOTE: As mentioned in part 3, there are varied opinions on running databases inside Kubernetes. As a general rule, it is advisable to use a hosted cloud database service which resides outside your cluster and can be managed independently. However, we will continue with PostgreSQL running in the cluster in the interests of learning.
📡 Connect the API to the PostgreSQL Service
Now that we have a Service in our cluster for PostgreSQL, we can access the database using DNS rather than an IP address, and if the pod(s) die, restart or move, this name remains constant. DNS with Kubernetes is a complex topic we won't get into here; the main takeaways for now are:
- Every Service in the cluster can be resolved over DNS.
- Within a Namespace, the Service name will resolve as a simple hostname, without the need for a DNS suffix.
However, if you want to be more explicit you can use the full DNS name which is:
{service-name}.{namespace}.svc.cluster.local
📚 Kubernetes Docs: DNS for Services and Pods
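The naming scheme above can be sketched as a small shell function. service_fqdn is a hypothetical helper name, and falling back to the default namespace when none is given is an assumption made purely for illustration.

```shell
# Hypothetical helper: build the full in-cluster DNS name of a Service.
#   $1 = Service name, $2 = namespace (assumed to be "default" if omitted)
service_fqdn() {
  printf '%s.%s.svc.cluster.local\n' "$1" "${2:-default}"
}

service_fqdn database   # prints database.default.svc.cluster.local

# To try resolution from inside the cluster, a throwaway pod can be used.
# The pod name and image below are assumptions, not part of the workshop:
#   kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- nslookup database
```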
Edit the api-deployment.yaml file you created previously and change the value of the POSTGRES_DSN environment variable. Replace the IP address with the name of the service, i.e. the connection string should start with host=database.
You can update the active deployment with these changes by re-running kubectl apply -f api-deployment.yaml. Kubernetes will perform a rolling update; if you are quick and run kubectl get pods you might see it taking place, i.e. a new pod starting and the old one terminating. Again, you can check the status and the logs using kubectl.
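As an alternative to repeatedly running kubectl get pods, the rollout can be followed with kubectl rollout status, which blocks until the update has finished. A minimal sketch: apply_and_wait is a hypothetical helper name, the 120 second timeout is an arbitrary choice, and the Deployment name is passed as an argument because it is whatever you set in api-deployment.yaml.

```shell
# Hypothetical helper: apply a manifest, then wait for the rollout to finish.
#   $1 = manifest file, $2 = Deployment name (as set in that manifest)
apply_and_wait() {
  kubectl apply -f "$1" &&
    # Blocks until the new pods are ready, or fails once the timeout expires
    kubectl rollout status "deployment/$2" --timeout=120s
}
```

It would be called as: apply_and_wait api-deployment.yaml {deployment-name}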
🌍 Expose the Data API externally
We can create a different type of Service in front of the data API, in order to expose it outside of the cluster and also to the internet. To do this, use a Service with the type LoadBalancer. This will be picked up by Azure, which assigns a public IP and routes traffic through an Azure Load Balancer in front of the cluster. How this happens is well outside of the scope of this workshop. Once again the selector is very important: it should match the label on the pods, which in this case is app: nanomon-api. Without this, the service cannot know which pods it should route traffic to.
We can also change the port at the Service level, so the port exposed by the Service doesn't need to match the one that the container is listening on. In this case we'll re-map the port to 80.
Save your YAML into a file api-service.yaml, either writing it yourself from the description above or using the manifest below:
kind: Service
apiVersion: v1
metadata:
  name: api
spec:
  type: LoadBalancer
  selector:
    app: nanomon-api
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
Apply it to the cluster with the now familiar kubectl apply command:
kubectl apply -f api-service.yaml
Using kubectl get svc, check the status and wait for the external IP to be assigned, which might take a minute or two. Then go to the address http://{EXTERNAL_IP}/api/status in your browser and you should get the same JSON response as before.
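The external IP can also be read from the command line instead of from the kubectl get svc table. This is a sketch that assumes the Service is named api as in the manifest above and that the load balancer reports an IP address (some providers report a hostname instead). api_status_url is a hypothetical helper name.

```shell
# Hypothetical helper: print the status URL of the externally exposed API.
#   $1 = Service name
api_status_url() {
  # The jsonpath query returns nothing until the load balancer is provisioned
  ip="$(kubectl get svc "$1" -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
  if [ -z "$ip" ]; then
    echo "external IP not assigned yet" >&2
    return 1
  fi
  echo "http://$ip/api/status"
}
```

Once the IP has been assigned, the endpoint could then be checked with: curl "$(api_status_url api)"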
Clearly this is better than what we had before, but in production you would never expose plain HTTP traffic directly into your pods like this! Later we can improve this yet further, but for now it will suffice.
🖼️ Cluster & Architecture Diagram
The resources deployed into the cluster & in Azure at this stage can be visualized as follows:
As you can see the API Service is exposed externally, and routes traffic to the two API Pods. The API Pods in turn access the PostgreSQL Service which exposes traffic to the single PostgreSQL Pod using a stable DNS name.