Monitoring is a key discipline for everyone running Kubernetes clusters in production or similar environments. VMware PKS delivers out-of-the-box integrations for logging, application monitoring, and infrastructure monitoring to satisfy the requirements of the different personas working with the platform.
The requirements of operations teams or Platform Reliability Engineers tend to be more infrastructure-oriented. Areas such as capacity management, performance management, and health monitoring of all components in a service chain are highly important to ensure OLAs/SLAs are met.
In this blog post, I want to demonstrate how to leverage the vRealize Operations Manager (vROPs) integration to monitor PKS-managed K8s clusters.
First of all, we need to make sure we have everything in place to establish the integration between the different components. In the following scenario, we have a PKS 1.3 managed K8s cluster and vRealize Operations Manager version 7.0 already up and running.
```
➜ ~ pks cluster k8scl01

Name:                     k8scl01
Plan Name:                small
UUID:                     557cffde-3647-4267-a50f-fa3e09a39608
Last Action:              CREATE
Last Action State:        succeeded
Last Action Description:  Instance provisioning completed
Kubernetes Master Host:   pkscl01.aulab.local
Kubernetes Master Port:   8443
Worker Nodes:             2
Kubernetes Master IP(s):  172.16.10.1
Network Profile Name:
```
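The Master Host and Master Port shown in this output are what the adapter's Master URL is built from later on. As a rough sketch, the values could be pulled out of the CLI output like this. The helper names `parse_pks_cluster` and `master_url` are made up for illustration, and the `https://host:port` form is an assumption about what the adapter expects.

```python
def parse_pks_cluster(output):
    """Turn the 'Key: Value' lines of `pks cluster <name>` into a dict."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def master_url(fields):
    """Build the master URL (assumed form: https://<host>:<port>)."""
    host = fields["Kubernetes Master Host"]
    port = fields["Kubernetes Master Port"]
    return f"https://{host}:{port}"


# Sample taken from the output shown in this post.
SAMPLE = """\
Name:                     k8scl01
Kubernetes Master Host:   pkscl01.aulab.local
Kubernetes Master Port:   8443
Network Profile Name:
"""

if __name__ == "__main__":
    print(master_url(parse_pks_cluster(SAMPLE)))
```

Splitting on the first colon keeps the parser tolerant of the column alignment in the CLI output and of empty values such as the network profile name.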
To monitor our K8s cluster, we need to download and install the “vRealize Operations Management Pack for Container Monitoring” from VMware’s Solution Exchange. Have a look at the Technical Specifications to ensure you have the right vROPs version (6.6.x and above) as well as the right VMware PKS version in place (1.1 and above).
Deploy cAdvisor DaemonSet
As a prerequisite of the Management Pack for Container Monitoring, we need to deploy cAdvisor as a DaemonSet on our Kubernetes cluster. The instructions and the necessary YAML definition can be found in the User Guide.
Simply copy the following code or the content from the User Guide into a yaml file (e.g. vrops-cadvisor.yaml).
```yaml
apiVersion: apps/v1beta2 # apps/v1beta2 in Kube 1.8, extensions/v1beta1 in Kube < 1.8
kind: DaemonSet
metadata:
  name: vrops-cadvisor
  namespace: kube-system
  labels:
    app: vrops-cadvisor
spec:
  selector:
    matchLabels:
      name: vrops-cadvisor
  template:
    metadata:
      labels:
        name: vrops-cadvisor
        version: v0.31.0
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      hostNetwork: true
      containers:
      - name: vrops-cadvisor
        image: google/cadvisor:v0.31.0
        imagePullPolicy: Always
        volumeMounts:
        - name: rootfs
          mountPath: /rootfs
          readOnly: true
        - name: var-run
          mountPath: /var/run
          readOnly: false
        - name: sys
          mountPath: /sys
          readOnly: true
        - name: docker
          mountPath: /var/lib/docker # Mounting Docker volume
          readOnly: true
        - name: docker-sock
          mountPath: /var/run/docker.sock
          readOnly: true
        - name: containerd-sock
          mountPath: /var/run/containerd.sock
          readOnly: true
        - name: disk
          mountPath: /dev/disk
          readOnly: true
        ports:
        - name: http
          containerPort: 31194 # Port exposed
          hostPort: 31194 # Host's port - Port to expose your cAdvisor DaemonSet on each node
          protocol: TCP
        securityContext:
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
        args:
        - --port=31194
        - --profiling
        - --housekeeping_interval=1s
      terminationGracePeriodSeconds: 30
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/vcap/store/docker/docker # Docker path in Host System
      - name: docker-sock
        hostPath:
          path: /var/vcap/sys/run/docker/docker.sock
      - name: containerd-sock
        hostPath:
          path: /var/run/docker/containerd/docker-containerd.sock
      - name: disk
        hostPath:
          path: /dev/disk
```
Create the DaemonSet with “kubectl create -f vrops-cadvisor.yaml”.
```
➜ ~ kubectl create -f vrops-cadvisor.yaml
daemonset.apps/vrops-cadvisor created

➜ ~ kubectl get pods --all-namespaces
NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE
default       redis-server-77b4d88467-wc956           1/1     Running   0          24h
default       yelb-appserver-58db84c875-bgncm         1/1     Running   0          24h
default       yelb-db-69b5c4dc8b-zvhl2                1/1     Running   0          24h
default       yelb-ui-6b5d855894-v985g                1/1     Running   0          24h
kube-system   heapster-85647cf566-tnzkd               1/1     Running   0          3d16h
kube-system   kube-dns-7559c96fc4-lkw2n               3/3     Running   0          3d16h
kube-system   kubernetes-dashboard-5f4b59b97f-6dmpt   1/1     Running   0          3d16h
kube-system   metrics-server-555d98886f-rtfc9         1/1     Running   0          3d16h
kube-system   monitoring-influxdb-cdcf4674-27ndm      1/1     Running   0          3d16h
kube-system   vrops-cadvisor-d4dnm                    1/1     Running   0          7s
kube-system   vrops-cadvisor-p622f                    1/1     Running   0          7s
pks-system    event-controller-6c77ddd949-cszwv       2/2     Running   1          3d16h
pks-system    fluent-bit-88cxx                        2/2     Running   0          3d16h
pks-system    fluent-bit-p8qf9                        2/2     Running   0          3d16h
pks-system    sink-controller-65595c498b-gr8x4        1/1     Running   0          3d16h
pks-system    telemetry-agent-559f9c8855-6p2gr        1/1     Running   0          3d16h
```
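With two worker nodes we expect two vrops-cadvisor pods, both in the Running state. If you would rather check this from a script than by eye, a minimal sketch could look like the following. It assumes you feed it the JSON produced by “kubectl get pods -n kube-system -o json”; the function names are hypothetical, and the sample data is shortened for illustration.

```python
import json


def cadvisor_pod_status(pod_list_json, prefix="vrops-cadvisor"):
    """Map each matching pod name to a (node name, phase) tuple."""
    pods = json.loads(pod_list_json)["items"]
    return {
        pod["metadata"]["name"]: (
            pod["spec"].get("nodeName"),
            pod["status"].get("phase"),
        )
        for pod in pods
        if pod["metadata"]["name"].startswith(prefix)
    }


def all_running(status):
    """True only if at least one pod matched and every one is Running."""
    return bool(status) and all(phase == "Running" for _, phase in status.values())


# Shortened, made-up sample in the shape of a Kubernetes pod list.
SAMPLE = json.dumps({
    "items": [
        {"metadata": {"name": "vrops-cadvisor-d4dnm"},
         "spec": {"nodeName": "node-a"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "vrops-cadvisor-p622f"},
         "spec": {"nodeName": "node-b"},
         "status": {"phase": "Running"}},
        {"metadata": {"name": "kube-dns-7559c96fc4-lkw2n"},
         "spec": {"nodeName": "node-a"},
         "status": {"phase": "Running"}},
    ]
})

if __name__ == "__main__":
    status = cadvisor_pod_status(SAMPLE)
    print(status, all_running(status))
```

Comparing the number of matched pods with the number of nodes is a simple way to confirm that the DaemonSet has scheduled one pod per node.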
To verify the functionality of cAdvisor as part of our Kubernetes cluster, connect to “http://node_ip:31194/containers/” where node_ip is the IP address of your Kubernetes node. Make sure that you can access the cAdvisor webpage and that metrics data is coming in.
Additionally, check if you can access information on the Docker containers via “http://node_ip:31194/docker/”. If the connection to the Docker daemon is working, you should see a list of containers.
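The two checks above can also be scripted, which is handy once you have more than a couple of nodes. The following is a minimal sketch using only the Python standard library. The port and the two paths come from this post; the helper names and the example node IP are placeholders.

```python
import urllib.error
import urllib.request

CADVISOR_PORT = 31194  # hostPort from the DaemonSet definition
PATHS = ("/containers/", "/docker/")  # the two pages checked in this post


def cadvisor_urls(node_ip, port=CADVISOR_PORT):
    """Return the cAdvisor URLs to probe on a single node."""
    return [f"http://{node_ip}:{port}{path}" for path in PATHS]


def is_reachable(url, timeout=5.0):
    """True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


def check_node(node_ip):
    """Probe every cAdvisor URL on a node and report the result per URL."""
    return {url: is_reachable(url) for url in cadvisor_urls(node_ip)}
```

Calling `check_node("10.0.0.5")` (with your own node IP instead of this placeholder) returns a dictionary of URL to True/False. A False for the “/docker/” URL only tells you the page did not load; whether the Docker socket mount is the cause still needs a look at the pod logs.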
Install Management Pack
Now that we have cAdvisor running, let’s install the vRealize Operations Management Pack for Container Monitoring. Log in to vRealize Operations Manager with Admin permissions and go to the “Administration” tab. Click the green + icon under “Solutions” to start the installation wizard.
Select the PAK file that we have downloaded from VMware’s Solution Exchange and click “UPLOAD”.
Follow the self-explanatory wizard until the installation is completed.
Configure Management Pack
As a next step, we need to configure an “Adapter Instance” of the installed Management Pack. Click on the little gear icon to open the configuration page. We can configure multiple adapter instances, one per Kubernetes cluster if required.
To add another adapter instance, click on the green + icon and specify a display name. Enter the Master URL of your cluster, select DaemonSet as cAdvisor Service and specify the cAdvisor port from the yaml definition that we have used earlier to create the cAdvisor DaemonSet (31194 in our case). Before we can test the connection, we need to add valid credentials for our Kubernetes cluster. Click on the green + icon next to the credential field.
In this scenario, we are simply going to use the token from our local kubectl config file. The config file can be found under “$HOME/.kube/”, or you can run “kubectl config view” while on the correct kubectl config context. Alternatively, you can choose Basic or Client Certificate Authentication.
Specify the credential type, a display name, the Bearer Token value and click “OK”.
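If you prefer not to copy the token by hand, it can be extracted from the JSON form of the kubectl config. The sketch below assumes the input comes from “kubectl config view --raw -o json” (newer kubectl versions redact secrets unless “--raw” is given) and that the user of the current context authenticates with a token. The function name and the sample values are invented for illustration.

```python
import json


def current_user_token(kubeconfig_json):
    """Return the bearer token of the user bound to the current context.

    Returns None if that user has no 'token' entry, for example when
    client certificates are used instead.
    """
    cfg = json.loads(kubeconfig_json)
    current = cfg["current-context"]
    # Resolve current context -> user name -> user entry.
    context = next(c["context"] for c in cfg["contexts"] if c["name"] == current)
    user = next(u["user"] for u in cfg["users"] if u["name"] == context["user"])
    return user.get("token")


# Made-up kubeconfig in JSON form; the token is a placeholder.
SAMPLE = json.dumps({
    "current-context": "k8scl01",
    "contexts": [
        {"name": "k8scl01", "context": {"cluster": "k8scl01", "user": "admin-user"}},
    ],
    "users": [
        {"name": "admin-user", "user": {"token": "example-token-value"}},
    ],
})

if __name__ == "__main__":
    print(current_user_token(SAMPLE))
```

Treat the token like a password: avoid pasting it into shared terminals or committing it to version control.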
Let’s see if the connection is working: click on “TEST CONNECTION”. If the connection was established successfully, you should see a success message.
Save the adapter instance settings by clicking on “SAVE SETTINGS”.
Done, we have successfully integrated vRealize Operations Manager with our VMware PKS managed Kubernetes cluster. We can now have a look at the Kubernetes related information available in vROPs.
The “Kubernetes Overview” dashboard is now available under the “Dashboards” tab. The dashboard shows a lot of useful information about your K8s clusters, nodes, pods, and containers.
First, select the K8s cluster you want to view under point 1. Immediately, you will see a lot of useful information, such as a summary of the K8s cluster objects (nodes, namespaces, pods, containers, …) and the corresponding health status. Widget 3 shows all active alerts and next to it, you can see a health map of related objects.
In my lab, we can see a memory usage alert on both K8s nodes, and therefore the health status is degraded. By clicking on one of the nodes within widget 5, we will get more useful details. Section 7 shows a health map with the pods running on the selected node. Widget 8 shows that the “Memory Usage (%)” metric has been trending quite high for some time. Right next to it, we can select the node metrics we want to add to the metric chart.
Solving such a resource problem is a very easy task with VMware PKS. We can simply scale the Kubernetes cluster and add additional nodes to it. If you want to learn more about scaling a K8s cluster with PKS, have a look at my blog post “VMware PKS 1.3 Scale K8s Clusters”.
Pod & Container Widgets
Further down, we can find information about pods and containers. Select the pod to inspect under widget 11. We will see the related containers and the health status next to it. Widget 13 shows trend lines for Pod metrics and within window 14, we can select metrics to be added to the metric chart.
This is a lot of very useful information in just one dashboard! It helps Operators to quickly identify the root cause and to speed up the troubleshooting process. Additionally, it will monitor the environment and create alerts based on certain Symptoms, see screenshot.
You should also check out the video from VMware’s Michael West on monitoring Kubernetes clusters with vRealize Operations Manager.
Implementing the PKS/vROPs integration via the vRealize Operations Management Pack for Container Monitoring is quite easy if you follow the instructions in this blog post or within the User Guide. I expect that to become more automated in future VMware PKS versions.
The vRealize Operations Management Pack for Container Monitoring allows operations teams and Platform Reliability Engineers to monitor and quickly analyze their K8s clusters. Health-related alerts, in combination with the relationship views of native K8s objects, help to reduce the MTTR (mean time to repair) and to keep the environment in a healthy state. Additionally, vROPs can be used to plan and manage the capacity of the underlying infrastructure resources.
If you want to learn more about PKS 1.3, please have a look at my VMware PKS 1.3 What’s New blog post.