VMware PKS 1.3 Scale K8s Clusters
Since the release of VMware PKS 1.3, we can scale Kubernetes clusters both up and down. Scaling up has been possible since the first PKS release, but scaling down is a new capability in PKS 1.3. This feature gives Platform Reliability Engineers the elasticity and flexibility they need to manage the infrastructure capacity of their Kubernetes clusters. In this post, I want to describe the process of scaling a PKS-managed Kubernetes cluster.
Toolset
vSphere HTML5 Client
Before we start, let’s make sure we have the right tools available to execute and monitor the scaling process. First of all, we should open the vSphere HTML5 Client to see the VMs that form our PKS Kubernetes cluster.
In my case, I have a Kubernetes cluster with 2 worker nodes deployed, and I want to scale it down to 1. We can check the role of each VM by looking at the Custom Attributes “instance_group” and “job”.
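If kubectl is already pointed at this cluster (the kubectl setup is covered further below), the worker node count can also be confirmed from the Kubernetes side. A minimal check:

kubectl get nodes -o wide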
BOSH CLI
The next tool we should have available and ready to use is bosh-cli. I am a Mac user, so I decided to use Homebrew to install the latest bosh-cli version.
➜ ~ brew install cloudfoundry/tap/bosh-cli
...
==> Tapping cloudfoundry/tap
Cloning into '/usr/local/Homebrew/Library/Taps/cloudfoundry/homebrew-tap'...
remote: Enumerating objects: 14, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 14 (delta 1), reused 6 (delta 0), pack-reused 0
Unpacking objects: 100% (14/14), done.
Tapped 7 formulae (49 files, 55KB).
==> Installing bosh-cli from cloudfoundry/tap
==> Downloading https://s3.amazonaws.com/bosh-cli-artifacts/bosh-cli-5.4.0-darwin-amd64
######################################################################## 100.0%
==> Caveats
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d
==> Summary
🍺  /usr/local/Cellar/bosh-cli/5.4.0: 4 files, 27.0MB, built in 14 seconds
➜ ~ bosh -v
version 5.4.0-891ff634-2018-11-14T00:21:14Z
But you can also simply download the right version for your OS from GitHub here and follow the instructions here, or use the bosh-cli version that is already installed on the Cloud Foundry Ops Manager. To SSH into the Ops Manager VM, use the “ubuntu” user and the password that you specified during the OVA deployment.
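As a small, hedged example, assuming the Ops Manager VM is reachable at opsman.example.local (replace this with your own FQDN or IP address):

ssh ubuntu@opsman.example.local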
To log in to your BOSH Director, we first need to copy the root_ca_certificate to the workstation/client from which we want to use the bosh-cli. We can download the certificate from the Ops Manager UI under Settings/Advanced. If you want to execute the bosh-cli from the Ops Manager VM itself, you can find the certificate at “/var/tempest/workspaces/default/root_ca_certificate”.
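A minimal sketch of copying the certificate to a workstation, again assuming the hypothetical Ops Manager address opsman.example.local and the “ubuntu” user mentioned above:

scp ubuntu@opsman.example.local:/var/tempest/workspaces/default/root_ca_certificate ~/root_ca_certificate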
As a next step, we need to get the necessary bosh command line credentials via the Ops Manager UI, see screenshot.
You should see an output like this after clicking on “Link to Credential”:
{"credential":"BOSH_CLIENT=ops_manager BOSH_CLIENT_SECRET=ZYxWvutsRqPoNmlKjIhgFeDcBA BOSH_CA_CERT=/var/tempest/workspaces/default/root_ca_certificate BOSH_ENVIRONMENT=192.168.96.1 bosh "}
I recommend setting some environment variables to avoid having to specify everything on every command execution. Simply create a file and copy/format the collected content in the following way. If you are executing the bosh-cli from a client and not from the Ops Manager itself, make sure you change the BOSH_CA_CERT path to the location of the downloaded certificate.
export BOSH_CLIENT_SECRET=ZYxWvutsRqPoNmlKjIhgFeDcBA
export BOSH_CLIENT=ops_manager
export BOSH_ENVIRONMENT=192.168.96.1
export BOSH_CA_CERT=/home/aullah/root_ca_certificate
Save the file and “source” it whenever you need to execute bosh-cli commands against the environment. Alternatively, you can add the content to your bash profile so it is available in every new shell session.
Execute the “bosh vms” command to see if it’s working. If everything is configured correctly, you should see an output like this. Find your Kubernetes deployment and make a note of your deployment ID, see screenshot. We will need the deployment ID later to monitor the scaling process.
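For example, assuming the exports were saved to a file called bosh-env (the filename is arbitrary), a quick verification could look like this:

source ~/bosh-env
bosh vms
bosh deployments

“bosh deployments” is an additional way to list just the deployment names if the “bosh vms” output gets long; PKS-provisioned clusters typically show up as deployments named service-instance_<UUID>.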
PKS CLI and Kubectl
To download and install the PKS CLI, simply follow the instructions here. Rename the downloaded PKS CLI file to “pks”, make it executable, and move it to the /usr/local/bin/ folder.
➜ ~ mv pks-darwin-amd64-1.3.0-build.126 pks
➜ ~ chmod +x pks
➜ ~ mv pks /usr/local/bin/pks
➜ ~ pks --version
PKS CLI version: 1.3.0-build.126
Login to PKS with your username and password.
pks login -a <API> -u <USERNAME> -p <PASSWORD> -k
In addition, we obviously need kubectl to monitor the scheduling and restarts of our Pods during the scale operation. Kubernetes does not automatically rebalance Pods during normal operations; Pods are only placed at creation time or when they get killed and need to be rescheduled. That’s why we need to keep an eye on the scale-down operation, as Pods will be killed and recreated on the remaining worker nodes. Here you can find more information on how to install kubectl.
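Keep in mind that kubectl first needs credentials and a context for the cluster. With the PKS CLI, this can be done via “pks get-credentials”, which populates your kubeconfig; a short example, using the cluster name from this post:

pks get-credentials k8s-cluster-01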
Now that we have the vSphere HTML5 Client open, the bosh-cli configured, PKS CLI logged in, and kubectl ready, we can start the scaling operation.
Scaling
Run the “pks cluster <clustername>” command to get some information about the cluster you want to scale down.
➜ ~ pks cluster k8s-cluster-01

Name:                     k8s-cluster-01
Plan Name:                small
UUID:                     d0bb926c-86ab-492a-b4b4-ba0824a8a49f
Last Action:              UPDATE
Last Action State:        succeeded
Last Action Description:  Instance update completed
Kubernetes Master Host:   pks-cluster-01
Kubernetes Master Port:   8443
Worker Nodes:             2
Kubernetes Master IP(s):  172.16.10.1
Network Profile Name:
In parallel, use kubectl to monitor the running Pods of your Kubernetes cluster. Execute the following command to watch your Pods and the corresponding worker nodes.
➜ ~ kubectl get pods -o wide --watch
NAME                              READY   STATUS    RESTARTS   AGE     IP             NODE                                   NOMINATED NODE
nginx-85474d599b-clmtm            1/1     Running   0          8d      10.200.33.14   d9daf54a-06ac-4456-996e-970d42971141   <none>
nginx-85474d599b-w65tf            1/1     Running   0          8d      10.200.33.9    d9daf54a-06ac-4456-996e-970d42971141   <none>
redis-bb7894d65-xhrvm             1/1     Running   0          8d      10.200.33.6    d9daf54a-06ac-4456-996e-970d42971141   <none>
redis-bb7894d65-xtl7d             1/1     Running   0          8d      10.200.33.8    d9daf54a-06ac-4456-996e-970d42971141   <none>
redis-server-77b4d88467-j4glb     1/1     Running   0          6m31s   10.200.96.7    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
yelb-appserver-58db84c875-gldl9   1/1     Running   0          6m31s   10.200.96.9    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
yelb-db-69b5c4dc8b-78k5p          1/1     Running   0          6m31s   10.200.96.8    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
yelb-ui-6b5d855894-4xjmp          1/1     Running   0          6m31s   10.200.96.6    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
As a next step, simply execute the “pks resize <clustername> -n x” command with a node count lower or higher than the existing one. In my case, I want to scale down from 2 worker nodes to 1.
➜ ~ pks resize k8s-cluster-01 -n 1

Are you sure you want to resize cluster k8s-cluster-01 to 1? (y/n): y
Use 'pks cluster k8s-cluster-01' to monitor the state of your cluster
To monitor the progress, it is advisable not to rely only on the “pks cluster” command shown in the output; instead, use the BOSH CLI and execute the following commands.
Previously, we made a note of our deployment ID. We can now execute “bosh tasks -d <deployment_ID>” to see which task is currently being executed, followed by “bosh task <task_number>” to get more details about the progress. Alternatively, you can execute “bosh tasks -r” to get a list of recent tasks or “bosh task -a” to tail the latest task.
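As a hedged example of this monitoring loop, assuming the BOSH deployment for the cluster is named service-instance_d0bb926c-86ab-492a-b4b4-ba0824a8a49f (PKS typically derives the deployment name from the cluster UUID shown earlier) and the resize kicked off task 123 (a placeholder task number):

bosh tasks -d service-instance_d0bb926c-86ab-492a-b4b4-ba0824a8a49f
bosh task 123
bosh tasks -r
bosh task -a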
Here you can find some more examples of how to use the BOSH CLI from Denny Zhang.
In vCenter, you will see that the worker node VM was deleted and removed from disk.
In the meantime, the kubectl command should show you a few “Terminating”, “Pending” and “ContainerCreating” states. That is expected, as we deleted a worker node and Kubernetes needs to reschedule the Pods on the remaining worker node. What matters is that all Pods are in a “Running” state at the end.
yelb-appserver-58db84c875-9jklw   0/1     ContainerCreating   0          3s      <none>         d9daf54a-06ac-4456-996e-970d42971141   <none>
yelb-db-69b5c4dc8b-78k5p          1/1     Terminating         0          10m     10.200.96.8    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
yelb-db-69b5c4dc8b-78k5p          1/1     Terminating         0          10m     10.200.96.8    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
yelb-ui-6b5d855894-4xjmp          1/1     Terminating         0          10m     10.200.96.6    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
yelb-ui-6b5d855894-4xjmp          1/1     Terminating         0          10m     10.200.96.6    7f3298b9-a803-4b75-9e14-a67ad2cd1d28   <none>
yelb-db-69b5c4dc8b-kczqx          1/1     Running             0          14s     10.200.33.30   d9daf54a-06ac-4456-996e-970d42971141   <none>
yelb-appserver-58db84c875-9jklw   1/1     Running             0          15s     10.200.33.31   d9daf54a-06ac-4456-996e-970d42971141   <none>
yelb-ui-6b5d855894-hjxwm          1/1     Running             0          15s     10.200.33.32   d9daf54a-06ac-4456-996e-970d42971141   <none>
redis-server-77b4d88467-xfp9t     1/1     Running             0          15s     10.200.33.34   d9daf54a-06ac-4456-996e-970d42971141   <none>

➜ ~ kubectl get pods -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP             NODE                                   NOMINATED NODE
nginx-85474d599b-clmtm            1/1     Running   0          8d    10.200.33.14   d9daf54a-06ac-4456-996e-970d42971141   <none>
nginx-85474d599b-w65tf            1/1     Running   0          8d    10.200.33.9    d9daf54a-06ac-4456-996e-970d42971141   <none>
redis-bb7894d65-xhrvm             1/1     Running   0          8d    10.200.33.6    d9daf54a-06ac-4456-996e-970d42971141   <none>
redis-bb7894d65-xtl7d             1/1     Running   0          8d    10.200.33.8    d9daf54a-06ac-4456-996e-970d42971141   <none>
redis-server-77b4d88467-xfp9t     1/1     Running   0          15m   10.200.33.34   d9daf54a-06ac-4456-996e-970d42971141   <none>
yelb-appserver-58db84c875-9jklw   1/1     Running   0          15m   10.200.33.31   d9daf54a-06ac-4456-996e-970d42971141   <none>
yelb-db-69b5c4dc8b-kczqx          1/1     Running   0          15m   10.200.33.30   d9daf54a-06ac-4456-996e-970d42971141   <none>
yelb-ui-6b5d855894-hjxwm          1/1     Running   0          15m   10.200.33.32   d9daf54a-06ac-4456-996e-970d42971141   <none>
Done: the scale-down operation finished successfully, and I have freed up some infrastructure resources. The same process can be used to scale up and add additional worker nodes to a PKS-managed Kubernetes cluster, as shown below.
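For example, scaling the same cluster out to three worker nodes would only require changing the node count (the value 3 is just an example):

pks resize k8s-cluster-01 -n 3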
Conclusion
Scaling PKS-managed Kubernetes clusters is a very important capability, as it allows for efficient resource utilization by giving back unused resources or quickly expanding when resource demand grows.
The scaling operation itself is very easy to execute and to monitor if you have the right toolset in place.
Since VMware PKS 1.3, Platform Reliability Engineers can also make use of the scale-down capability and manage the infrastructure capacity of their Kubernetes clusters more efficiently.
If you want to learn more about PKS 1.3, have a look at my VMware PKS 1.3 What’s New blog post.
Additional Sources
- Scaling Existing Clusters
- Managing PKS deployments with BOSH
- BOSH CLI on GitHub
- Install PKS CLI documentation
- BOSH CLI cheat sheet from Denny Zhang
- Homebrew package manager for Mac