Kubernetes storage on vSphere through TKGI VMware Cloud Provider

Kristof Van Sever, senior DevOps evangelist at Galagio, shares his insights about Kubernetes storage on vSphere through TKGI VMware Cloud Provider.


Kubernetes enables us to store files in Persistent Volumes so that they survive container restarts. In this blog post we look at how VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) enables us to use the vSphere volume provisioner to back these Persistent Volumes with VMDKs.

When using Kubernetes as a platform, workloads run in containers on a container runtime. Kubernetes manages and orchestrates these containers through the concept of Pods, which can be seen as the deployable unit of computation. They typically include a single application container, along with possible helper containers that enable you to run your workloads on a platform built on Kubernetes. These Pods should be considered transient and disposable. They are not like pets that we keep around, but more like worker ants: they are created to do their job and destroyed when their services are no longer necessary, for example because an application has been scaled down again.

This brings us to an interesting question: how do we deal with storing data? If we store data inside our Pod's container, that data is destroyed when Kubernetes decides the application should scale down and removes the Pod. Kubernetes offers a solution to this problem in the form of Persistent Volumes and Persistent Volume Claims.

These Persistent Volumes can be created in two ways: either they are provisioned statically by the Kubernetes administrator, or they are created dynamically through a Storage Class. How their data is actually stored behind the scenes depends heavily on the underlying platform your Kubernetes cluster runs on. For the people using the Kubernetes cluster this should not really matter, as the platform abstracts this away from them and it is managed by the platform administrator.
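As a sketch of the first option, an administrator could statically provision a Persistent Volume that points at an existing VMDK. The name, datastore, and volume path below are hypothetical illustrations, not values from our lab:

```yaml
# Statically provisioned PersistentVolume (illustrative sketch only;
# the volumePath and datastore name are hypothetical examples).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-pv-example
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  vsphereVolume:
    volumePath: "[datastore1] kubevols/static-example.vmdk"
    fsType: ext4
```

With static provisioning the administrator creates the VMDK up front; with dynamic provisioning, as we will see below, that work is delegated to the storage class.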

Persistent Volumes in TKGI

When creating and running Kubernetes clusters through TKGI, storage is provided through the vSphere volume provisioner. This allows a Kubernetes administrator to tell the Kubernetes cluster that the files backing the created Persistent Volumes will be managed by vSphere. Several options are available, such as VMDK files created on vSAN, Network File System (NFS), or VMFS over Internet Small Computer Systems Interface (iSCSI) or Fibre Channel (FC) datastores.

Dynamic provisioning of Persistent Volumes on vSphere

One of the labs we use for technological experiments at Galagio relies on TKGI to build and run our Kubernetes clusters. These clusters are configured to use dynamic provisioning of Persistent Volumes. This means that when we require a Persistent Volume for an application, it is provided to us by the underlying platform instead of being created manually by an operator. However, it is still possible to provision them manually if desired.

In order to dynamically provision Persistent Volumes, Kubernetes supports the concept of Storage Classes. These contain information like the provisioner that needs to be used behind the scenes, a number of parameters that drive the storage, and the reclaim policy, which defines what should happen when the claim on a provided Persistent Volume is released.

When creating a new Persistent Volume Claim, we can refer to one of the provided Storage Classes through the storageClassName property. This ensures that when we save our new Persistent Volume Claim on our Kubernetes cluster, an appropriately sized Persistent Volume will be created for us, backed by a VMDK file in our vSphere storage solution.

Source: https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/overview.html

Let's look at an example of how we would apply this within the clusters of our TKGI lab. To get a Storage Class that will be backed by vSphere, we apply the following YAML through kubectl; notice the VMware Cloud Provider provisioner we have been discussing. We can also specify the diskformat and datastore that should be used. Finally, we set this Storage Class as the default within our cluster through the storageclass.kubernetes.io/is-default-class annotation.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: thin-disk
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: thin
  datastore: tintri-ePact
reclaimPolicy: Retain
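The vSphere volume provisioner's diskformat parameter accepts thin, zeroedthick, or eagerzeroedthick. As a hedged sketch, a second Storage Class for IO-sensitive workloads could look like the following; the class name is a hypothetical illustration:

```yaml
# Illustrative variant Storage Class (not from our lab) using
# eager-zeroed thick disks instead of thin provisioning.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: thick-disk   # hypothetical name
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: eagerzeroedthick
  datastore: tintri-ePact
reclaimPolicy: Delete
```

Note the reclaimPolicy of Delete here: volumes created from this class would have their backing VMDK removed when the claim is released, whereas our thin-disk class retains them.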

If we now create a new PersistentVolumeClaim and a Pod that mounts it into its container, as in the following sample, we can refer to our StorageClass from the PersistentVolumeClaim YAML (or simply rely on it being the default). The Pod will mount the created PersistentVolume inside its container on the /var/foo path.

We can then use the mounted PersistentVolume within our Pod. In our example, we use a command to write some text to a new myfile.txt file in the PersistentVolume. After this Pod finishes its job and is removed, the PersistentVolume with myfile.txt will still exist. This enables another Pod to mount it again later on.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 8Gi
  storageClassName: thin-disk
---
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
  namespace: default
spec:
  restartPolicy: Never
  volumes:
    - name: myvolume
      persistentVolumeClaim:
        claimName: myclaim
  containers:
    - name: pvc-demo
      image: "k8s.gcr.io/busybox"
      command: ["/bin/sh", "-c", "echo \"Some text.\" > /var/foo/myfile.txt"]
      volumeMounts:
        - name: myvolume
          mountPath: /var/foo
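To illustrate that the data survives the first Pod, a second Pod could mount the same claim later and read the file back. The following is a sketch of such a follow-up Pod; its name, pvc-reader, is a hypothetical illustration:

```yaml
# Hypothetical follow-up Pod that re-mounts the same claim
# and prints the file written earlier by pvc-demo.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-reader
  namespace: default
spec:
  restartPolicy: Never
  volumes:
    - name: myvolume
      persistentVolumeClaim:
        claimName: myclaim
  containers:
    - name: pvc-reader
      image: "k8s.gcr.io/busybox"
      command: ["/bin/sh", "-c", "cat /var/foo/myfile.txt"]
      volumeMounts:
        - name: myvolume
          mountPath: /var/foo
```

Because the claim has the ReadWriteOnce access mode, this only works once the first Pod has finished, or if both Pods land on the same node.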

If we list our PersistentVolumeClaims, we can see that our newly created claim is bound, and if we list our PersistentVolumes, we can see that the backing PersistentVolume is bound too.

$ kubectl get pvc,pv
NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/myclaim   Bound    pvc-d4638e19-6ded-4977-9814-89876b1dcb3b   8Gi        RWO            thin-disk      2m21s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
persistentvolume/pvc-d4638e19-6ded-4977-9814-89876b1dcb3b   8Gi        RWO            Retain           Bound    default/myclaim   thin-disk               2m20s
$ kubectl describe persistentvolume/pvc-d4638e19-6ded-4977-9814-89876b1dcb3b
Name:            pvc-d4638e19-6ded-4977-9814-89876b1dcb3b
Labels:          <none>
Annotations:     kubernetes.io/createdby: vsphere-volume-dynamic-provisioner
                 pv.kubernetes.io/bound-by-controller: yes
                 pv.kubernetes.io/provisioned-by: kubernetes.io/vsphere-volume
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    thin-disk
Status:          Bound
Claim:           default/myclaim
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        8Gi
Node Affinity:   <none>
Source:
    Type:               vSphereVolume (a Persistent Disk resource in vSphere)
    VolumePath:         [tintri-ePact] kubevols/cluster02-dynamic-pvc-d4638e19-6ded-4977-9814-89876b1dcb3b.vmdk
    FSType:             ext4
Events:                 <none>

Finally, in the PersistentVolume's details above, we can see the name of the VMDK file backing it. If we look that file up in the vSphere client, we can see that it is managed by our vSphere storage provider.


As we have seen in our example, TKGI enables the automatic provisioning of VMDK files in an underlying vSphere storage provider. This empowers developers to provision the disk space they need themselves, within the limits set by the cluster's administrators. Next to that, it gives the vSphere infrastructure admins the ability to manage that disk space within the vSphere tooling they are used to.

About Galagio

Kristof is a senior DevOps evangelist at Galagio. Galagio empowers business and IT to create great customer-focused software from idea to production by helping you bridge the gap between Dev and Ops: giving developers the power to create and provision software quickly and with confidence, while allowing your Ops to manage both your on-prem and public cloud environments in a uniform way, enforcing standards and security company-wide.
