With the new awesome thin provisioning GUI and more flexible virtual disk behavior (hallelujah - no more “clone/template=eagerzeroedthick”!) in vSphere, I’m getting more questions about best practices when you have the choice of thin provisioning at the array level or at the VMware layer.
This is covered in chapter 6 of the upcoming Mastering VMware vSphere 4.0 that Scott Lowe is authoring (more here). I’ve guest authored Chapter 6 for Scott. Chapter 6 is entitled – “VMware vSphere 4.0 - Creating And Managing Storage Devices”
Read on for more details – and there’s LOTS more in the book!
Ok – first – some critical understanding:
Virtual Disks come in three formats:
- Thin - in this format, the size of the VMDK file on the datastore is only as large as the space used within the VM itself. For example, if you create a 500GB virtual disk and place 100GB of data in it, the VMDK file will be 100GB in size. As I/O occurs in the guest, the vmkernel zeroes out the space needed right before the guest I/O is committed, and grows the VMDK file to match.
- Thick (otherwise known as zeroedthick) - in this format, the size of the VMDK file on the datastore is the size of the virtual disk that you create, but within the file, it is not “pre-zeroed”. For example, if you create a 500GB virtual disk and place 100GB of data in it, the VMDK will appear to be 500GB at the datastore filesystem, and contains 100GB of data on disk. As I/O occurs in the guest, the vmkernel zeroes out the space needed right before the guest I/O is committed, but the VMDK file size does not grow (since it was already 500GB).
- Eagerzeroedthick - in this format, the size of the VMDK file on the datastore is the size of the virtual disk that you create, and within the file, it is “pre-zeroed”. For example, if you create a 500GB virtual disk and place 100GB of data in it, the VMDK will appear to be 500GB at the datastore filesystem, and contains 100GB of data and 400GB of zeros on disk. As I/O occurs in the guest, the vmkernel does not need to zero the blocks prior to the I/O occurring. This results in improved I/O latency and fewer back-end storage I/O operations during normal I/O, but significantly more back-end storage I/O operations up front during the creation of the VM.
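To make the three formats easier to compare, here is a minimal Python sketch of the 500GB/100GB example above. It is only a model of the description in this post, not anything VMware ships: the `DiskFootprint` class and the `footprint` function are names I made up for illustration, and "blocks written" simply means data plus any pre-written zeros.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskFootprint:
    file_size_gb: int           # size the VMDK appears to be on the datastore
    blocks_written_gb: int      # data plus any pre-written zeros on disk
    zero_on_first_write: bool   # does the vmkernel zero new blocks at I/O time?


def footprint(disk_format: str, provisioned_gb: int, data_gb: int) -> DiskFootprint:
    """Model the three virtual disk formats as described in the text (hypothetical helper)."""
    if data_gb > provisioned_gb:
        raise ValueError("data cannot exceed the provisioned virtual disk size")
    if disk_format == "thin":
        # File is only as large as the data; it grows as the guest writes.
        return DiskFootprint(data_gb, data_gb, True)
    if disk_format == "zeroedthick":
        # File is full size, but untouched blocks are not zeroed in advance.
        return DiskFootprint(provisioned_gb, data_gb, True)
    if disk_format == "eagerzeroedthick":
        # File is full size and every block is written (data or zeros) up front.
        return DiskFootprint(provisioned_gb, provisioned_gb, False)
    raise ValueError(f"unknown disk format: {disk_format}")


if __name__ == "__main__":
    for fmt in ("thin", "zeroedthick", "eagerzeroedthick"):
        print(fmt, footprint(fmt, provisioned_gb=500, data_gb=100))
```

The only format where "blocks written" equals the full 500GB is eagerzeroedthick, which is the property that matters for array-level thin provisioning later in this post.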
In VMware Infrastructure 3.5, the CLI tools (service console or RCLI) could be used to configure the virtual disk format to any type, but when created via the GUI, certain configurations were the default (with no GUI option to change the type):
- On VMFS datastores, new virtual disks defaulted to Thick (zeroedthick)
- On NFS datastores, new virtual disks defaulted to Thin
- Deploying a VM from a template defaulted to eagerzeroedthick format
- Cloning a VM defaulted to an eagerzeroedthick format
This is why the creation of a new virtual disk has always been very fast, but in VMware Infrastructure 3.x cloning a VM or deploying a VM from a template (even with virtual disks that are nearly empty) took much longer.
Also, storage array-level thin-provisioning mechanisms work well with Thin and Thick formats, but not with the eagerzeroedthick format (since all the blocks are zeroed in advance) - so potential storage savings of storage-array level thin provisioning were lost as virtual machines were cloned or deployed from templates.
Also – BTW – if you have TP at the array level and are using EITHER NFS or VMFS, that clone/template behavior is also why you can save a lot of storage $$ by going to vSphere.
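A rough sketch of why the old clone/template behavior hurt on thin-provisioned arrays. The numbers are purely illustrative (they are not measurements), the function name is hypothetical, and it assumes the array counts every written block, including zeros, as consumed capacity.

```python
def array_consumed_gb(clone_count: int, provisioned_gb: int, data_gb: int,
                      pre_zeroed: bool) -> int:
    """Capacity consumed on a thin-provisioned array by a set of clones.

    Assumption: the array treats zero-filled blocks as consumed space, so a
    pre-zeroed (eagerzeroedthick) clone consumes its full provisioned size.
    """
    per_clone_gb = provisioned_gb if pre_zeroed else data_gb
    return clone_count * per_clone_gb


if __name__ == "__main__":
    # Illustrative case: 10 clones of a 50GB template holding 10GB of data.
    vi3 = array_consumed_gb(10, 50, 10, pre_zeroed=True)       # VI3.x clone default
    vsphere = array_consumed_gb(10, 50, 10, pre_zeroed=False)  # Thin or Thick clone
    print(f"eagerzeroedthick clones: {vi3}GB, non-pre-zeroed clones: {vsphere}GB")
```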
The Virtual Disk behavior in vSphere has changed substantially, resulting in significantly improved storage efficiency - most customers can reasonably expect up to 50% higher storage efficiency than with ESX/ESXi 3.5, across all storage types.
- The Virtual Disk format selection is available in the creation GUI
- vSphere still uses a default format of Thick (zeroedthick), but in the virtual disk creation dialog, there’s a simple radio button to thin-provision the virtual disk (if your block storage array doesn’t support array-level thin provisioning).
- Also note that there is a radio button to use Fault Tolerance, which employs the eagerzeroedthick format on VMFS volumes.
Above is the new virtual disk configuration wizard. Note that in vSphere 4 the virtual disk type can be easily selected via the GUI, including thin provisioning across all array and datastore types. Selecting the “Support Clustering features such as Fault Tolerance” creates an eagerzeroedthick virtual disk on VMFS datastores.
Clone/Deploy from Template operations no longer always use the eagerzeroedthick format. Instead, when you clone a VM or deploy from a template, this dialog box enables you to select the destination type (it defaults to the same type as the source).
Also, the virtual disk format can be easily changed from thin to eagerzeroedthick. It can be done via the GUI, but not in a “natural” location (which would be in the Virtual Machine settings screen). If you navigate in the datastore browser to a given virtual disk and right click you see a GUI option as noted below.
You cannot “shrink” a thick or eagerzeroedthick disk to thin format directly through the virtual machine settings in the vSphere client, but this can be accomplished non-disruptively via the new storage vmotion (allowing VI3.x customers to reclaim a LOT of space).
The eagerzeroedthick virtual disk format is required for VMware Fault Tolerant VMs on VMFS (if they are thin, conversion occurs automatically as the VMware Fault Tolerant feature is enabled). It also continues to be mandatory for Microsoft clusters (refer to KB article) and is recommended for the highest I/O workload Virtual Machines, where the slight latency and additional I/O created by the “zeroing” that occurs as part and parcel of virtual machine I/O to new blocks is unacceptable. From a performance standpoint, thick and pre-zeroed disks perform identically for I/Os to blocks that have already been written to - within the margin of error of the test.
So… What’s right - thin provisioning at the VMware layer or the storage layer? The general answer is BOTH.
If your array supports thin provisioning, you’ll generally get more efficiency using the array-level thin provisioning in most operational models.
- If you thick provision at the LUN or filesystem level, there will always be large amounts of unused space until you start to get it highly utilized - unless you start small and keep extending the datastore - which operationally is heavyweight, and generally a PITA.
- When you use thin provisioning techniques at the array level, using NFS or VMFS and block storage, you always benefit. In vSphere, all the default virtual disk types - both Thin and Thick (with the exception of eagerzeroedthick) - are “storage thin provisioning friendly” (since they don’t “pre-zero” the files). Deploying from templates and cloning VMs also use Thin and Thick (but not eagerzeroedthick as was the case in prior versions).
- Thin provisioning also tends to be more efficient the larger the scale of the “thin pool” (i.e. the more oversubscribed objects). On an array, this construct (every vendor calls it something slightly different) tends to be broader than a single datastore, and therefore the efficiency factor tends to be higher.
Obviously if your array (or storage team) doesn’t support thin provisioning at the array level – go to town and use Thin at the VMware layer as much as possible.
What if your array DOES support Thin, and you are using it that way - is there a downside to “Thin on Thin”? Not really, and technically it can be the most efficient configuration – but only if you monitor usage. The only risk with “thin on thin” is that you can have an accelerated “out of space condition”.
An example helps here.
At the VMware level you have 10 VMs, each VM is a 50GB Virtual Disk, and has 10GB of data on it.
- If provisioned as Thick, each is a 50GB file, but only containing 10GB of data. It could never get “bigger” than 50GB without extending it.
- If provisioned as Thin, each is a 10GB file that can grow to 50GB.
At the Datastore level:
- If you used Thick virtual disks, you would HAVE to have a 500GB (10x50GB) datastore (technically a lot more due to the extra stuff a VM needs, but for the sake of easy math I’m keeping it simple here…) In the Thick case you can’t run out of space at the VMware layer – so you don’t need to monitor that.
- If you used Thin virtual disks, you only needed a 100GB (10x10GB) datastore (more due to the extra stuff a VM needs, but for the sake of easy math…). In the Thin case you CAN run out of space at the VMware layer – so you DO need to monitor that (vSphere adds a simple alert on datastore thresholds).
At the storage layer:
- If you use Thick storage provisioning and Thick VMs, you would need to create a storage object (LUN or Filesystem) that is 500GB in size, though in reality, only 100GB is being used
- If you use Thick storage provisioning and Thin VMs, you would need to create a storage object (LUN or Filesystem) that is 100GB in size, but you HAD BETTER MONITOR IT and be ready to expand – as it will grow to up to 500GB in size.
- If you use Thin storage provisioning and Thick VMs, you would need to create a storage object (LUN or Filesystem) that is 500GB in size, but it would only consume 100GB. You wouldn’t need to monitor the LUN/filesystem, but instead the pool itself (because there isn’t actually 500GB available), and you could be more efficient.
- If you use Thin storage provisioning and Thin VMs, you would need to create a storage object (LUN or Filesystem) that is 100GB in size, but you should ACTUALLY configure a storage object (LUN or filesystem) that is 500GB, as in either case, it would only consume 100GB – but by using a larger storage object, you don’t need to monitor it at the VMware layer as closely. You wouldn’t need to monitor the LUN/filesystem, but instead the pool itself (because there isn’t actually 500GB available), and you could be more efficient.
If you look at that, you can see that picking:
- Thick VMs at the VMware layer (the first option) + thin storage at the storage layer (the third option), and thin VMs at the VMware layer (the second option) + thin storage at the storage layer (the fourth option), are operationally the same thing and have the same space efficiency.
- Thick VMs at the VMware layer (the first option) + thin storage at the storage layer (the third option) has less management complexity (you monitor at one layer, not both).
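The easy math in the example can be tabulated with a short Python sketch. It uses only the numbers from the example (10 VMs, 50GB virtual disks, 10GB of data each) and ignores the "extra stuff" a VM needs. The `plan` function and its field names are my own invention for illustration, and for Thin on Thin it follows the suggestion above of configuring the full 500GB object rather than the 100GB minimum.

```python
VM_COUNT = 10
DISK_GB = 50   # provisioned size of each virtual disk
DATA_GB = 10   # data actually written inside each VM


def plan(thin_vms: bool, thin_storage: bool) -> dict:
    """Size one combination of VM-level and array-level provisioning (hypothetical helper)."""
    provisioned_gb = VM_COUNT * DISK_GB   # 500GB if every disk fills up
    written_gb = VM_COUNT * DATA_GB       # 100GB actually in use today

    # Smallest storage object (LUN or filesystem) that holds the VMs today.
    minimum_object_gb = written_gb if thin_vms else provisioned_gb
    # With thin storage, a full-size object costs nothing extra, so size for growth.
    object_gb = provisioned_gb if thin_storage else minimum_object_gb
    # Thick storage reserves the whole object; thin storage consumes only written blocks.
    array_consumed_gb = written_gb if thin_storage else object_gb

    return {
        "object_gb": object_gb,
        "array_consumed_gb": array_consumed_gb,
        # Thin VMs can outgrow an object smaller than their provisioned total.
        "datastore_can_fill": thin_vms and object_gb < provisioned_gb,
        # Thin storage shifts the monitoring burden to the array pool.
        "pool_needs_monitoring": thin_storage,
    }


if __name__ == "__main__":
    for thin_vms in (False, True):
        for thin_storage in (False, True):
            label = f"{'Thin' if thin_vms else 'Thick'} VMs on {'Thin' if thin_storage else 'Thick'} storage"
            print(label, plan(thin_vms, thin_storage))
```

Under these assumptions, Thick VMs on Thin storage and Thin VMs on Thin storage come out with identical numbers, which is the point of the first bullet above.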
If you DO use Thin on Thin, use VMware or 3rd party usage reports in conjunction with array-level reports, and set thresholds with notification and automated action at both the VMware layer and the array level (if your array supports that). Why? With thin provisioning you need to carefully manage for “out of space” conditions, since you are oversubscribing an asset which has no backdoor (unlike how VMware oversubscribes guest memory, which can use VM swap if needed). Thin on Thin can be very efficient, but can “accelerate” the transition to oversubscription.
BTW – this is a great use of the new managed datastore alerts. Just set the alert thresholds below the array-level TP (and if your array supports auto-grow and notification, configure it to auto-grow to the maximum datastore size – BTW – all modern EMC arrays auto-grow and notify). Also, for EMC customers, use the vCenter plugins or Control Center/Storage Scope (which accurately show VP state and use) to monitor and alert at the array level.
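Here is a minimal sketch of the two-layer threshold idea, in plain Python. It does not call any vCenter or array API; the function names, the 70%/80% default thresholds, and the rule that the datastore alert must sit below the array pool alert are all my assumptions for illustration (the last one is my reading of "set the alert thresholds below the array-level TP").

```python
from typing import List


def oversubscription_ratio(provisioned_gb: float, capacity_gb: float) -> float:
    """How many times over the physical capacity has been promised."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    return provisioned_gb / capacity_gb


def percent_used(used_gb: float, capacity_gb: float) -> float:
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used_gb / capacity_gb


def check_thin_on_thin(datastore_used_gb: float, datastore_capacity_gb: float,
                       pool_used_gb: float, pool_capacity_gb: float,
                       datastore_alert_pct: float = 70.0,
                       pool_alert_pct: float = 80.0) -> List[str]:
    """Return the layers that have crossed their (hypothetical) alert thresholds."""
    # Assumed policy: the VMware-level alert should fire before the array-level one.
    if datastore_alert_pct >= pool_alert_pct:
        raise ValueError("datastore threshold should sit below the array pool threshold")
    alerts = []
    if percent_used(datastore_used_gb, datastore_capacity_gb) >= datastore_alert_pct:
        alerts.append("datastore")
    if percent_used(pool_used_gb, pool_capacity_gb) >= pool_alert_pct:
        alerts.append("array pool")
    return alerts


if __name__ == "__main__":
    print(oversubscription_ratio(provisioned_gb=500, capacity_gb=100))
    print(check_thin_on_thin(400, 500, 300, 1000))
```

In practice the inputs would come from the VMware usage reports and the array-level reports mentioned above; how you collect them depends on your tooling.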
For the next minor release of vSphere, one of the areas of ongoing work in the vStorage API stack is thin provisioning integration, which means the reports (the actual array-level details) will also be directly in vCenter (the vCenter 4.0 reports already show the VMware-level provisioned vs. actual usage). In that case the management overhead drops, and we manage to squeeze out even more efficiency.
There are only two exceptions to the “always thin provision at the array level if you can” guideline. The first is in the most extreme performance use cases, as the thin-provisioning architectures generally have a performance impact (usually marginal) vs. a traditional thick storage configuration. UPDATED (5/1/2009, 1:32pm EST): good feedback in the comments (see below) suggested this case only applies when on local storage. So, when you couple this with the caveat in the first case (that the performance impact is marginal, and heck, there are benefits to the wide-striped approach of TP), there’s almost no reason not to use array TP.
The second is large, high performance database storage objects, which have internal logic that generally expects “IO locality” - a fancy way of saying that they structure data expecting the on-disk structure to reflect their internal structure.