DRS is not only a critical feature in vSphere, but also a critical IDEA for virtualization and cloud models (private or public). The idea is basic:
Virtualization encapsulates compute, and vMotion liberates those encapsulated objects, but VMware Distributed Resource Scheduler (DRS) is what actually turns a cluster of servers into a dynamic pool of resources.
Not only is this important for efficiency; prioritization and QoS also become critical as everyone starts to virtualize workloads that come with SLA requirements.
So – there are three pillars of any compute model: CPU, memory, and IO. Until recently, DRS applied policy only to CPU and memory, applying several core principles:
- hypervisor scheduler pCPU prioritization (via shares/limits/reservations)
- memory oversubscription management (via shares/limits/reservations)
- cluster load distribution (via policy-driven vMotion)
The first two in the list above act on the individual ESX/ESXi host (they are focused on throttling), while the third acts across the shared resource – the vSphere cluster – redistributing workloads for optimization as they change.
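To make those principles concrete, here is a minimal sketch of what setting shares, limits, and reservations looks like through the vSphere API (shown with the Python pyVmomi bindings; the vCenter address, credentials, and VM name are hypothetical placeholders):

```python
# Minimal sketch: CPU/memory shares, limits and reservations on a single VM
# via the vSphere API (Python pyVmomi bindings). Host, credentials and VM
# name are illustrative placeholders.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="...")
content = si.RetrieveContent()

# Find the VM by name.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "prod-vm-01")

spec = vim.vm.ConfigSpec(
    # CPU: "high" shares plus a 1000 MHz reservation; limit=-1 means unlimited.
    cpuAllocation=vim.ResourceAllocationInfo(
        reservation=1000, limit=-1,
        shares=vim.SharesInfo(level=vim.SharesInfo.Level.high, shares=0)),
    # Memory: "normal" shares plus a 2048 MB reservation.
    memoryAllocation=vim.ResourceAllocationInfo(
        reservation=2048, limit=-1,
        shares=vim.SharesInfo(level=vim.SharesInfo.Level.normal, shares=0)),
)
vm.ReconfigVM_Task(spec=spec)
Disconnect(si)
```

Worth noting: the scheduler only consults shares when there is contention, which is why they are a safer default knob than hard limits.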
In my opinion, there are two critical values that customers get from the idea of DRS:
- As people started to absorb “handing over the keys” to DRS in fully automated mode, it let them focus on more important things.
- It lets the whole infrastructure be more efficient. I’ve seen differing results, but it’s not uncommon for you to fit 30-40% more workload onto a given vSphere cluster by using DRS.
Those two pieces were hard to match in storage – until now. Read on for detail and demo… Pretty amazing.
In vSphere 4.1, VMware extended the idea of DRS with the early building blocks of Network IO Control and Storage IO Control. As with CPU and memory, there are two parts to IO DRS – dealing with instantaneous contention, and overall resource optimization. Those, BTW, are ideally coupled ideas.
Unlike vMotion, which is relatively “lightweight” on resources (i.e. 10 seconds to move a VM is normal, and it places little load on resources other than the vmknic used for vMotion traffic), actually moving non-volatile (disk/solid state) storage around is NOT lightweight. It loads up the storage network heavily (contending with normal IO – unless your array supports VAAI), and also generates a lot of backend work for the array, moving/copying blocks – whether the vSphere cluster is accessing it via NFS or VMFS. So – moving a VM’s storage is a much heavier operation than a vMotion, and not something you want happening constantly.
Storage IO Control lets you apply shares and IOps limits (which govern vmkernel IO admittance) on a virtual disk by virtual disk basis. This provides “instant relief” during periods of contention (triggered when latency on the datastore crosses the congestion threshold), and improves overall IO across the datastore and cluster by not letting a given virtual disk dominate.
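As an illustration of that per-virtual-disk granularity, here is a minimal sketch (same hypothetical pyVmomi placeholders as above; the VM name, disk choice, and values are illustrative) that caps a single virtual disk at 500 IOps and drops its shares to “low” so it yields under contention:

```python
# Minimal sketch: per-virtual-disk SIOC shares and an IOps limit via the
# vSphere API (Python pyVmomi bindings). VM name and values are illustrative.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="...")
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "dev-vm-01")

# Grab the VM's first virtual disk.
disk = next(d for d in vm.config.hardware.device
            if isinstance(d, vim.vm.device.VirtualDisk))

# Cap the disk at 500 IOps and give it "low" shares; the shares value itself
# is only used when the level is "custom".
disk.storageIOAllocation = vim.StorageResourceManager.IOAllocationInfo(
    limit=500,
    shares=vim.SharesInfo(level=vim.SharesInfo.Level.low, shares=0))

change = vim.vm.device.VirtualDeviceSpec(
    operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=disk)
vm.ReconfigVM_Task(spec=vim.vm.ConfigSpec(deviceChange=[change]))
Disconnect(si)
```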
EMC sub-LUN FAST (Fully Automated Storage Tiering) provides the second half of the equation – the storage equivalent of DRS policy-controlled vMotion.
Here, we show how, in a vSphere 4.1 environment, SIOC and EMC FAST can be used together to ensure that the right workloads get the performance they need, and how automated tiering delivers higher overall performance for production workloads.
You can download this demo in high-resolution WMV and MOV formats.
It’s important to note that we showed the configuration of both SIOC and FAST as manual actions just to highlight how they are configured and used. If SIOC were enabled with no further configuration, the VMs would all have gotten the same admission priority – in other words, the overrun caused by the developer would still be mitigated.
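For reference, enabling SIOC on a datastore is scriptable too; here is a minimal sketch (again with hypothetical pyVmomi placeholders) that turns it on and leaves the congestion threshold at the 30ms default:

```python
# Minimal sketch: enabling SIOC on a datastore via the vSphere API (Python
# pyVmomi bindings). The datastore name is an illustrative placeholder.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="...")
content = si.RetrieveContent()

view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.Datastore], True)
ds = next(d for d in view.view if d.name == "fast-pool-ds-01")

# Turn SIOC on; 30ms is the default congestion threshold.
spec = vim.StorageResourceManager.IORMConfigSpec(
    enabled=True, congestionThreshold=30)
content.storageResourceManager.ConfigureDatastoreIORM_Task(
    datastore=ds, spec=spec)
Disconnect(si)
```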
Likewise, in the FAST use case, when you create a LUN in a pool with a mixed drive configuration, it will auto-tier automatically if you have FAST, without the administrator needing to do anything. We expect that over time, just as people got used to DRS in “partially automated mode” (vMotion activity was recommended, but not automatic) and most customers then moved to “fully automated mode”, the same thing will happen with automated storage tiering.
So – there you have it. DRS – for storage, available now with vSphere 4.1 and EMC Unified storage. Simple. Efficient. That’s a beautiful thing :-)
Great example of VMware SIOC and EMC FAST. I was looking for the high-res downloads but the links don't seem to be posted :(
Posted by: Steven Nekava | September 01, 2010 at 10:08 AM
So if SIOC is applied at the VM level, how do you place a limit on the vmkernel? By far the heaviest hitter on my SAN is svMotion; it will generate tens of thousands of IOPS. Since my spindles are all shared, I'd rather have slower svMotion and better responsiveness, and ideally I could do it before the response time degrades.
Posted by: Andrew Fidel | September 01, 2010 at 10:37 AM
I was under the impression that EMC FAST is not supported yet under vSphere 4.1. Is that a correct statement, or has that changed?
Posted by: Curtis Weldon | September 02, 2010 at 02:33 PM
@Curtis - that's a common misunderstanding. SIOC is fully supported on arrays that auto-tier (EMC, Dell/EqualLogic, 3PAR, Compellent).
I believe the source of the misunderstanding was early pre-vSphere 4.1 training that incorrectly stated that SIOC wouldn't be supported with Auto-Tiering. This position was corrected before GA based on VMware testing with the vendors that support this storage feature.
In fact, I've heard it so often, that I wrote a blog post about it here: http://virtualgeek.typepad.com/virtual_geek/2010/07/vsphere-41-sioc-and-array-auto-tiering.html
The VMware team has authored a good whitepaper that covers the congestion threshold setting values here: http://www.vmware.com/files/pdf/techpaper/VMW-vSphere41-SIOC.pdf
EMC's recommendation for auto-tiered datastores is to keep the value at the default (30ms).
@Andrew - I hear you. SVMotion can saturate storage networks and storage arrays. Good news is that VAAI resolves the first part of that (offloading the bulk of the network load) when the source and target arrays are the same (but different datastores). We are working on throttling techniques for svmotion.
@steven - the high rez videos are now posted.
Posted by: Chad Sakac | September 11, 2010 at 04:39 PM
Hi,
The high resolution demos are not accessible.
chris
Posted by: christoph Henzen | September 14, 2010 at 02:56 AM
@chris - thanks for noticing! The links should be corrected now... have fun!
Posted by: Chad Sakac | September 14, 2010 at 09:20 PM
Thanks Chad, all ok now.
chris
Posted by: christoph Henzen | September 15, 2010 at 02:18 AM
Hi Chad,
do you have any information about SIOC together with PowerPath/VE? Are there any recommendations about how to set it up, or whether this is supported at all?
I couldn't find any information in the E-Lab Nav or on Powerlink.
thx
Tom
Posted by: Tom | December 10, 2010 at 03:44 AM