All – as everyone is realizing, while “VAAI support” is a tickbox for storage vendors, vendor implementations vary wildly – and that means “support” can mean very different things from platform to platform.
I’ve done a series of webcasts that dive deep into the VNX VAAI implementations (here, here, here), stuff on the VAAI TP reclaim (here), VMAX and VAAI (here) and updates on the NFS VAAI assists (here).
Some of those articles are now obsolete (this remains a difficult topic – how to handle older posts?) – but highlight how dynamic this space is. If you’re a storage platform expert – it’s something to stay on top of… And, today, here’s an update on VMAX and VAAI XCOPY!
For detail – read on past the break!
Within EMC, the implementations on VNX and VMAX differ a bit. If you look at the VNX XCOPY implementation (and there’s a lot of work on this for “future” VNX software releases), a big part of the story is the transfer size of the IOs when XCOPY is used. It can result in “extra” work in some cases based on the internal implementation (4 MB IOs get chopped up). These internals are deeply tied to a lot of things, and not trivial to change (remember that messing with a storage stack carries the rule of “huge, material unintended consequences” – a “bad day” in storage land is a really, really bad day).
In the VMAX case – larger transfer sizes can help, as the VMAX can absorb the larger IOs through its stack, and it has an “extent buffer limit” on outstanding operations beyond which it will revert to the traditional software Data Mover code in vSphere.
Internally, the vSpecialists just got a heads up from Cody Hosterman (thanks Cody!) that we finally got approval from VMware to allow us to recommend that customers change the MaxHardwareTransferSize for XCOPY on their ESX hosts when those hosts are attached only to VMAX. I’m sharing Cody’s content with the world, as hey – we’re open :-)
Making this change leads to an almost consistently 4X improvement in XCOPY rates. VMAX in general is better at handling larger extent copy sizes than VNX, but more importantly we won’t hit the extent limit so quickly and revert back to software copy – which is what really slowed us down (instead of around a 100 GB queued XCOPY limit, it will be closer to 400 GB). See the example chart below (a VM with a 40 GB virtual disk and 25 GB of data on it):
Likewise, svMotion gets a hand here too…
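A quick back-of-the-envelope on why the queued-XCOPY ceiling roughly quadruples (a hypothetical illustration only – the 25,600 extent-budget number below is NOT a documented VMAX constant, it’s picked purely so the 4 MB case lands on ~100 GB): if the extent buffer caps the *number* of outstanding extents rather than bytes, queued bytes scale linearly with the per-IO transfer size.

```shell
# Hypothetical illustration: with a fixed extent-count budget, queued XCOPY
# bytes scale with the per-IO transfer size. The 25600 budget is an assumed
# number chosen so that the 4 MB default maps to ~100 GB, not an array spec.
extent_budget=25600
for transfer_mb in 4 16; do
  echo "${transfer_mb} MB transfer size -> ~$(( extent_budget * transfer_mb / 1024 )) GB queued XCOPY"
done
```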
The commands below show how to query the parameter, and how to adjust it. Remember – ONLY DO THIS IF YOUR HOSTS ARE ATTACHED ONLY TO VMAXes! You can always apply the “KISS” axiom and leave things at defaults (often operationally the best choice – and it protects you if you change your kit often, or are very heterogeneous).
# esxcfg-advcfg -g /DataMover/MaxHWTransferSize
Value of MaxHWTransferSize is 4096
# esxcfg-advcfg -s 16384 /DataMover/MaxHWTransferSize
Value of MaxHWTransferSize is 16384
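If you’re on ESXi 5.x, the same advanced option can be read and set through esxcli as well – a sketch (the option path is the same as above; double-check the syntax on your particular build, and again, VMAX-only hosts):

```shell
# Query the current XCOPY transfer size (ESXi 5.x esxcli syntax)
esxcli system settings advanced list --option /DataMover/MaxHWTransferSize

# Set it to 16 MB -- again, ONLY if the host sees nothing but VMAX
esxcli system settings advanced set --option /DataMover/MaxHWTransferSize --int-value 16384
```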
There are a couple other things to know when using VMAX with VAAI XCOPY…
- A Symmetrix metadevice that has SAN Copy, TimeFinder/Clone, TimeFinder/Snap or ChangeTracker sessions and certain RecoverPoint sessions will not support hardware-accelerated Full Copy. Any cloning or Storage vMotion operation run on datastores backed by these volumes will automatically be diverted to the default VMware software copy. Note that the vSphere Client has no knowledge of these sessions and as such the “Hardware Accelerated” column in the vSphere Client will still indicate “Supported” for these devices or datastores.
- Full copy is not supported for use with Open Replicator. VMware will revert to software copy in these cases.
- Although SRDF is supported with Full Copy, certain RDF operations, such as an RDF failover, will be blocked until the Symmetrix has completed copying the data from a clone or Storage vMotion.
- Using Full Copy on SRDF devices that are in a consistency group can render the group into a “sync in progress” mode until the copy completes on the Symmetrix. If this state is undesirable, customers are advised to disable Full Copy through the GUI or CLI while cloning to these devices.
- Full Copy will not be used if there is a metadevice reconfiguration taking place on the device backing the source or target datastore. VMware will revert to traditional writes in these cases. Conversely if a Full Copy was recently executed and a metadevice reconfiguration is attempted on the source or target metadevice backing the datastore, it will fail until the background copy is complete.
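One way to sanity-check what a host thinks about VAAI on a given device (a sketch using the ESXi 5.x esxcli namespace – the naa identifier below is a placeholder, substitute your own). Keep in mind from the first bullet above that a “Supported” clone status does NOT account for replication sessions on the array side:

```shell
# Show per-device VAAI primitive status (Clone Status covers XCOPY/Full Copy)
# naa.600009... is a placeholder -- use your own device identifier
esxcli storage core device vaai status get -d naa.60000970000192601234533030334545
```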
Anytime I do a post like this – inevitably, competitors (as we are the 800 lb gorilla) all come out swinging. Personally, I don’t care. In my view, if we as EMC stay focused on our customers and develop great technologies – the competitors are not a problem. The other way of looking at it – all this info highlights how many customers use EMC and VMware together (and how much we invest in developing, refining and optimizing for that use case).
What about VNX? We did some preliminary testing on VNX and saw performance degradation when the transfer size was upped to 16 MB. The VNX breaks XCOPY requests into 512 KB chunks and processes 8 at a time, so with the 4 MB default Max Transfer Size the VNX processes the entire XCOPY request at once. Increasing the MaxHardwareTransferSize is therefore unlikely to help VNX, as the remaining chunks will simply be queued – which is what our limited testing has supported so far. We are looking into testing VNX with smaller MaxHardwareTransferSize values though. Lots of re-writing/optimization/evolution of the VNX block and file stacks is going on (adding a lot of new coolness, but also giving us a window to attack XCOPY).
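That chunking arithmetic is worth spelling out (a sketch – the 512 KB chunk size and 8-way concurrency come straight from the paragraph above, nothing else is assumed):

```shell
# VNX splits each XCOPY request into 512 KB chunks and processes 8 at a time.
chunk_kb=512
parallel=8
for transfer_kb in 4096 16384; do
  chunks=$(( transfer_kb / chunk_kb ))
  queued=$(( chunks > parallel ? chunks - parallel : 0 ))
  echo "$(( transfer_kb / 1024 )) MB transfer: ${chunks} chunks, ${queued} queued behind the first ${parallel}"
done
```

At the 4 MB default, the chunk count exactly matches the concurrency (8 of 8 in flight, nothing queued); at 16 MB, 24 of the 32 chunks just sit in the queue – which is why the bigger transfer size buys VNX nothing.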
For those of you wondering, yes, I’m still pushing (though it feels like I’m pushing a rope) to get VMware to change their support matrix to show support BY offload (so people don’t turn them all off).
Anyways, the link to the published VMAX paper is below:
THANK YOU CODY, and enjoy customers – feedback always welcome!!