Well – in a nutshell – more, for less.
The vStorage APIs for Array Integration (or VAAI) are something we previewed back at VMworld 2009.
Now that vSphere 4.1 is officially out, we can talk about it without tapdancing around a lot of stuff.
I did a webcast on this topic the week before last. I needed to step carefully around saying “vSphere 4.1” or committing to dates/functions, but now you know… All the info on the webcast was technically accurate, and is now officially decoded :-)
The topic is very interesting and the effect of these hardware acceleration offloads can be very pronounced.
Just one example (of many)… using the Full Copy API:
- We reduced the time for many VMware storage-related tasks by 25% or more (in some cases up to 10x)
- We reduced the CPU load on the ESX host and the array by 50% or more for many tasks (in some cases up to 10x), by reducing both the impact during the operation and the duration of the operation.
- We reduced the traffic on the network by 99% for those tasks. Yes, you read that right, 99%. During these activities (Storage VMotion is what we used in the example), the storage network traffic can be heavy enough (240MBps worth in the example) that it impacts all the other VMs using the storage network and storage arrays.
And like Intel VT with vSphere 4, these improvements just “appear” for customers using vSphere 4.1 and storage arrays that support these new hardware acceleration offloads.
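(As an aside – if you want to see or compare the effect on your own hosts, the datamover offloads surface as host advanced settings in vSphere 4.1, so you can flip them off and on for a quick before/after test. A minimal sketch from the ESX service console – the equivalent vicfg-advcfg via the vMA/vCLI, or the vSphere Client advanced settings, works for ESXi; check the 4.1 docs for the exact syntax on your build:)

    # Check whether the Full Copy (extended copy) offload is enabled (1 = enabled)
    esxcfg-advcfg -g /DataMover/HardwareAcceleratedMove

    # Temporarily disable it to see the "without VAAI" behavior, then re-enable it
    esxcfg-advcfg -s 0 /DataMover/HardwareAcceleratedMove
    esxcfg-advcfg -s 1 /DataMover/HardwareAcceleratedMove

    # The Block Zero (write same) offload is the analogous setting:
    # /DataMover/HardwareAcceleratedInit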
As cool as the Full Copy API is, I think hardware accelerated locking will have just as much of an impact. Locking is often thrown out by protocol passionistas as a reason “why NFS is intrinsically better than VMFS” – which I don’t agree with. There are differences which mean that for some use cases NFS is a fit, and for others VMFS is a fit – the only thing that is always true is that it’s handy to have both as options. I’ve talked about this at length, and if you want to understand more, read here and here.
In my experience, most customers never experience any issues with VMFS locking. In fact, its “invisibility” is a good thing – VMFS remains, in my experience, one of the simplest distributed filesystem implementations. BUT, it can be an issue in certain use cases (and stuck locks suck). Ideally you wouldn’t have to have any considerations for a datastore other than “is it big enough, does it deliver the performance the VMs need, and does it do so as efficiently as possible?”. That is what hardware accelerated locking delivers in vSphere 4.1. Metadata updates literally have no impact on other things using the datastore (VMs or ESX hosts). Mucho cool.
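(If you’re curious whether the new locking primitive is active on a given host, it shows up as another advanced setting – a one-liner sketch, same caveats as the snippet above:)

    # Check whether VMFS hardware assisted locking (atomic test & set) is enabled (1 = enabled)
    esxcfg-advcfg -g /VMFS3/HardwareAcceleratedLocking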
To understand more, and see resources to help visualize (and also a demonstration) – read on!
Where do these integrations occur?
Well, here’s my diagram of the places where storage can integrate with vSphere. The parts affected under the topic of VAAI have red boxes around them.
To understand the dialog that goes with this, listen to the webcast (link below). As a point of note – while all vendors are working hard to integrate with VMware (which is good – it highlights the importance of VMware in customer environments), to date, as far as I know, EMC is the only vendor to have products available that integrate with each of the areas in green. BTW - “co-op” means it’s an area where it’s not an integration API per se, but where there is a lot of cooperative development.
I’ve decided to upload the presentation directly in PPT format (with some of the “vSphere.ahem” stuff now correctly showing vSphere 4.1) – I figure it will help customers and EMC partners more than a hamstrung PDF would. Sure, my competitors will get it, and steal a few slides here or there, but hey – who cares, they’re welcome :-)
The presentation has some slides that when put in presentation mode, show the detail of what’s occurring “with/without” the APIs in vSphere 4.1. You can download the FULL presentation here.
You can also watch the recording of the webcast here.
I also decided to post the demonstration of one of the hardware acceleration offloads (the “Full Copy” API) during a Storage VMotion operation (thanks to the EMC Cork Virtualization Solutions crew for doing some of the key work – fellas, you rock!):
You can download it in high-rez MOV format here and WMV format here.
It’s notable that on NFS, EMC (and NetApp) can today get a similar effect to the Full Copy API (better in some ways, worse in others, and done differently) – the array takes a snapshot of an individual file (managed via a vCenter plugin), and then the VM is customized and registered via vCenter APIs.
Ultimately, as you can see from the integration diagram, we at EMC are working to extend vStorage integration (in future vSphere releases) to leverage NFS datastores (the VAAI support in vSphere 4.1 is block only). This would manifest itself in VMware operations being hardware accelerated without the need for vCenter plugins (like VAAI today, it would all be “under the covers”).
What do you need to benefit from this hardware acceleration of storage functions?
- Well, of course, you need vSphere 4.1. VAAI is supported in Enterprise and Enterprise Plus editions.
- If you’re an EMC Unified (an EMC Celerra purchased in the last year – NS-120, NS-480, NS-960) or EMC CLARiiON CX4 customer, you need to:
- Be running FLARE 30 (which also adds Unisphere, Block Compression, FAST VP aka sub-LUN automated tiering, FAST Cache and more). You can read more about all the FLARE 30 goodness here and here if you’re interested in more detail about what’s new in that release. FLARE 30 is going to be GA any day now…
- Also, ESX hosts need to be configured to use ALUA (failover mode 4). If you’re using a modern EMC Unified or CLARiiON CX4 array, using ALUA (with the Round Robin PSP or PowerPath/VE) with vSphere 4.0 or vSphere 4.1 is a best practice (for iSCSI, FC and FCoE). We will be automating configuration of this shortly in the always-free EMC Virtual Storage Integrator vCenter plugin, but for now it’s pretty easy to set up manually (see the sketch after this list).
- If you’re an EMC VMAX customer – it will be a bit longer, but not much: VAAI support is in the next major Enginuity update, scheduled for Q4 2010.
- It is supported on all block protocols (FC, iSCSI, FCoE)
- When does a VAAI offload NOT work (and the datamover falls back to the legacy software codepath) if all of the above are true?
- The source and destination VMFS volumes have different block sizes (a colleague, Itzik Reich, already ran into this one at a customer, here – not quite a bug, but it does make it clear that “consistent block sizes” is a “good hygiene” move; a quick way to check is shown in the sketch after this list)
- The source file type is RDM and the destination file type is non-RDM (regular file)
- The source VMDK type is eagerzeroedthick and the destination VMDK type is thin
- The source or destination VMDK is any sort of sparse or hosted format
- The logical address and/or transfer length in the requested operation are not aligned to the minimum alignment required by the storage device (all datastores created with the vSphere Client are aligned automatically)
- The VMFS has multiple LUNs/extents and they are all on different arrays
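As promised above, here’s a minimal sketch of the manual host-side setup and hygiene checks (from the ESX service console, or via the vCLI/vMA for ESXi – check the 4.1 docs for exact syntax on your build). Note that “failover mode 4” (ALUA) itself is set on the array side in Navisphere/Unisphere; the host-side piece is making sure the Round Robin PSP is used for the ALUA SATP. The device NAA below is just an illustrative placeholder:

    # Make Round Robin the default path selection policy for the CLARiiON/Unified ALUA SATP
    esxcli nmp satp setdefaultpsp --satp VMW_SATP_ALUA_CX --psp VMW_PSP_RR

    # ...or set it on an individual device (example NAA - substitute your own)
    esxcli nmp device setpolicy --device naa.6006016012345678 --psp VMW_PSP_RR

    # Verify which SATP/PSP each device ended up with
    esxcli nmp device list

    # Check a datastore's VMFS block size (keep source and destination consistent for Full Copy)
    vmkfstools -Ph /vmfs/volumes/<datastore-name>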
So – add this VAAI support to the long list – one more EMC/VMware technical integration (we have 58 at last count!!)
I’m sure that I’ll be asked, left, right and center… “which other storage vendors support VAAI?”. I’m overt about the fact that I work for EMC, but try to make this a useful blog for all. Unfortunately, on this, the right answer is: “talk to your storage vendor”. I also expect everyone to talk about it and issue press releases. Eventually, I’m sure that these features will be universal (to varying degrees and with varying implementations).
- A good guidepost for customers would be your vendor’s Site Recovery Manager support. There were a huge number of vendors pointing to “we stand behind SRM!” at launch who didn’t actually have SRAs for a year :-)
- Another good guidepost: look back at the VAAI sessions at VMworld 2009 – they’re a good signal of which folks have been the most active in developing all the VAAI goodness with VMware.
- Another tip – don’t listen to anyone other than your vendor on a technical topic (a competitor will always throw the other guy under the bus). Simply ask your storage vendor to say “supported, or will be supported on this date”… and I would suggest that you ask for a demo – if it’s going to be GA in 6 months or less, it’s running in a lab. If they can’t show it to you (and give you specifics), I would start wondering.
Speaking of more for less – while this is a technical post (and I try to keep marketing off it), this spoof video with Erik Estrada (what a good dude!) was too funny (and too apropos – after all, we’re talking about efficiency and doing more with less!) not to post.
So… put another way - “In a nutshell – what does vSphere 4.1, VAAI and EMC support mean for you?”
What Intel VT did for compute in vSphere 4….
…VAAI-enabled EMC arrays do for storage in vSphere 4.1
So – what do you think about vSphere 4.1 and VAAI (I think it’s cool!)? What would you like to see next?
This is a great feature – my only wish is that more vendors *cough* 3Par *cough* would support this.
Sorry Mr farley this is def 1up for EMC :P
Posted by: Roggyblog | July 13, 2010 at 08:23 AM
@Roggyblog, have you seen this: http://www.3par.com/news_events/20100713.html
Thanks for the great write-up Chad, that's helped make things a little clearer for me.
Posted by: Simon Long | July 13, 2010 at 09:05 AM
Chad ... check the link again for powerpoint. I am getting a resource not found error.
Posted by: CC | July 13, 2010 at 10:20 AM
disclaimer: I work for EMC - but you know that Simon :)
I see 3Par announced the *development* of the plugin. Does that mean it is GA? Looks like an alignment announcement.
Posted by: Nicholas Weaver | July 13, 2010 at 10:25 AM
I had not seen that..thats great news!
Posted by: Roggyblog | July 13, 2010 at 10:55 AM
@CC - thanks - link should be fixed now... thanks!
Posted by: Chad Sakac | July 13, 2010 at 11:43 AM
Craig, the ppt link is ok now :)
Thanks Chad, really useful to explain that to my customer but in Spanish.
Posted by: MauroAyala | July 13, 2010 at 11:47 AM
Sorry Mr. Roggyblog, but please check out my blog post today on StorageRap. If you don't know how to find it I'll post a link, but I don't really like link-dropping in comments.
Yes, thanks for the good write-up Chad. There is a lot to array integration. One fine point I'd like to make: copying EZT to thin works very well on 3PAR arrays with zero detect - no problem, and without an expansion in capacity.
Posted by: marc farley | July 13, 2010 at 02:13 PM
@ Marc - great to see 3PAR working hard to integrate with VMware, good competition is always respected and welcomed. The note of how EZT on 3PAR works differently is a good one.
I would of course note that an array using the Hardware Zero acceleration can now choose to ignore SCSI WRITE commands that are issued as SCSI WRITE SAME in theory :-) This means that the whole "catch zeroes before they are written via an ASIC" is no longer a unique thing in the VMware use case :-)
While EMC arrays have a "zero reclaim" for things that have already been written, in the VMware use case, the VAAI stuff actually makes us able to avoid writing the zeros in the first place.
So, the "gotcha" really applies only to legacy EZT VMDKs, not new ones.
Posted by: Chad Sakac | July 13, 2010 at 06:23 PM
--- NetApp Disclaimer ---
Chad,
Solid post with lots of solid info on VAAI. It's nice to see the joint engineering efforts of the industry come together to enhance SAN technologies. I think it's very honest to say that with the first release of VAAI, VMFS becomes more like NAS in terms of integration and scaling with VMware.
I do disagree with some of the 'mis-information' shared in your post when you discussed NFS. Are your critical views representative of differences in capabilities between the SAN and NAS arrays from EMC?
In order to not muddy up your blog, I'll share my thoughts for those interested here:
http://blogs.netapp.com/virtualstorageguy/2010/07/vsphere-41-and-vstorage-apis-for-array-integration-vaai.html
Again, very solid post.
Posted by: Vaughn Stewart | July 14, 2010 at 12:37 AM
Vaughn, are you saying that VMware's own VMFS file system is not as well integrated with vSphere as NetApp's NFS? If so, I find that slightly hard to swallow. You can make arguments about how well certain functions work in each, but it's a bit arrogant to say what you said here. Did I miss news from NetApp today that was something more than an alignment with VAAI at some undisclosed future time?
Nicholas, and speaking of alignment announcements, I didn't respond to your question previously, but the release of 3PAR's VAAI plug-in will be in September.
Chad, the "in theory" part of your discussion about block zeroing meant what? Is there something a little less vague that you can say about EMC's zero-processing technology, such as how it works and when it might be available and on what platforms? Recognizing zeroes as they are read, for instance during a Full Copy operation, and reducing the capacity consumed by the copy is not the same thing as creating an EZT VMDK using Write Same.
Posted by: Marc Farley | July 14, 2010 at 02:25 AM
@Vaughn - my comments were not in any way critical of EMC's NAS implementation, which, while not perfect (I don't think ANY product is perfect), I think competes well with any competitor - and its market growth (50% Y/Y in Q1; Q2 will be announced shortly) reflects that.
What I meant is that:
I wish there was NFSv4 and NFSv4.1 (and eventually pNFS) support in vSphere, so an NFS datastore could be scaled out over multiple ports and multiple processors, and scale-out NAS was possible - like can be done with VMFS today.
I wish that NAS devices (EMC's and others') had failover characteristics that were as consistent under a broad envelope of use cases as block devices (failover can consistently be in the tens of seconds, but varies under various conditions and can extend to minutes) - I wish that failover characteristics were not a function of a variety of conditions, like is the case with how block devices and VMFS operate today.
But - those wishes are (today) wishes (and active engineering projects).
Historically, I wished that VMFS metadata update scaling (blown out of proportion in many cases, but valid in some - just like the beefs against NFS by some, some of which I listed above) was more like NFS metadata update scaling - and in vSphere 4.1, it is.
I wished that we could accelerate VM-level copy operations on VMFS like we can on EMC NAS - and in vSphere 4.1, we can.
These things go back and forth, and I could go back and forth on strengths and weaknesses - but I don't think that would be too fruitful. You and I (and our respective orgs) are working on updated NFS client work, continuing to tighten failure conditions, and driving vStorage API hardware acceleration support for NFS (and Storage IO Control support). People can move forward, regardless of protocol, with confidence. For the vast majority of use cases (VAST majority) - if that is the debate, wow - IMO, the customer is missing focus on bigger problems.
I'm not a protocol passionista - I believe you get leverage from both protocols.
@Marc - thanks for the question. The caveat listed above comes from the fact that an EZT VMDK written on a generic device (for a moment, assume this) can't leverage the extended copy command, because it's not a 1:1 block range map. If the zeros are never written to disk (but are allocated as far as the host is concerned), then EZT is always thin in practice at the array level. From my understanding (not claiming to be an expert on 3PAR), this is done in the ASIC in your platforms. With VAAI, when the SCSI WRITE SAME command is issued (hardware accelerated zero/init) - in the Enginuity implementation (checking on the FLARE implementation) - the zero is never written (and never needs to be reclaimed), only allocated - so a very similar effect (though obviously vSphere 4.1 specific, where the 3PAR approach is general).
Posted by: Chad Sakac | July 14, 2010 at 09:13 AM
Hey, how can I create the VAAI filters through the CLI in vSphere 4.1?
Posted by: Sameer | July 14, 2010 at 03:59 PM
Chad, I think 3PAR's VAAI WRITE SAME implementations are probably very similar. A 3PAR array doesn't write zeroes and have the ASIC detect them - it just doesn't write them. As for the different topic - full copy of EZT VMDKs to thinly provisioned volumes - I can see where this wouldn't work for most TP implementations because the clones made this way could be huge - especially if somebody made a bunch of clones of large EZT VMDKs. I wrote a blog post on this here: http://www.storagerap.com/2010/07/clarifying-vaai-capabilities-and-implementations.html
Posted by: marc farley | July 14, 2010 at 05:25 PM
Does anyone know if there will be VAAI support for EMC NX4 devices? We are a small setup and don't have the high-end devices. Just curious if smaller shops will get this feature from the lower-end storage units or not.
Posted by: Matt | July 15, 2010 at 10:23 AM
For VMFS "hardware accelerated locking", remember that the new 'Atomic COMPARE AND WRITE' command proposed for SBC-3 has not yet been finalized in the T10 SCSI standards, nor has SBC-3 yet been ratified.
So it's kind of silly to me to go championing this feature, which no hardware works with. Sure, VMware got its parent company EMC to do some early prototyping and add support in a small set of arrays. But most vendors won't add support until the standard is finalized (for fear it could change), and then it takes firmware updates to add support... which take a long time. Additionally, people with older arrays could be out of luck, as vendors may choose not to go back and add support.
While interesting, it's not quite the victory you portray (yet)
Posted by: John | July 15, 2010 at 05:44 PM
Looks like the PPT presentation is offline again!
Posted by: Andrea Mazzai | July 21, 2010 at 04:42 AM
Apologies Andrea - ppt link now fixed.
Posted by: Chad Sakac | July 27, 2010 at 02:33 PM
Great post! You've done a great job. I hope this new feature will be available soon from vendors other than EMC.
Posted by: Jose Manuel Carballo | August 18, 2010 at 06:22 AM
Great post Chad and thanks.
I was wondering if there were any numbers surrounding how the VAAI features may impact general snapshot performance.
I can imagine how the hardware assisted locking might improve a scenario where there were several snapshots concurrently on the same volume.
However, it's not clear to me to what extent VAAI might improve a scenario where a large log/delta file is being merged on a snapshot close for a single VM. The HA-zeroing might help somewhat I am thinking, based on what the VMDK looks like? Thanks!
Posted by: BlueShiftBlog | August 21, 2010 at 09:27 AM
One feature that I would like to see, if it's not present in 4.1 and VAAI, is to have ESX send 'zeroize' commands to the array whenever a VMFS block is no longer assigned to a VM. Why? If I vMotion a 200GB VM from LUN1 to LUN2, I want the array to reclaim all of the previously allocated storage on LUN1. The only way the array will know the data is no longer valid is if VAAI tells the array to zeroize the blocks.
Another feature VMware needs to implement is Guest OS integration with VAAI, such that when NTFS deletes a file, VAAI tells the array that those blocks aren't needed and zeroizes them. This is the same concept that the Veritas SAN volume manager uses with some arrays, such as the 3PAR, to reclaim freed NTFS space.
If VMware implemented both of these features, then VMFS LUNs hosted on 'advanced' arrays would stay THIN throughout their entire lifecycle. That would eliminate the need to use sdelete or fill up a VMFS volume with an eager-zeroed VMDK to trigger the array's zero reclamation features.
Posted by: Derek | August 21, 2010 at 11:55 AM
Chad, actually happened across this while reading about the 10 GigE stuff today. Since you mentioned other vendors in the comments, thought I'd point out that HDS also supports VAAI on the AMS2000 line with firmware 0893/B or higher. We've got more detail at http://storagemeat.blogspot.com/2010/08/geeking-out-with-vaai.html.
Thanks!
Posted by: Storagemeat.blogspot.com | August 24, 2010 at 01:14 PM
Chad,
VAAI is supported on CLARiiON with FLARE 30 and on VMAX in Q4 this year, but is there any news on when VAAI will be supported with VPLEX?
Posted by: Tomi Hakala | September 02, 2010 at 01:28 PM
I did some performance testing of VAAI with a 3PAR T400 array, and wow, 3PAR screams when creating zeroed VMDKs. VAAI performance is 20x faster than without VAAI.
http://derek858.blogspot.com/2010/12/3par-vaai-write-same-test-results-upto.html
Posted by: Derek | December 05, 2010 at 02:31 PM
Hi Chad,
I would like to use some of your slides in a post about VAAI on my blog. In particular, I would like to use the pictures about the storage APIs with/without VAAI, translating them into Italian (the post will be in Italian).
Can I use them? (Giving credit to you, of course!)
Thanks
Giuseppe
Posted by: Gguglie | March 31, 2011 at 02:23 AM