It’s kind of funny: in our small but vibrant VMware community, there is a certain “zeitgeist” where people end up thinking about the same topics at the same time. Look at this:
- I have a tickler on my calendar that says “do a post on the new vMSC category”
- Today, I see that the always excellent Duncan Epping did a great post on vSphere 5.0 HA and metro/stretched cluster solutions.
- There’s also the session Lee Dilworth from VMware and I are presenting on this topic at VMworld 2011 (and updating for VMworld 2011 in Copenhagen, Oct 18-20) here.
- The always excellent Scott Lowe did a post on his Updated Stretched Cluster presentation
- I get an email internally where a customer is confused on this topic…
So – what’s driving this “synchronized zeitgeist”?
Answer: a lot of internal work is coming to a head, and more and more stretched cluster deployments mean these use cases are now becoming much more mainstream, and simpler.
1) There is a new VMware HCL category, “vSphere Metro Storage Cluster”. This is the result of a LOT of work. I’ve created a link to it here. This is interesting in the sense that prior to this, while there was a lot of vendor work, info, and KB articles on this, there was no formal test harness for the configurations and failure scenarios associated with stretched vSphere clusters (look at Duncan’s post for a view of some of the major failure scenarios). Together, we built a whole test harness that is now the standard for this, and it will be very useful going forward.
2) The VMware KB articles around this use case are now a LOT simpler. This is due to a ton of the changes in vSphere 5 HA and DRS behavior that Duncan points out. You can see the updated KB article by clicking below. Remember, DRS host affinity rules are now supported in this use case when using vSphere 5, so go ahead and use ‘em (there’s a quick scripting sketch right after this paragraph).
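For folks who would rather script those host affinity rules than click through the vSphere Client, here’s a minimal pyVmomi sketch of creating a soft (“should run”) VM-to-host rule for one site of a stretched cluster. Treat it as a sketch under assumptions, not gospel: the vCenter address, credentials, cluster name, group names, and the “-a”/“siteA-” naming conventions used to pick hosts and VMs are all hypothetical, and you’d do the same thing with PowerCLI if that’s more your speed.

```python
# Minimal pyVmomi sketch: create DRS host/VM groups and a "should run"
# VM-to-host affinity rule for a stretched cluster. All names below
# (vCenter, credentials, cluster, group names, naming conventions) are
# hypothetical - adjust for your environment. Add SSL handling as needed
# for self-signed vCenter certificates.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com", user="administrator", pwd="secret")
content = si.RetrieveContent()

def find_cluster(name):
    """Find a ClusterComputeResource by name via a container view."""
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    try:
        return next(c for c in view.view if c.name == name)
    finally:
        view.Destroy()

cluster = find_cluster("StretchedCluster")

# Group the Site A hosts and the VMs that should prefer Site A
site_a_hosts = vim.cluster.HostGroup(
    name="SiteA-Hosts",
    host=[h for h in cluster.host if h.name.endswith("-a")])
site_a_vms = vim.cluster.VmGroup(
    name="SiteA-VMs",
    vm=[v for v in cluster.resourcePool.vm if v.name.startswith("siteA-")])

# mandatory=False keeps this a soft "should run" rule, so HA can still
# restart these VMs on the other site if Site A is lost.
rule = vim.cluster.VmHostRuleInfo(
    name="SiteA-VMs-should-run-on-SiteA-Hosts",
    enabled=True,
    mandatory=False,
    vmGroupName="SiteA-VMs",
    affineHostGroupName="SiteA-Hosts")

spec = vim.cluster.ConfigSpecEx(
    groupSpec=[vim.cluster.GroupSpec(operation="add", info=site_a_hosts),
               vim.cluster.GroupSpec(operation="add", info=site_a_vms)],
    rulesSpec=[vim.cluster.RuleSpec(operation="add", info=rule)])

cluster.ReconfigureComputeResource_Task(spec, modify=True)
Disconnect(si)
```

You’d then repeat the same pattern for the Site B groups and rule, so each set of VMs “prefers” its home site but can still be restarted on the other one.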
3) Lots of work on hardening EMC VPLEX around this use case… (more combined zeitgeist – just in from the product team)
- GeoSynchrony 5.0 introduced the VPLEX Witness: it allows discrimination between a site failure and a site partition, which means that I/Os are automatically enabled on whichever side survives a failure event (see the little sketch after this list for the basic idea).
- Update: note that VPLEX GeoSynchrony is the software that runs VPLEX. VPLEX has 3 major operating modes for every device it presents: Local (in one datacenter), Metro (stretched across synchronous distance) and Geo (stretched storage across asynchronous distance). As of right now, vMSC is not supported with Geo, only with Metro. VMware and EMC continue to work on this; perhaps I’ll do a post on the technical challenges. It works while there is no partition, but the partition scenarios are very, very funky indeed, and EMC and VMware agreed that we want to keep it simple for now: Metro distances only.
- VPLEX in GeoSynchrony 5.0 changed its I/O suspension behavior, and vSphere 5.0 now recognizes this as a PDL (Permanent Device Loss) condition. Net result: when a split brain condition does occur, vSphere can determine which side of a VPLEX is alive and therefore restart VMs (and I/Os) on that side.
- Note: this still isn’t working 100% right, and it will be updated in a future vSphere patch. Short version: the “VM kill on PDL” behavior doesn’t work perfectly and still results in APD (All Paths Down) in some corner scenarios. We’re working on this very closely, but people can move forward with confidence without waiting. BTW, before you ask (a good question that was asked internally): does enabling VM Failure Monitoring in vSphere HA fix this behavior? No.
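To make the witness/PDL interplay a little more concrete, here’s a tiny conceptual sketch. This is NOT VPLEX’s actual algorithm (the function and parameter names are purely illustrative); it’s just the basic reasoning for why a third-site witness lets each side tell “my partner died” apart from “I’m the one that got partitioned off”, and where the PDL behavior above comes in.

```python
# Conceptual sketch only - not the real VPLEX logic. Illustrates why a
# witness in a third failure domain breaks the tie when the inter-site
# link goes away.

def io_decision(can_reach_partner: bool, can_reach_witness: bool) -> str:
    """Decide whether a storage site should keep serving I/O.

    Without a witness, losing the inter-site link looks identical to the
    partner site failing, so each side must either guess (risking split
    brain) or suspend. With a witness, the survivor can be picked safely.
    """
    if can_reach_partner:
        return "serve I/O"  # normal operation
    if can_reach_witness:
        # Partner is unreachable but the witness agrees this site is the
        # survivor: keep serving I/O here; the other side suspends.
        return "serve I/O (witness grants this site the survivor role)"
    # Can reach neither partner nor witness: assume this is the isolated
    # or failed side and suspend I/O. vSphere 5 sees that suspension as
    # PDL and can restart the affected VMs on the surviving side.
    return "suspend I/O"

# The three interesting failure cases:
print(io_decision(can_reach_partner=False, can_reach_witness=True))   # other site failed
print(io_decision(can_reach_partner=False, can_reach_witness=False))  # this site isolated
print(io_decision(can_reach_partner=True,  can_reach_witness=False))  # witness down only
```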
Short version: look at the “what way should you go forward” guidance in Scott’s presentation, or in BCO2479 (Lee’s and my session), to determine the right solution for disaster avoidance/disaster recovery (vMSC or SRM). If vMSC is the right answer, move forward with confidence; it’s getting better and better every day.
Also, know that we’re not stopping here. Lee and I discuss some of these things in BCO2479… Together, EMC and VMware are looking at how we could “surface” site bias via VASA, and how we could use that to automatically configure things like host affinity rules. We’re working to make multi-site scenarios (rather than just 2-site) work better, including hybrid solutions of vMSC for two metro-separated sites plus a third async site with SRM, as well as future multi-site vMSC scenarios. Lots of great stuff!!!
Comments and input welcome – are you using or contemplating stretched vSphere clusters?
Hi Chad, thanks for the post, VPLEX is definitely an awesome technology! The information out there is a bit confusing...
The VMware KB states the configuration was tested on ESXi 4.1 and without a VPLEX Witness, yet the KB then goes on to talk about the witness; it’s shown in the solutions diagram and discussed in the failure scenarios.
The HCL lists VPLEX as supported, but only with ESXi 5, even though the KB states it was tested with ESXi 4.1
So my question(s) are:
What versions of ESXi support VPLEX Metro, and is the witness supported?
What zoning configurations are supported, cross-connected fabrics, isolated fabrics (I know EMC supports both, but what does VMware support)?
Posted by: Josh Coen | October 06, 2011 at 09:18 PM
Nice article. I'm sure you have seen this, but did you know that metro/stretched clusters are not part of EMC's focus, like no focus at all?!?
http://media.netapp.com/documents/ar-midrange-storage-q32011.pdf. (page 7)
Thought that was kinda funny :)
Posted by: winter | October 10, 2011 at 05:02 PM
@Josh - VMware only explicitly supports the configuration noted in the vMSC: vSphere 5, with a witness, with a non-uniform access model (isolated fabrics). It's notable that other solutions (as noted, stuff people are using like NetApp MetroCluster, HP LeftHand, and VPLEX with cross-connected fabrics) obviously WORK, but for various reasons aren't on the vMSC. They very well may be in the future. That means that VMware will point to the vendor on support cases, which may be perfectly OK for a given customer; it's just important for customers to understand that.
@winter - that's EXACTLY why I tell my team (and as many EMCers as I can) to try their DARNDEST to not speak negatively about the other guy - and MAN, those "check box lists" (which everyone can't seem to resist) are the worst of the worst. It's darn hard to stay on top of what YOU do, and anyone has a ZERO percent chance of knowing someone else better than they know themselves.
It's worth pointing out: EMC can support async stretched storage cluster configurations (in some use cases). EMC can support scale-out stretched storage clusters. EMC has a ton of resiliency both locally and remotely. While there's more we need to do, I feel we're pretty "all over" the stretched storage use case.
Pretty funny!
Posted by: Chad Sakac | October 10, 2011 at 05:22 PM
Nice article, and I was also at the presentation in Copenhagen.
I was one of those bastards raising my hand when you asked about LeftHand customers :-) You said that HP LeftHand was non-uniform storage, but I think it really depends on which level of redundancy you configure. In our case, we use Network RAID 10, which makes all the nodes active, and the ESX hosts can access storage boxes on each site.
We implemented this solution in January using vSphere 4.1 Update 1, and it has been working great the entire time. Didn't know it was an unsupported setup until I attended your session, so now we will speed up the 4.1u1->5.0 process so we have a (more) supported setup.
Basically, we have three datacenters in this setup, all connected with redundant dedicated dark fiber links (lit up with 10Gbit links).
In Site A, we have ESX hosts 1, 3, 5, 7, 9 and storage boxes 600-1, 600-3, 1000-1, 1000-3. In Site B, we have ESX hosts 2, 4, 6, 8, 10 and storage boxes 600-2, 600-4, 1000-2, 1000-3. In Site C, we have our vCenter and our FOM (Failover Manager, which would be the witness server). We have lost the storage in Site A without any problems (we pulled the plug on all the storage boxes in Site A to test), and we have done planned maintenance on Site B without anything going down (maintenance mode on the ESX hosts, migrating the VMs to Site A with vMotion). Affinity rules for the VMs make sure they are in the right DRS cluster.
I hope this setup will be supported in the future by VMware as well, but since we are following HP's best practices we don't see any problems at the moment.
I like the VPLEX concept; I think we might look into it in the future.
Thanks for a great session! :)
Posted by: Apaulsson | October 21, 2011 at 04:50 PM