
June 12, 2008



Stephen Foskett

Chad, I just wanted to welcome you to the blogging world and say "wow"! You're putting up some great information and I for one intend to read every word of it. So thanks, man!


Chad, another outstanding post. You have quickly made the short list of blogs that I check on a regular basis. I thoroughly enjoy your insight on the topics you have covered so far.

I went to EMC World with an interest in learning more about SRM and came away from there with SRM at the top of my list of solutions I am most excited about for our environment.

It can be a very dangerous game that some vendors play when they pitch products w/o thinking about or just w/o caring about how they fit into a customer's environment as a whole. It's a recipe for a solution sale with a short-lived customer relationship.

Thanks again for the great insight.


I have to agree with most of your reasons. However, the simplicity of having one cluster seems almost too good to pass up and might even make it worth it. We are a minor site and we will probably only have a dozen ESX boxes, and for us there is no problem with having a second site with dark fiber connectivity. It's not even expensive for us... The storage side actually might be doable using NetApp and their Metrocluster setup.

However I do really worry about a split brain!

SRM is not trivial either and in case of a failure then you have to worry about doing a manual failback and this step is not that easy to test in production!

Chad Sakac

Janake, thanks for the post.

Metrocluster is another of the technologies on the market like some of the other examples I mentioned that can do this.

In that case, dark fiber is needed, because you cut the FAS filer in half and stretch the backend enclosures (in effect) between sites. It leverages the way that NetApp does failover (starting the failed filer as a JVM on the other head), and uses SyncMirror to have a copy of the data (that's the simple stretched Metrocluster case - 500m is the target distance), or stretches via FC switches (up to 100 km).

Metrocluster is neat and does work - but not for the faint of heart (like most of these stretched storage configurations including EMC's solutions). I'd invite any of my NetApp colleagues and readers to post their perspective.

In the case of a disaster - you will not be able to VMotion (since the source server will be down), you will be doing a mass VM HA operation, which will take as long as SRM would, so there's no Recovery Time Objective (RTO), or Recovery Point

I think you'll find that the MUX gear needed to go over your dark fiber (in any case unless you're literally just stretching the cables), the FC switches needed for the Fabric Metrocluster, and Metrocluster licensing are likely FAR, FAR more expensive than the SRM direction with SnapMirror (if you HAVE to use NetApp :-)

The other thing that I wouldn't underestimate is that even with 12 ESX servers - you could be talking about hundreds of VMs - SRM automation of startup sequence, reporting and notification is not to be minimized. You will need to do it manually otherwise - just letting VM HA go won't work (startup sequence matters).

This is an example along the lines of what I was outlining. While technically possible, I don't get it.

- No improved RPO/RTO (maybe fractional over SnapMirror, but with your infrastructure, sync replicas would be possible - EMC supports this, but NetApp doesn't support SyncMirror with SRM - perhaps this is why they are pushing you this way?)

- relatively complex storage layer configuration to create the stretched back end. I respect NetApp's simplicity - but Metrocluster is not simple. (http://www.netapp.com/us/library/technical-reports/tr-3548.html)

- Compared with just sync replication + Site Recovery Manager, the SOLUTION (not just getting data there) would involve complex manual scripting for failover and restart - not just at the Storage (NetApp) layer (it's true that failover is simple ONCE you have Metrocluster working), but at the VMware layer (unless you're just going to trust VM HA to start everything together and hope you won't get services failing all over the place). Of course, split brain is also a possibility.
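To make the startup sequence point concrete, here is a minimal sketch of the kind of ordering you would otherwise have to script by hand. It is not how SRM or VM HA works internally; the VM names and dependencies are made up for illustration, and the only library used is Python's standard `graphlib` (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical VM names and dependencies, for illustration only.
# Each VM may start only after every VM in its set is up.
depends_on = {
    "domain-controller": set(),
    "database": {"domain-controller"},
    "app-server": {"database"},
    "web-frontend": {"app-server"},
    "file-server": {"domain-controller"},
}


def startup_waves(deps):
    """Group VMs into waves; every VM in a wave can be powered on together."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # VMs whose dependencies are all up
        waves.append(ready)
        sorter.done(*ready)  # mark this wave as started
    return waves


waves = startup_waves(depends_on)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

With five VMs this is trivial; with hundreds, keeping such a map accurate and tested by hand is the real cost.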

So - who does this make sense for - you as the customer, or NetApp as the vendor? (I would argue neither, as it negates one of NetApp's great strengths, which is simplicity.)

I'd have to imagine that if I were NetApp, **AND** if you are committed to Metrocluster (perhaps it makes sense for other, non-VMware workloads), I'd tell you to leverage NFS datastores - they will be the most proven solution with Metrocluster and NetApp. NetApp cluster failover works well with NAS and works with block protocols as well, but with a few more asterisks.

I hear you loud and clear on failback. Failback isn't built into SRM v1.0 - but there are several failback mechanisms that are documented in the GA release. As well, EMC has worked above and beyond the basic SRM integration requirements to help the automated failback with several of our replication technologies. It's also not complicated even in the simplest scenarios - you create a recovery plan in the opposite direction - not so much "failback" as much as "failover again". I will tell you this definitively - storage failback is easy (Metrocluster, or any of the array replication technologies used by SRM). What's hard is VM failback (without SRM).

So - is this a case where you are driving the requirement - or the vendor is saying "hey this would be cool!"? (perhaps because SRM doesn't support NFS yet, or because you have a synchronous requirement?)

With a bow to my respected colleagues at NetApp - whenever I've seen this Metrocluster positioning, it's the latter, not the former. Just because it's technically possible doesn't make it a good idea.


Hello again and thanks very much for your insightful reply to my post and for clarifying a few points. This is one of the great things about blogs and yours is on my permanent reading list.

We're currently investigating different storage vendors and NetApp is by no means the only one; we currently run the EMC Clariion. Every storage vendor has their gotchas and you clearly pointed out one of the gotchas with Metrocluster.

I might be chasing the perfect dream but I do hope that we some day can have a simple stretched datacenter between different sites. But alas, not today.

When it comes to IT my philosophy is "seeing is believing"... ;)

Chad Sakac

Janake - thanks - BTW, I cringe if I came across as "Metrocluster is complex" - it should come across as "all geographically distributed storage" (Invista, Yotta Yotta, Lefthand's distributed clusters, everyone). I'm throwing them all under the bus together.

That doesn't mean they are bad. It means - BE VERY CAREFUL. Some are complex in setup, simple in operation (Metrocluster, Invista, YY), others are simple in setup, complex in operation (LH).

You need to be careful, because in general (not always - sometimes requirements are strict and must be met), the complexity trade off is the wrong way to go.

What happens though, is the vendors don't share "OK, this is what it's really going to take", and "OK, and this is how it's actually going to work".

Everyone is quick to point out the flaws in others. Trust the vendor who points out flaws in **themselves**, and guides you as a customer on how the SOLUTION would work (flaws and all).

Thank you for being an EMC customer, and if you want to try MirrorView/S across that link with SRM, I'm happy to help you! Regardless - share your experience!


Hey Chad, great post. Only gotcha right now is that MVA isn't supported with SRM. I've sent some mails to a program manager who works with the Clariion developers but she said she was 99% sure it wouldn't be done before Q4. Anything you could do to help speed this up would be greatly appreciated. Most of our customers want some serious distance between sites and don't want to have to buy a Celerra to do it.

Chad Sakac

Ed - thanks for the comment.

The MV/A support in the CX SRA isn't a lot of work - but right now it's stacked behind a couple things that will be obvious in a little while. We'll get it done as fast as we can.

Rationale in the mothership was that with Recoverpoint now supporting the integrated splitter, RP (which is MUCH more feature-rich, supports long-distance async configs, adds continuous data protection and very good WAN compression) isn't much more expensive than MV/A. You don't HAVE to have a Cisco MDS switch or Brocade intelligent fabric anymore - the CX can split the I/O internally.

Still - I hear you, and it's been a loud drumbeat. Partners and customers steer our direction - we'll speed up straight up MV/A support.

Chad Sakac

Hey, quick update here - MV/A is coming in the Sept update - and for EMC customers who need it now, contact your local EMC team or me, happy to help you immediately!


A very insightful article on a very complex topic, thanks Chad. With VMware's direction pointing to introducing more and more VCENTERS in the environment (e.g. servers, VDI, maybe some Lab Manager, all falling under an SRM framework) ... it does make you wonder about all the implications of stretching as you've pointed out.

Which for me boils down to - At least with cluster+VCENTER being bound to a site, you always know where things are at. Let's face it, sometimes HA does not completely live up to expectations, so if you don't have a VCENTER Cluster in place, your environment will be invisible until you get VCENTER UP.

I really like your idea of adding more site awareness into VCENTER. To extend upon that, I'd also say yes to giving it Active-Directory style multi-master redundancy. So in day-to-day operation each site's VCENTER is responsible for looking after its local hosts, but in the event of a failover the role can be run from another VCENTER in the forest. This of course would be dependent on networking, but for simplicity let's presume all VCENTERS have connectivity to all ESX hosts in the whole environment. Also it would give the ability for cross-vCenter clusters to be set up. Perhaps these things could be called Meta-Clusters.

Perhaps VMware will facilitate this, and then rely on VSTORAGE and the storage vendors to write more VCENTER plugins. E.G. Giving VCENTER "Metro-Cluster" site-awareness and also the ability to sync this awareness across multiple VCENTERS. How cool would it be to be able to evacuate a building by evacuating that half of the cluster, putting all these hosts into maintenance mode, and then cut-over the storage, and then cut over to the other VCENTER.

This would all be a major change to the way VCENTER works, but that's also my utopia.

The comments to this entry are closed.
