OK - I lay claim to the first official SRM production failover! SRM GA'ed on Friday last week. Find out more here (including an eval!): http://www.vmware.com/products/srm/
LOL - you just can't make this stuff up....
So, today is the GA of Site Recovery Manager - you can download a 60-day eval here: http://store.vmware.com/servlet/ControllerServlet?Action=DisplayPage&Env=BASE&Locale=en_US&SiteID=vmware&id=ProductDetailsPage&productID=105156500
Note that you will get access to the SRA (the array vendor integration part) near the tail end of the checkout.
Three important things I would guide any customer:
- Check the support matrix here: http://www.vmware.com/pdf/srm_10_compat_matrix.pdf (Not to be a weenie about it - but remember my "investments" post here: "So, What Kind of Investments is EMC Making in VMware?" - take a look at the support matrix. EMC has 23 supported configurations. The next closest is 8. Now, of course, others have one platform, we have many - but that fits our philosophy that customers are different, and means we need to do more work to cover a broader set.)
- If you're a CLARiiON customer - only MirrorView Sync is supported - for now. MV/A is coming in Q3 (soon)
- If you're a customer who has a VMware team and a Storage team - don't eval this without talking to each other. If you're that type of customer and are in a VMware team that just wants to see how SRM works and play with it (without talking to the storage team) - use something like the Celerra Simulator (good news coming on that one in a post in the next day or two intrepid readers!)
BTW - Why is it only MV/S for now? Not really a good reason - but here's the honest answer: even EMC has resource constraints. Now that FLARE 26 has integrated splitter support, we're seeing a lot of async customers going the RecoverPoint route since it's now a lot cheaper (you don't need a Cisco MDS/SSM, a Brocade 7600, or a Brocade 48000 with an FA4-18 blade - all great, but none cheap), and it buys you so much more - WAN compression (saves a TON of money), large-scale consistency groups and, most importantly, continuous data protection. BUT - I've heard the feedback loud and clear from happy MV/A customers in the last 72 hours: "we want MV/A support!". This was always planned, but is slotted behind something else big - but we're buckling down and finding a way to pull it in sooner.
Back to the story at hand.... Read on, the timing is ridiculous!
So, I remote into the lab while on the road yesterday, and all my "production" side network was dead.
I get back to Toronto, and go down to the closet. Quick reminder for those of you that haven't read the post - you can see screenshots here: Building a Home VMware Infrastructure Lab. Huh. The Dell PowerConnect switches are showing a lot of 100Mbps ports, which is weird. Then I notice that my two production Intel ESX servers won't boot (those are the two at the hottest part of the closet, right by the vent). Oh crap.
One bad problem with my configuration is that you don't get much of a power supply in a $29 case. And, what do crappy power supplies do? When they get hot, SIZZLE. When you smell smoke, that can't be good. That's what happened. One server was dead, the other was "kind of on", which was worse - in that case the motherboard was cooked too and was transmitting noise on the NIC - which actually caused broader IP connectivity problems on the AMD cluster. My "DR" hosts are on their own switches so they were still happy.
As soon as I removed the flaky motherboard - the AMD cluster came back to life. So, I've got 50% of my "production cluster" back - now what?
I realized I was working with the final pre-GA build of SRM (build 95202) and was replicating between the "Production Datacenter" and the "DR Datacenter" - and just ran the recovery plan...
POOF! Like magic, everything is back to normal (minus my Exchange 2007 server - my DR cluster is on Intel E4300 CPUs, which don't have VT, so I can't run the 64-bit VMs. I wish Intel would stop it with VT being a Q vs. E thing.)
So - now I'm rebuilding my Intel cluster. Will be good, as I'll be able to do an update on some parts which are no longer sold (hard to find the motherboards I used, for example). I can tell you this already - the Intel P35 MB booted straight up, not a single issue with the SATA controller. Have it now back in the Intel cluster (one MB is G33, the other a P35 - works fine including VMotion). Take that, Mr. "VMware's hypervisor driver dependencies make it harder
Picture of construction in progress.
I'm posting the Celerra Sim and instructions in my next blog post - stay tuned.
I have to say - we've always known how huge SRM is going to be, and to actually use it (as opposed to play with it), just hammered it home for me. Already starting feedback on SRM 1.5 with the VMware team. This also gives me a chance to play with some EMC SRM failback secret sauce :-) Also, reinforced an important home-brew PC lesson.... DON'T SKIMP ON THE P/S :-)

Hi Chad,
"EMC has 23 supported configurations. The next closest is 8. Now, of course, others have one platform, we have many - but that fits our philosophy that customers are different, and means we need to do more work to cover a broader set)"
What that tells me is that I would have to have like-to-like configurations across sites. How would SRM work with a DMX in production and a Clariion at the Recovery site...with RecoverPoint deployed?
Great blog!!!
Cheers
Nick
Posted by: Nick Triantos | June 24, 2008 at 02:29 PM
What I want to know is how you get the hardware repairs through your purchasing department? Surely your CFO forces you to cost-justify the cost of your new hardware against the new dishwasher, microwave, or washing machine!?! :-)
Posted by: Chas Hockenbarger | June 24, 2008 at 05:26 PM
Nick - thanks for the comment. I'm an avid reader of yours as well (I'd encourage others to check it out - http://blogs.netapp.com/storage_nuts_n_bolts/)
(I wish I could make mine look half as sharp as yours :-)
I'm doing a post on our "right-around the corner" Replication Manager release (it adds similar functionality to the soon-to-be-released SnapManager for Virtual Infrastructure), and I hope you'll not mind if I use your great SMVI post as a template of a good posting (I'll of course credit you ;-)
Answer is that homogeneous ("like to like") configurations across the families (Celerra-Celerra, CLARiiON-CLARiiON and DMX-DMX) are supported.
RecoverPoint also has a Site Recovery Manager SRA (see table 11 in the compatibility matrix), so you can use it in exactly the heterogeneous use case that you describe (DMX-CLARiiON).
If anyone wants to see RecoverPoint in action (it does a lot more than just replicate the data), there are constant webcasts (as there are for all things VMware/EMC) here: http://info.emc.com/mk/get/RE?reg_src=web&P.ctp_program_execution.Source_ID=AMA00006771
There's also a customer that was using it pre-GA in the beta here:
http://www.emc.com/collateral/demos/microsites/mediaplayer-video/video-hosted-solutions.htm
Right now, we qualed it with IBM as well - but in theory, it is totally heterogeneous, and we are working to qual with all of the serious storage players, including NetApp.
Posted by: Chad Sakac | June 24, 2008 at 05:48 PM
Thanks Chad. I saw the IBM DS entry in there. Sticks out like a sore thumb
Posted by: Nick Triantos | June 24, 2008 at 06:47 PM
BTW...
Wouldn't a DMX production-RecoverPoint-Clariion DR scenario with SRM require a fabric splitter, given that it appears that only Clariion with Flare 26 has an embedded splitter and that the host-based splitter requires RDM only on Windows 32/64-bit systems?
Cheers
Posted by: Nick Triantos | June 24, 2008 at 07:18 PM
Nick, you are absolutely correct. With the caveat of the host splitter and RDMs (we don't see too many customers doing that in VMware configs except where they want to replicate very specific VMs and no others - usually for tight WANs), this configuration requires a fabric splitter.
This can be either a Cisco MDS with an SSM, a Brocade 7600, or a Brocade 48000 with an FA4-18 blade.
Most customers like the ones you've described have these enterprise-class switches already, so it's a matter of adding the SSMs or the FA4-18 blades. Heterogeneous replication configurations (like SAN virtualization for heterogeneous mobility) are something generally used by larger customers.
The CLARiiON integrated RecoverPoint splitter is used for homogeneous CX-to-CX replication scenarios (with or without Site Recovery Manager) where customers want high degrees of WAN compression, very large numbers of consistency groups, and continuous remote replication plus local CDP. I.e. when they want more than they get with the "out of the box" replication of MirrorView. It's also quite price competitive, for a lot of bang for the buck.
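To make the splitter rules in this thread concrete, here's a minimal sketch in Python. It's an illustration only: the function name, its parameters and the returned labels are all hypothetical (this is not an EMC or VMware tool or API), and reading "FLARE 26" as "FLARE 26 or later" is an assumption on top of what's stated above.

```python
def required_splitter(source_array, flare_release=None, windows_rdm_only=False):
    """Pick a RecoverPoint splitter type from the rules discussed in this thread.

    All names and return labels are hypothetical, for illustration only.
    """
    # CLARiiON with the integrated splitter (FLARE 26; "or later" is an assumption).
    if source_array == "CLARiiON" and flare_release is not None and flare_release >= 26:
        return "array"
    # The host-based splitter only applies to Windows VMs using RDMs.
    if windows_rdm_only:
        return "host"
    # Everything else (e.g. DMX in production) needs a fabric splitter:
    # Cisco MDS with an SSM, Brocade 7600, or Brocade 48000 with an FA4-18 blade.
    return "fabric"


# The DMX production to CLARiiON DR case from the comments above.
print(required_splitter("DMX"))
print(required_splitter("CLARiiON", flare_release=26))
```

The order of the checks mirrors the discussion: the array splitter is the cheap path when it's available, the host splitter is the narrow exception, and the fabric splitter is the fallback.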
Posted by: Chad Sakac | June 24, 2008 at 11:14 PM
@Nick - one followup - it won't be long before even those DMX-CX/NS configurations won't require a fabric splitter.... Hint, hint...
Posted by: Chad Sakac | December 04, 2009 at 09:41 PM
Informative post. It has made me eager to go through your upcoming posts.
Keep up good work.
Posted by: victor | April 07, 2010 at 07:43 AM