Ugh - yesterday was a tough day. My team is a global team, and the phone started ringing and emails started pouring in as soon as the clock ticked over to Aug 12th in Australia. The only reason I haven't posted earlier is that I was up to my eyeballs.
The issue is described in this KB here. While it was rough on me as a VMware partner, and on EMC as a VMware customer (you can see from my other posts that we have thousands of ESX servers, and while production goes through a change control process and is on 3.0.2 and 3.5u1, our solutions validation and demonstration labs update as fast as possible), people at VMware are in agony.
I think a lot of people in the blogosphere are extremophiles, defined this way:
Many of the comments on VMTN just break your spirit a bit until you harden yourself. Anger and frustration are reasonable, but anonymous posters and fan-boys, I have little patience or respect for you (though I doubt you care). Strong feelings are good - passion is good - but don't hide if you feel strongly about your position and think it's reasonable.
I'm getting hardened to it a bit - along with Intel, Cisco, Microsoft and other big players, EMC gets a lot of flak from these kinds of folks - people want the small guy to beat the big guy, revel in it, and like slinging mud.
VMware is a weird exception - they are the 800 pound gorilla of their space, but it happened so fast that they maintain the "startup" halo around them. This inevitably changes as growth happens, as they solidify themselves as a key player in the new IT infrastructure space, and as new startups work the David/Goliath story humans love.
Here's one person's opinion - no more, no less.
- This sort of happens to all companies - great/weak, small/large. I'm not blasé about it, it's a terrible thing. The real test is how the organization deals with it - are they open, transparent, fast to fix, and afterwards, do they improve their process?
- If it happens with ANY frequency, regardless of the company, there are serious problems - we are all only as good as our products.
- The way to reduce the risk to your IT infrastructure from this, if it really is inevitable, is to implement change control. A classic example is that Enginuity is currently at 5773, but DMXes ship with 5772 code for now. The same holds true with all our platforms - there's a window we call a PPR - "Phased Product Release". Even with long-established processes, we've STILL been hit (there was a similar date-based bug in the Kerberos code used by our CIFS server a while ago). Customers implement their own version control on top of what we do. The only reason customers haven't been behaving that way with VMware is that their software quality has been SO good.
Let's look at this case:
- VMware has clearly been open, transparent, and aggressive with their fix, which was made available late the same day.
- This is a first for VMware. A bad first no doubt, but still a first. The question is whether it's the first of more to come, or an isolated incident. While I may be biased, I think they deserve the benefit of the doubt based on their history of quality.
- Every customer needs to apply good IT best practices on change control to all parts of their mission-critical infrastructure - VI is no different, and arguably one of the most important places to do this, because, like core network switches and storage infrastructure, it's hyper-consolidated.
What was your experience, and what are your thoughts? Extremophiles welcome, but rationalists desired :-)