I saw this post by my respected colleague Duncan Epping at Yellow-Bricks, and it prompted a comment. As soon as I was three paragraphs in I figured “this really should be a blog post”.
Within the next 24hrs, I got pinged from all three continents from trusted sources on the same question – IO scaling and VDI. Over the last week I’ve been deeply engaged with 3 massive VDI deployment projects that are struggling with this point.
I’ve got an interesting (and I think insanely fortunate!) perspective/visibility into what’s going on in VDI. I’m not claiming that it’s better/worse than anyone else – but I get to see VDI projects around the globe, in all sorts of verticals, and at all scales (100’s of desktops up to 10’s of thousands and even hundreds of thousands) as my team of people partner with customers.
Duncan/Richard G asked a good question – why isn’t there more View-oriented blogging? IMHO, View blogging is rare because VDI architectures in general are complex (not that any individual element is complex, but rather the end-to-end picture and client lifecycle of patching and A/V can be complex) and very, very variable from customer to customer (workload type, client type, connectivity type, connection management type, virtualization layer config, storage type/config all vary wildly). This means it takes a bit more time to get your head wrapped around it all – but also suggests that a ton of value could come from more open dialog via blogs.
So… Let’s do some View posts :-) Read on….
(UPDATED – April 2nd 2010 – there is a newer Celerra VSA 220.127.116.111 here)
All – you can get the new Celerra VSA… This VSA is feature equivalent with the just released DART 18.104.22.168.
It has lots of new stuff, including:
It’s been great working with Celerra engineering on this – they’ve committed to cutting VSAs within days of major DART revs. Also, we’re working to automate much of the setup – expect progress on that in Celerra VSAs to come. Other EMC teams are also furiously working on their VSAs, and playing with them internally – I think you’re going to be pleased!
Follow the link for the OVA and the VMware workstation image. Remember to use Intel VT or AMD-V (instructions are at the link)
The next version of the (as always) free EMC Storage Viewer is now generally available! If you’re an EMC customer and not using this, you’re missing out on something that is totally gratis and helps a lot.
To get it, login to powerlink (http://powerlink.emc.com), and then navigate here: Home > Support > Software Downloads and Licensing > Downloads S > Storage Viewer
So – what’s new?
Also – EMC Storage Viewer 2.1 works with Solutions Enabler 7.1.
If you want to see more – either just download it and give it a whirl, or read on!
Fantastic doc on how to harden VMware View leveraging RSA technologies, as well as leverage VDI itself to make implementing data loss prevention easier… is now public!
It’s meaty, includes step by step, what you can report on, how to implement more stringent controls, and even how to troubleshoot…
You can get it by clicking on the doc below… Have fun!
(thanks to the RSA crew for doing the work, but also for making the right decision to open it up as well as posting it to SecurCare Online – which is awesome btw, if you’re an RSA customer make sure you subscribe!)
So, to my VMware-centric readers, you may have no idea what I’m talking about. But if I asked you “don’t you wish Storage did what DRS does?” – well the answer to that would be YES, right?
Fully Automated Storage Tiering (FAST) is the core idea of automatically moving information to the right tier at the right time.
FAST is immediately available across the EMC platforms. On V-Max and CX4 it automates the reconfiguration of LUNs to optimize against policy. In the V-Max case, it will automatically move and swap LUNs between solid state, FC and SATA. In the CLARiiON case, it will recommend and move from FC to EFD and SATA to optimize the total configuration. On Celerra it will automatically and transparently move files between filesystems/tiers on a platform, between platforms, and actually to Atmos (whether it’s Atmos in your internal cloud or external cloud). Through 2010, FAST use cases will continue to expand in breadth and depth.
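To make the "right tier at the right time" idea a bit more concrete, here is a toy sketch in Python. It is NOT how FAST is implemented. It only illustrates the general notion of ranking LUNs by observed activity and placing the busiest ones on the fastest tier. The function name, the LUN names, the IOPS figures and the slot counts are all made up for illustration.

```python
def place_luns(lun_iops, tier_slots):
    """Toy policy: the busiest LUNs get the fastest tier.

    lun_iops   -- dict of LUN name -> observed IOPS (hypothetical numbers)
    tier_slots -- list of (tier name, number of LUNs it can hold), fastest first
    """
    # Rank LUNs from busiest to quietest.
    ranked = sorted(lun_iops, key=lun_iops.get, reverse=True)
    placement = {}
    start = 0
    for tier, slots in tier_slots:
        # Fill this tier with the next-busiest LUNs, then move down a tier.
        for lun in ranked[start:start + slots]:
            placement[lun] = tier
        start += slots
    return placement


# Hypothetical workload and tier sizes, purely for illustration.
luns = {"lun1": 50, "lun2": 4000, "lun3": 900, "lun4": 10}
tiers = [("EFD", 1), ("FC", 2), ("SATA", 4)]
print(place_luns(luns, tiers))
```

Re-running a placement like this as the activity numbers change is the "automated" part of the idea. A real implementation would obviously have to weigh policy, capacity and the cost of moving data.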
So – if you’re scratching your head and the impact doesn’t immediately make sense (perhaps you’re not a storage person) – here’s a way to think about it….
Think of it this way (making the analogies):
If you want to understand this better – read on….
Who would that be? Well EMC of course :-)
We have a VERY aggressive use of technology within our own IT shop to hit efficiency, cost savings, and flexibility goals. If you’re interested – join this webcast on Thursday!
Thursday, December 10, 2009 - 8 am PT / 11 am ET
Learn More: http://info.emc.com/mk/get/DBM5457-1923_raf_lp?reg_src=WEB
Topic: Exclusive, inside look at EMC's virtualization status—from challenges faced, lessons learned and best practices developed, to future plans. Leverage EMC's own project to transform your data center. Paul Divittorio, Director of IT Enterprise Systems and Application Hosting Architecture at EMC, hosts this technical webcast and Q&A session.
Paul Divittorio is great, frank, and blunt. How we use all the stuff we talk about is a fascinating story. It’s not all sunshine, rainbows and unicorns – there’s good/bad/learning. The good thing is EMC IT consumes products about 6 months prior to GA. They help find and fix issues, and also influence product direction.
We’re undertaking our large-scale View 4 rollout now (have been doing a VDI pilot for a while), so that’s a relatively new thing, along with Vblock deployment. Ask him about our mission-critical apps on vSphere 4!
UPDATED Dec 8th, 2:45pm EST – clarification on VCE support model.
Well – it’s been a busy month since the VMware, Cisco, EMC Virtual Compute Environment coalition launch. Customer reaction has been universally positive in my view, with the only critique being pressure for more details.
If you’re a VMware, Cisco, or EMC employee, or a VCE partner – the official Required/Recommended Bill of Materials (BoM) for the Type 1 (mid-range unified design) and Type 2 (enterprise NAS + block scale-out design) have been published. You can get the BoM by using the VCE portal (www.vceportal.com) and registering the opportunity, or by directly reaching out to the VCE Solutions Support Team or SST (which includes part of my team) – though if you use the VCE portal, the SST gets engaged automatically.
Having now been working with customers on this for some time, and with our first set of Vblock sales under our belt and rapidly accelerating, I wanted to share a couple of thoughts.
There are a couple others I get often – these I will answer immediately. The first two I will do in the body of the post as they are longer technical answers.
I’m sure that we’ll see more and more “hey, I’ve got a Vblock for you right here” from all sorts of places soon. The first rapid shot across the bow came from HP – but the idea of integrated infrastructure is an obvious one with obvious benefit for the customer – so “fast follower” motions are expected. To me, the thing that makes this really interesting – beyond all the clear examples that we can give of CURRENT and FUTURE integration between our stacks - is the integrated management model, selling model, and support model. Otherwise it’s just a collection of parts.
There’s a lot of pressure from other non-Cisco partners towards EMC to create “V_E” Vblocks, and I’m sure Cisco is getting the same on their side. Remember, VMware (most of all!!), Cisco and EMC will continue to partner openly – after all we know there will be need for the “a la carte” option as well as the “prix fixe” best of breed model we’re proposing.
If you’re interested in the detailed technical answers to questions 1 and 2 (including a UIM demo!), I’m certainly interested in your feedback so please read on!
I mentioned that we had found an issue in my vSphere 4 update 1 post (not the same one related to APD path state discussed here – please read that post – seems to be affecting a fair number of customers), and there’s now been enough research to validate the bug, so I want to explain it in more detail.
So, here’s the story. Our Cork Solutions Center team (an awesome crew!) was doing some testing with CLARiiON and vSphere 4 with Exchange 2007. The Jetstress runs showed weird inconsistent results, that looked like this:
This is a Navisphere Analyzer (a CLARiiON performance tool) showing the IOps per front end interface at different user loads. Note how all the front end ports are nicely loaded in the 20K, 16K, 12K, 8K, 4K user workloads, but then are all messy for the 24K user run? What happened? Well – the ESX host was rebooted between the 4K run and the 24K user run.
A bit more background – we try to do as much testing with both vSphere 4’s native multipathing (NMP) using round robin (RR) and also the EMC PowerPath/VE vSphere vmkernel loadable multipathing plugin (some customers will use both). In the NMP RR tests, we switched the IO Operation Limit parameter to 1. This parameter controls how many IOs are sent down a given path before vSphere starts to use the next path. By default, this value is 1000.
This is what the default shows if you issue a command from the vMA (BTW – the “Use Active Unoptimized Paths” value makes RR use ALUA paths that advertise an “unoptimized” state – meaning they are on a “non owning” storage processor)
After running “esxcli nmp roundrobin setconfig --device=naa.<the device naa goes here> --iops 1 --type iops” for each device, it looks like this …
But we noticed after a reboot – the IOOperationLimit value reverted to a weird random value (in this case 1501691480)
This is what resulted in the weird behavior.
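To see why a huge IO Operation Limit would produce that kind of lopsided port loading, here is a minimal Python sketch of a round-robin path selector. It is a simplification of the idea described above (send N I/Os down a path, then move to the next), not the actual vmkernel code. The path names and the burst size are hypothetical.

```python
from collections import Counter


def round_robin_paths(num_ios, paths, io_operation_limit):
    """Count how many I/Os land on each path under a simple round-robin policy.

    The selector stays on one path until `io_operation_limit` I/Os have been
    sent down it, then moves on to the next path in the list.
    """
    counts = Counter({p: 0 for p in paths})
    current = 0           # index of the path currently in use
    sent_on_current = 0   # I/Os sent since the last path switch
    for _ in range(num_ios):
        counts[paths[current]] += 1
        sent_on_current += 1
        if sent_on_current >= io_operation_limit:
            current = (current + 1) % len(paths)
            sent_on_current = 0
    return dict(counts)


paths = ["path-A", "path-B", "path-C", "path-D"]  # hypothetical path names
burst = 1500                                      # hypothetical burst of I/Os

print(round_robin_paths(burst, paths, io_operation_limit=1))           # value we set
print(round_robin_paths(burst, paths, io_operation_limit=1000))        # default
print(round_robin_paths(burst, paths, io_operation_limit=1501691480))  # after reboot
```

In this toy model a limit of 1 spreads the burst evenly over all four paths. The default of 1000 uses only two of them for a burst this size. The reverted value never switches paths at all, so one path carries everything, which would be consistent with the messy front-end port picture.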
Since then, this has been confirmed to also be the case in vSphere 4 update 1. So – until this is resolved, just leave the IOOperationLimit at the default if using NMP Round Robin. Note that this doesn’t affect customers using PowerPath/VE, where the number of I/Os issued down one path/queue is adaptive based on queue depth (when using HDS or HP arrays) and is predictive (adaptive, but also using target port queue depth as a predictor of future initiator queue depth in the algorithm) when using EMC arrays.
Recently saw a little uptick (still a small number) in customers running into a specific issue – and I wanted to share the symptom and resolution. Common behavior:
Examples of the error messages include:
“NMP: nmp_DeviceAttemptFailover: Retry world failover device "naa._______________" - failed to issue command due to Not found (APD)”
“NMP: nmp_DeviceUpdatePathStates: Activated path "NULL" for NMP device "naa.__________________".”
What a weird one… I also found that this was affecting multiple storage vendors (suggesting an ESX-side issue). You can see the VMTN thread on this here.
So, I did some digging, following up on the VMware and EMC case numbers myself.
Here’s what’s happening, and the workaround options:
When a LUN supporting a datastore becomes unavailable, the NMP stack in vSphere 4 attempts failover paths, and if no paths are available, an APD (All Paths Dead) state is assumed for that device (starts a different path state detection routine). If after that you do a rescan, periodically VMs on that ESX host will lose network connectivity and become non-responsive.
This is a bug, and a known bug.
What was commonly happening in these cases was that the customer was changing LUN masking or zoning in the array or in the fabric, removing it from all the ESX hosts before removing the datastore and the LUN in the VI client. It is notable that this could also be triggered by anything making the LUN inaccessible to the ESX host – intentional, outage, or accidental.
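If you want to check whether a host has been logging these symptoms, a small script can pull out the affected devices. This is a minimal Python sketch that assumes log lines shaped like the two error messages quoted earlier. The sample lines and the `naa.EXAMPLE…` device names are placeholders I made up, and where your logs actually live is left to you.

```python
import re

# Patterns modelled on the two message shapes quoted in this post.
APD_FAILOVER = re.compile(
    r'nmp_DeviceAttemptFailover: Retry world failover device "(?P<device>naa\.[^"]*)"'
    r'.*\(APD\)'
)
NULL_PATH = re.compile(
    r'nmp_DeviceUpdatePathStates: Activated path "NULL" '
    r'for NMP device "(?P<device>naa\.[^"]*)"'
)


def devices_with_apd_symptoms(log_lines):
    """Return the sorted set of device names that appear in APD-style messages."""
    found = set()
    for line in log_lines:
        for pattern in (APD_FAILOVER, NULL_PATH):
            match = pattern.search(line)
            if match:
                found.add(match.group("device"))
    return sorted(found)


# Hypothetical sample lines; the device names are placeholders.
sample = [
    'NMP: nmp_DeviceAttemptFailover: Retry world failover device '
    '"naa.EXAMPLE0001" - failed to issue command due to Not found (APD)',
    'NMP: nmp_DeviceUpdatePathStates: Activated path "NULL" '
    'for NMP device "naa.EXAMPLE0002".',
    'some unrelated log line',
]
print(devices_with_apd_symptoms(sample))
```

Any device that shows up here is worth cross-checking against recent masking or zoning changes.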
Workaround 1 (the better workaround IMO)
This workaround falls under “operational excellence”. The sequence of operations here is important – the issue only occurs if the LUN is removed while the datastore and disk device are expected by the ESX host. The correct sequence for removing a LUN backing a datastore.
Workaround 2 (only available in ESX/ESXi 4 u1)
This workaround is available only in update 1, and changes what the vmkernel does when it detects this APD state for a storage device, basically just immediately failing to open a datastore volume if the device’s state is APD. Since it’s an advanced parameter change – I wouldn’t make this change unless instructed by VMware support.
esxcfg-advcfg -s 1 /VMFS3/FailVolumeOpenIfAPD
Q: Does this happen if you’re using PowerPath/VE?
A: I’m not sure – but I don’t THINK that this bug would occur for devices owned by PowerPath/VE (since it replaces the bulk of the NMP stack in those cases) – but I need to validate that. This highlights to me at least how important these little things (in this case path state detection) are in the entire storage stack.
In any case, thought people would find it useful to know about this, and it is a bug being tracked for resolution. Hope it helps one customer!
Thank you to a couple customers for letting me poke at their case, and to VMware Escalation Engineering!