UPDATE (May 22nd, 2010): At EMC World 2010, FLARE 30 was announced, which amongst many (MANY!) new features, also has some fixes – one of which fixes this underlying behavior. You can read about it at this post here.
I got a little tired of a couple lightweight (aka less technical) posts (important, but still..) – so here’s one that’s an important technical “gotta know” if you’re using the combination of EMC CLARiiON (any FLARE rev), iSCSI and vSphere.
So – there is a core issue that is not generally as well known as it should be. BTW – if you don’t want to read it (though you should if you are in that group – which many of you are) – this is being covered in one of my sessions (TA2467 – Wed 11-12) and a great session by Andy Banta and John Hall from VMware (TA3264 - Wednesday at 4-5:30pm)
If you’re not at VMworld, or want to understand this immediately (you should if you’re using CLARiiON/iSCSI/vSphere) – read on…
Lots of people are scratching their heads over this statement on page 31 of the (always excellent!) VMware iSCSI SAN Configuration Guide:
What’s that all about? And, since iSCSI multipathing starts with explicitly binding vmkernel NICs to the iSCSI software initiator – does this mean you can’t do it?
You can ABSOLUTELY drive simultaneous interfaces against a single target when using NMP Round Robin or PowerPath/VE and an EMC CLARiiON and the vSphere 4 software initiator. BUT there is one CLARiiON issue (this is really a bug, IMHO – and one that we’re fixing, so the below is a workaround – but a workaround that you could leave in place for as long as you want – there’s not really a general downside).
Ok – here it goes….
- EMC CLARiiON records an iSCSI initiator for each iSCSI session by IQN, not by the full SID (IQN+IP address) – btw – this is the bug.
- If the array sees the same initiator logging in to a target again, the existing session is logged off (this is the right behavior – but if the above was fixed, they would appear as separate initiators)
- Didn’t occur in ESX 3.5 (only one session ever), but in vSphere, when using NMP RR or EMC PowerPath, multiple sessions to a single target are used – so it can be an issue.
- If the iSCSI initiator tries to log in twice, the first session gets kicked off, then logs back in, kicking off the second.
- Symptom = very slow iSCSI performance (race condition where they log off, on, and rinse, lather, repeat)
- Workaround = put iSCSI vmkernel NICs on separate subnets. This works because vSphere doesn’t route vmkernel traffic and the CLARiiON iSCSI model is one target per physical port. This means that the iSCSI initiator will only try logging in once to each target – a target on the other subnet returns “network unreachable” to the vmkernel, so it doesn’t try to log in there (and there is no error).
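To make the login ping-pong concrete, here is a toy Python model of a single target port's session table. It is only a sketch of the behavior described above, not FLARE code: the `TargetPort` class, the `simulate` helper, the IQN and the IP addresses are all made up for illustration. With sessions keyed by IQN alone, two vmknics sharing one IQN keep evicting each other; keyed by IQN+IP, they coexist.

```python
class TargetPort:
    """Toy model of one array target port's session table (hypothetical)."""

    def __init__(self, key_by_ip):
        self.key_by_ip = key_by_ip  # False = keyed by IQN only (the bug)
        self.sessions = {}          # session key -> initiator IP
        self.evictions = 0

    def login(self, iqn, ip):
        key = (iqn, ip) if self.key_by_ip else iqn
        previous = self.sessions.get(key)
        if previous is not None and previous != ip:
            # Same "initiator" seen from another IP: old session is logged off.
            self.evictions += 1
        self.sessions[key] = ip


def simulate(key_by_ip, rounds=5):
    """Two vmknics with one IQN take turns (re)logging in to the same target."""
    port = TargetPort(key_by_ip)
    iqn = "iqn.1998-01.com.vmware:esx01"  # hypothetical initiator name
    for _ in range(rounds):
        port.login(iqn, "10.0.1.11")  # vmknic 1 (hypothetical address)
        port.login(iqn, "10.0.1.12")  # vmknic 2 (hypothetical address)
    return port.evictions, len(port.sessions)


print(simulate(key_by_ip=False))  # (9, 1): constant log off / log on
print(simulate(key_by_ip=True))   # (0, 2): both sessions stay up
```

Putting the vmknics on separate subnets gets to the second result a different way: each target only ever sees one login from the host, so there is nothing to evict.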
Result when properly configured:
- No reduction in availability
- No reduction in performance – so long as you have 2x as many target ports on the array as any single ESX host has initiator ports (i.e. with a fan-in ratio of less than 2:1 you will not be able to saturate the host vmknics, because there will be fewer active target ports than initiator ports)
- ½ meshed configuration – in vCenter you should see 2x as many paths as you have iSCSI vmkernel NICs (or more generally “number of possible paths * 1/vmknics that are bound to the iSCSI initiator”).
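The path arithmetic can be sanity-checked with a few lines of Python. This is a rough sketch using the standard `ipaddress` module; the `reachable_paths` helper and every address in it are hypothetical, and it assumes the target ports are split evenly across the vmknic subnets. It simply counts the (vmknic, target port) pairs that share a subnet, since unrouted vmkernel traffic can't reach anything else.

```python
import ipaddress


def reachable_paths(vmknics, target_ports):
    """Return (vmknic_ip, target_ip) pairs that sit on the same subnet."""
    paths = []
    for nic in vmknics:
        nic_if = ipaddress.ip_interface(nic)  # e.g. "10.0.1.11/24"
        for port in target_ports:
            if ipaddress.ip_address(port) in nic_if.network:
                paths.append((str(nic_if.ip), port))
    return paths


# Hypothetical layout: 2 bound vmknics, 4 array target ports.
flat_nics = ["10.0.1.11/24", "10.0.1.12/24"]
flat_ports = ["10.0.1.1", "10.0.1.2", "10.0.1.3", "10.0.1.4"]

split_nics = ["10.0.1.11/24", "10.0.2.11/24"]
split_ports = ["10.0.1.1", "10.0.1.2", "10.0.2.1", "10.0.2.2"]

print(len(reachable_paths(flat_nics, flat_ports)))    # 8: fully meshed, the slow case
print(len(reachable_paths(split_nics, split_ports)))  # 4: half meshed, 8 * 1/2
```

With two vmknics on separate subnets the count drops from 8 possible paths to 4, which is the "possible paths * 1/vmknics" rule above.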
Sometimes a diagram helps, so look at this (note – wrong would be 8 paths for the LUN – and be very slow, right would show 4 paths in vCenter – and be nice and fast):
Hope this helps!!! Will update when the bug is fixed – but this is still in the most recent FLARE rev (FLARE 29).
See you at VMworld!