This is the authoritative EMC guide to VNX and vSphere. Check it out here:
This one has updated guidance on sizing (hint: make big datastores, see page 43), updated guidance on SIOC (hint: always turn it on, see page 46), and updated guidance on parity (hint: use parity RAID for your “generic datastores”, and focus your efforts on specific design for the heavy-hitting mission critical apps).
Nice to see it out formally. Our guidance continues to get simpler. The general principles align with Chapter 6 (Storage) of “Mastering vSphere 4”, which Scott Lowe wrote (I guest-wrote the storage chapter). Simple, big datastores and just getting going is the right way, as is knowing what to measure and how to react (since you can adapt non-disruptively). Only spend time on focused planning for VDI (IO intensive) and mission critical apps (IO intensive and SLAs).
HINT: the next rev for vSphere.ahem is well underway.
Chad, I read these techbooks on the regular, and have several questions about this one.
1. Page 41 states that Pool LUNs have a performance impact compared to traditional RAID Group LUNs: 10% for thick, and 50% for thin. Is that right?
2. Page 53 discusses the new default Failover Mode (4, ALUA), but in my experience the vSphere hosts do not register their initiators as Failover Mode 4. Has this been changed/updated? It seemed to be more on the VMware side. My blog post explaining what I witnessed on this: http://blogs.egroup-us.com/?p=4009
3. Also on page 53, the best practice NMP option is FIXED_AP, whereas in the past, Round Robin was the best choice to make for ALUA. Any issues running Round Robin or reasons why AP is better?
4. How does SIOC "play" with FAST VP? There are some recommendations for the proper latency value on page 69, but knowing that FAST VP may have 1GB slices moving up or down storage tiers, how does one determine "best" latency to set?
Looking forward to vSphere.ahem!
Thanks!
Posted by: Steve Rattacasa | June 10, 2011 at 12:02 PM
Thanks Steve, and thanks for being a great EMC partner.
1. It's not right :-) I'll get them to fix it. That said, there IS a performance delta between traditional LUNs, thick pool LUNs, and thin (aka mapped) pool LUNs. We tend to point out the worst case (being very pessimistic). The worst case is:
- on very small platforms where they are CPU bound (think older gen CX4-120)
- very random, small block, write-oriented workloads (which can saturate the CMI)
- where there is no FAST Cache in the system
In those WORST case scenarios, the differences can be distinct.
2. You are right/wrong - the default for ESX hosts is still failover mode 1, but the best practice is to switch to failover mode 4 across the board. Working to make that easier.
3. We've seen an uptick in the number of customers who have experienced a lot of trespasses under certain conditions when using NMP RR. Until we nail it down 110%, FIXED_AP is "safer". These issues have been rare, but again, there is a conservative tendency that runs through our bones.
4. SIOC works great with FAST VP, and has been tested and validated. It is 100% supported. The core thing to understand is that SIOC will change the way the underlying storage subsystem will "see" IO patterns - but it is a reflection of administrative desire. The BP for SIOC's latency threshold on auto-tiering subsystems is the midpoint between the highest and lowest tier in the pool (which for an SSD/SAS/SATA pool happens to land you back at the default). A WP on this topic is being worked on.
Again - thanks for being a great EMC and VMware Partner at eGroup!
Posted by: Chad Sakac | June 13, 2011 at 08:57 AM
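The rule of thumb in answer 4 above (take the value halfway between the fastest and slowest tier in the pool) can be sketched in a few lines of Python. This is only an illustration: the function name and the per-tier latency figures are assumptions made for the example, not numbers from the TechBook, so substitute the values recommended for your own tiers.

```python
def sioc_threshold_ms(tier_latency_ms):
    """Return the value halfway between the fastest and slowest tier.

    tier_latency_ms maps a tier name to a latency target in milliseconds.
    """
    if not tier_latency_ms:
        raise ValueError("pool must contain at least one tier")
    fastest = min(tier_latency_ms.values())
    slowest = max(tier_latency_ms.values())
    return (fastest + slowest) / 2


# Hypothetical per-tier latency targets (ms), chosen only for illustration.
pool = {"SSD": 10, "SAS": 25, "SATA": 50}
threshold = sioc_threshold_ms(pool)
print(threshold)  # 30.0
```

With these assumed figures the result is 30 ms, which lines up with the "back at the default" remark for a three-tier pool. Only the fastest and slowest tiers affect the result, so the middle tier's value does not matter here.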
Hi Chad,
Is there a VNXe version of the document?
Many thanks
Mark
Posted by: Mark Burgess | June 20, 2011 at 02:29 AM
Just in case you were wondering, as I was: a version 4 release from 2015 is now available at https://www.emc.com/collateral/hardware/technical-documentation/h8229-vnx-vmware-tb.pdf
Enjoy!
Posted by: David | January 21, 2015 at 10:20 AM