Aaron Delp (all-round good dude) did a great post on VMDK alignment here. Alignment (of volumes, in both the virtual and physical domains, and of VMDKs) falls into the category of “operational excellence” IMO.
It’s these little things that, done right, let you do more with less (often more than any neato thing a vendor has to offer you).
I posted a comment to explain a little bit, and it’s long enough that it’s out of “comment” category, and more like “blog post” category.
So – here it is. If you really want to UNDERSTAND the “why” behind this topic – read on.
The purpose of alignment is to minimize extraneous internal array operations. All arrays have internal constructs that are generally a function of the RAID model (and also the filesystem alignment, and in some cases logical page table constructs in virtually provisioned models).
You want to maximize full-stripe operations, and minimize stripe crossings (where an IO which should land within a stripe spans stripes).
With all the funky stuff arrays do now (thin, snap, dedupe, compress, auto-tier), you also want to decrease the number of metadata operations caused by these "unnecessary" spanned objects (i.e. where there are more metadata updates than necessary).
All the funky goodness is done via either a filesystem or another abstraction (commonly pages) on TOP of the RAID abstraction. Think of a 4K NTFS IO operation in a guest making its way down to the array. Once it gets there, let's say the array has a 64K stripe, but a 1MB "page" used for these fancy features. If the volume is not aligned, that single IO can straddle two stripes, or even fall into two 1MB logical pages. Statistically, it's much more likely to land cleanly inside a boundary if the volume is aligned on a 4K boundary.
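To make the "stripe crossing" idea concrete, here's a rough sketch in Python. It is not how any array counts crossings internally; it just walks a run of back-to-back guest IOs and counts how many span a boundary. The 32,256-byte offset is an assumption for illustration (the classic 63-sector start of legacy MBR partitions), and the function names are made up for this example.

```python
def crosses_boundary(start: int, length: int, unit: int) -> bool:
    """True if the byte range [start, start + length) touches more than one unit."""
    return start // unit != (start + length - 1) // unit


def crossing_fraction(partition_offset: int, io_size: int, unit: int,
                      io_count: int = 4096) -> float:
    """Fraction of back-to-back IOs that span a boundary of size `unit`.

    partition_offset: where the guest's data starts on the LUN, in bytes
    io_size:          size of each guest IO, in bytes (e.g. a 4K NTFS cluster)
    unit:             array-side boundary, in bytes (e.g. a 4K block or 64K stripe)
    """
    crossings = sum(
        crosses_boundary(partition_offset + i * io_size, io_size, unit)
        for i in range(io_count)
    )
    return crossings / io_count


LEGACY_OFFSET = 63 * 512    # 32,256 bytes: assumed misaligned start
ALIGNED_OFFSET = 64 * 1024  # 65,536 bytes: a 64K-aligned start

if __name__ == "__main__":
    for label, offset in (("misaligned", LEGACY_OFFSET), ("aligned", ALIGNED_OFFSET)):
        print(label,
              "4K IO vs 4K block:", crossing_fraction(offset, 4096, 4096),
              "| 4K IO vs 64K stripe:", crossing_fraction(offset, 4096, 65536),
              "| 64K IO vs 64K stripe:", crossing_fraction(offset, 65536, 65536))
```

Under these assumptions, every misaligned 4K IO touches two 4K blocks, one in sixteen spans a 64K stripe, and every misaligned 64K IO spans two stripes. The aligned case has no crossings at all.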
So older Windows revs (W2K3, Windows XP, etc.) and Linux, like all OSes, label their volumes (just like VMware does). This volume header offsets the beginning of the data volume by an amount that varies. In Windows 2K8 and Windows 7, the start of the NTFS volume is automatically offset to align on a multiple of 4K.
WARNING: I’m not an expert on NetApp – so take this as conjecture!!
In NetApp's case, the natural "alignment values" would be (I'd have to assume) the 4K WAFL allocation size (a filesystem attribute) and the underlying RAID stripe under the aggregate (perhaps this is the 32K value which is the mbralign behavior).
In EMC's case, the natural "alignment values" are the 8K UxFS allocation size (a filesystem attribute) and the underlying RAID stripe (64K) – so 64K volume offset is our recommendation.
If you understand this, you can understand why in VMware's (or Hyper-V's or Xen's) case, you need to align the "container" (in VMware-land the datastore), AND the VMDK. If the datastore is aligned on an even multiple of 4K (and ideally an even multiple of the array RAID stripe) - that means the virtual disk starts aligned. BUT the guest OS (GOS) then also signs its own volume, so the guest's data starts at an offset inside the virtual disk.
The reality is that the biggest benefit starts by just aligning on a nice boundary (multiple of 4K) rather than the messy start of "right after the volume identifier". This "multiple of 4K" is the default across a lot of arrays, but the closer you get to these "natural values", the better.
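Here's a small sketch of why both layers matter. It adds up the offsets from each layer and reports which of the "natural values" mentioned above (4K, 32K, 64K, plus a 1MB page) the guest's data actually lands on. The function names and the sample offsets are hypothetical, and it assumes the virtual disk's blocks sit at a fixed, known offset inside the datastore, which is a simplification.

```python
KIB = 1024

# Boundaries discussed in this post; insertion order is preserved in the report.
NATURAL_BOUNDARIES = {
    "4K": 4 * KIB,
    "32K": 32 * KIB,
    "64K": 64 * KIB,
    "1MB": 1024 * KIB,
}


def aligned_boundaries(offset_bytes: int) -> list[str]:
    """Names of the boundaries that `offset_bytes` is an exact multiple of."""
    return [name for name, size in NATURAL_BOUNDARIES.items()
            if offset_bytes % size == 0]


def guest_data_offset(datastore_offset: int, vmdk_offset_in_datastore: int,
                      guest_partition_offset: int) -> int:
    """Offset of the guest's data as the array sees it (all values in bytes)."""
    return datastore_offset + vmdk_offset_in_datastore + guest_partition_offset


if __name__ == "__main__":
    vmdk_at = 8 * 1024 * KIB  # hypothetical: virtual disk starts 8MB into the datastore

    # Aligned datastore (64K), but the guest partition starts at sector 63.
    print(aligned_boundaries(guest_data_offset(64 * KIB, vmdk_at, 63 * 512)))
    # Aligned datastore (64K) and aligned guest partition (64K).
    print(aligned_boundaries(guest_data_offset(64 * KIB, vmdk_at, 64 * KIB)))
    # Misaligned datastore undoes a perfectly aligned guest partition.
    print(aligned_boundaries(guest_data_offset(63 * 512, vmdk_at, 64 * KIB)))
```

The point of the sketch: a misaligned layer anywhere in the chain leaves the guest's data off every boundary, even when the other layer is aligned.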
I have a tendency to explain things the long way (it's not because I'm trying to be complicated, but rather that the way I learn personally is by understanding it at the low level and working my way up), so let me make this simple:
- Alignment is good – it helps you squeeze more IO efficiency out of the storage stack. How much more varies, but it can be as high as 30% more. Not being aligned is not fatal; it's just not efficient. Larger arrays (more brains and cache) tend to offset the impact of misalignment a bit - BUT let's start back at the basics: Alignment is good.
- Follow your array vendor's best practice. For EMC, these are spelt out in the EMC vSphere Techbooks:
- CLARiiON: Using EMC CLARiiON Storage with VMware vSphere and VMware Infrastructure
- Celerra: Using EMC Celerra Storage with VMware vSphere and VMware Infrastructure
- Symmetrix: Using EMC Symmetrix Storage in VMware Infrastructure and vSphere Environments
- If you choose to use mbralign on an EMC array (or another), it won't **hurt** you (32K aligned is better than not aligned - though 64K is better on an EMC array).
- How to do alignment "upfront"? Personally, I use diskpart for older Windows hosts and template up front. I use GParted for Linux.
- How to do it if it's not "upfront"? If I have a customer who is running into performance problems, has a bunch of misaligned VMs, and I KNOW it's about misalignment (BTW, you can actually "see this" on an EMC array - use Analyzer and look at the "stripe crossings" value) - I'm a fan of vOptimizer for doing it en masse.
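If you're templating up front, the arithmetic behind picking a start sector is simple enough to sketch. This is just the math, not a replacement for diskpart or GParted, and it assumes 512-byte sectors. The function name is made up for this example.

```python
def next_aligned_sector(first_usable_sector: int, alignment_bytes: int,
                        sector_size: int = 512) -> int:
    """Smallest sector >= first_usable_sector that sits on an alignment boundary.

    Assumes `alignment_bytes` is a whole number of sectors.
    """
    if alignment_bytes % sector_size != 0:
        raise ValueError("alignment must be a multiple of the sector size")
    sectors_per_boundary = alignment_bytes // sector_size
    # Round up to the next multiple of sectors_per_boundary.
    boundaries = (first_usable_sector + sectors_per_boundary - 1) // sectors_per_boundary
    return boundaries * sectors_per_boundary


if __name__ == "__main__":
    legacy_start = 63  # assumed legacy default first partition sector
    for label, alignment in (("32K", 32 * 1024), ("64K", 64 * 1024), ("1MB", 1024 * 1024)):
        sector = next_aligned_sector(legacy_start, alignment)
        print(f"{label}: start at sector {sector} (byte offset {sector * 512})")
```

With 512-byte sectors, a 64K offset works out to a start at sector 128, and a 1MB offset to sector 2048.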
This whole post has re-energized me to pursue a broader solution along with my buddies at VMware.
Chad, fyi I commented Aaron's post w/ my 2 cents about linux part alignment (64k/emc case), for kickstarted installations, + the gparted tip you suggested.
Posted by: drakpz | June 18, 2010 at 01:34 PM
Thanks for all the follow up! Great information!
Posted by: Aaron Delp | June 18, 2010 at 09:40 PM
Hi Chad,
In Windows 7/Windows 2008, a 100MB system reserve partition is created on the first drive, and then the "first partition" begins at the next 1M boundary. For other drives, it depends on the drive type. GPT partitions are aligned by default.
Yes, under the covers NetApp (I work for NetApp; for more info check the link on my name) has a granularity of 4K. As long as you align on a 4k boundary, that works. 64K (which is a multiple of 4K) works fine.
If you use SnapDrive to create a LUN on NetApp, alignment is taken care of for you regardless. If not, you could potentially have issues. Just in case, I wrote a very short PowerShell script which runs on Windows and is storage vendor agnostic so that folks can check their disk alignment. Clearly this won't run on Linux, but if you want to check disk alignment of a Windows guest, well, have at it. You can find it here: http://communities.netapp.com/docs/DOC-6175
enjoy
J
Posted by: John F. | June 21, 2010 at 07:48 PM
Good post, well explained guys. My customers are rarely aware of this, and when you start to explain it in its simplest form as follows, it makes for an even more compelling case:
If the average performance improvement is 7% (which seems small, and I have tested much higher values), then an unaligned system equates to a performance loss of 1 disk per enclosure with EMC storage arrays.
I know that has been a clearer message to my users. Additionally, I found that Microsoft updated their KB in June 2009 to recommend that all Windows platforms use the 1024 alignment, which equates to GPT formatting.
Thanks again guys, keep the knowledge coming
Paul
Posted by: Paul | July 09, 2010 at 03:26 AM
Nick from NetApp...
If you're using GParted, just a heads-up that it also has a 1MB partition offset option. This means, with vSphere 5, that Microsoft, Linux, and VMware have all standardized on 1MB offsets.
This is good news.
Posted by: Nhowell | July 14, 2011 at 12:11 PM