
March 09, 2011




Hi Chad, Excellent post.


I concur, great article!


Great post, Chad. I have used the diagnosis process in your third rule for years. However, I have always had to start wherever I saw the symptom. For example, if you see a network issue, you start at that step and go up or down the chain based on what you discover. Also, if you can't find a resolution to your problem, you need to re-review all of your steps, either yourself or with a second pair of eyes. I have had cases where I went through the whole process without finding the problem, and on re-reviewing everything with another pair of eyes I found I had missed something... we are all human :)

lei zhang

Great, makes me think more


Thanks Chad, useful post!

Domenico Viggiani

Can this problem arise with an NS480 unified array (Flare

James Baldwin

Hi Chad
James B here.

Delayed ACK (and the way it interacts with Nagle's algorithm) affects a lot of environments, but it often does not cause issues obvious enough to warrant a deep investigation. People tend to live with it and accept "well, that's the best I can get".
Which is a pity.

While this is true in VMware and on the array target side, the same applies to Windows hosts (VMware with in-VM iSCSI initiators, Hyper-V servers with iSCSI).

Standard IP network traffic, for the most part, does not fall into the same realm as storage-based block IP in terms of payload (its payloads are much bigger, typically over 1492 bytes).

Quite often, SCSI commands are very small in terms of payload size (they could be just 10 bytes) for slow-path CDB, inquiry, metadata and control commands. These are exactly the commands you do not want to be slow!
Storage vendors such as EMC, and OS/hypervisor vendors, use small SCSI commands to control and query storage. Typically the code written here is sequential, single-threaded code to enable correct timing and arbitration of devices, and this is when delayed ACK really stings. Like your customer above, with his sequential workload.
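
To put a rough number on why a sequential workload suffers, here is a minimal back-of-the-envelope sketch. The figures are assumptions for illustration only and are not taken from the post: a 0.5 ms round trip per small command, and a 200 ms delayed ACK timer (a commonly cited default). The function name is hypothetical.

```python
def sequential_ops_per_second(service_time_s, ack_delay_s=0.0):
    """Upper bound on commands per second when each command must
    complete before the next one is issued (queue depth of 1)."""
    return 1.0 / (service_time_s + ack_delay_s)


# Assumed figures, for illustration only.
SERVICE_TIME_S = 0.0005   # 0.5 ms per small command without any stall
ACK_DELAY_S = 0.2         # 200 ms delayed ACK timer (assumed default)

baseline = sequential_ops_per_second(SERVICE_TIME_S)
stalled = sequential_ops_per_second(SERVICE_TIME_S, ACK_DELAY_S)

print(f"no stall:   {baseline:.0f} commands/s")
print(f"with stall: {stalled:.1f} commands/s")
```

Under these assumptions, a stall on every command takes a single-threaded stream from roughly two thousand commands per second to about five. In practice only some commands would hit the timer, so treat this as a worst case.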

The default for iSCSI initiators in Windows 2008 onwards is to enable Nagle's algorithm, which I believe is the wrong thing to do. iSCSI, by its nature, assumes a low-hop, point-to-point network between host and storage, and as such link saturation due to aggregate traffic on trunks is less likely to occur.
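
For readers who have not seen what turning Nagle off looks like, here is a minimal sketch at the application socket level using the standard `TCP_NODELAY` option. This is only an illustration of the general mechanism: it is not how the Windows iSCSI initiator is configured, and the helper name is hypothetical.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled, so small
    writes are sent immediately instead of being coalesced."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_low_latency_socket()
# Read the option back to confirm it took effect (non-zero means set).
nagle_disabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()

print("Nagle disabled:", nagle_disabled)
```

Note that `TCP_NODELAY` only affects the sender's coalescing of small writes. The receiver's delayed ACK behaviour is a separate setting and is controlled elsewhere in the stack.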

After my issues documented here;

I tried to convince Microsoft to either change the default for an iSCSI adapter or at least add a radio button to the iSCSI Initiator UI in Windows 2008. Well, I got to write them a KB article instead :-)

So, more often than not, it can be better to have delayed ACK disabled by default for iSCSI adapters.


Justin Hensley

Much could be said about this, but I'll sum it up by saying that this is one of the top 5 best articles I've ever read regarding VMware/Storage on the internet ever.


What do you mean by enable flow control end-to-end?

The comments to this entry are closed.



  • The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by Dell Technologies and does not necessarily reflect the views and opinions of Dell Technologies or any part of Dell Technologies. This is my blog; it is not a Dell Technologies blog.