Folks – I let the cat out of the bag a little early on the “let’s dial up the IO workload on vSphere and see how it goes” exercise.
So, the results in more detail are being shared here at EMC World, and VMware has posted to the VMware performance team’s blog here.
More details will be published over time, but I wanted to share a bit more now.
The choice of a 100% read workload was deliberate – it avoids any “write cache effect,” so we could see the effect from the guest, through the IO stack, all the way to the backend disk (service time matters). We also wanted to show it with a realistic I/O size (we went even higher with smaller I/O sizes).
So – we didn’t use VMDirectPath IO Gen 1, but we DID use the new paravirtualized SCSI adapter. As a first step (at smaller loads, in the ~100K range), we ran comparisons to determine what we would use as we scaled up. Check out the results.
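For anyone curious what switching a VM to the paravirtualized adapter actually involves: in addition to the vSphere Client workflow, it shows up as a one-line controller setting in the VM’s .vmx file. A minimal sketch (assuming a guest OS with the PVSCSI driver installed; the disk entries are illustrative):

```
scsi0.present = "TRUE"
scsi0.virtualDev = "pvscsi"       # paravirtualized SCSI controller
scsi0:0.present = "TRUE"
scsi0:0.fileName = "example-disk.vmdk"   # hypothetical disk name
```

The usual caveat applies – the guest needs the PVSCSI driver (shipped with VMware Tools) before you flip the controller type, or it won’t see its disks.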
The other thing that was insane (at least to me) was the guest latency. Through the VMware stack, there was only a fraction of a ms of added latency over the observed ESXtop latency. Admittedly, the EFDs have an insanely fast service time, which helps a lot here.
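A useful way to see why latency matters so much at these rates is Little’s Law: the average number of I/Os in flight equals throughput times latency. A quick sketch – the 365K IOPS figure is from the test, but the 0.5 ms end-to-end latency here is just an assumed, illustrative number:

```python
def outstanding_ios(iops: float, latency_seconds: float) -> float:
    """Little's Law: average concurrency = arrival rate * time in system."""
    return iops * latency_seconds

# Illustrative: 365,000 IOPS at an assumed 0.5 ms end-to-end latency
# implies roughly 182-183 I/Os in flight across the whole stack.
concurrency = outstanding_ios(365_000, 0.0005)
print(f"~{concurrency:.0f} outstanding I/Os")
```

The point is that every fraction of a ms the virtualization layer adds (or doesn’t add) directly changes how much outstanding I/O you need to sustain a given IOPS number.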
Note that we also wanted to respond to the original 100K test (which used 16 VMs on 16 VMFS datastores), where people wondered if the scale-out was related to the result. In this case, the only reason we had 3 VMs and 3 datastores was that to hit the 365K point, we needed 3 CX4-960s. (Kinda funny – we showed that one CX4-960 could do the work that formerly needed 3 CX3-80s, but then vSphere drove 3 times more work :-)
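The scaling arithmetic in that paragraph can be sanity-checked directly from the figures in the post (365K IOPS total across 3 CX4-960s, versus the earlier ~100K test):

```python
total_iops = 365_000      # headline result from this test
arrays = 3                # CX4-960s needed to hit that point
earlier_result = 100_000  # the original 16-VM / 16-datastore test

per_array = total_iops / arrays
scale_up = total_iops / earlier_result

print(f"~{per_array:,.0f} IOPS per CX4-960")
print(f"{scale_up:.2f}x the earlier test's result")
```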
We’re working on the next “up to the plate” set of tests – looking like we want to do some bandwidth-limited use cases.