No, I'm not talking about my Toronto Mayor Rob Ford (who is unbelievable and ridiculous, but not awesome), but rather something else.
We're super excited about XtremIO (and judging from the competitors' response, they are excited too - perhaps in a less positive way)…. But in this post, I said this:
"That the potential biggest disruption (even more than All Flash Arrays or “AFA”) is in the world of server flash – not so much on the hardware side (see point #1), but more on the opportunity for new “mashups” (sometimes called “hyper-converged”) architectures where internal storage in the server is shared/distributed. This architectural model is fundamentally enabled by server-based flash (for low latency transactional workloads as a cache/buffer/tier) and 10GbE. Think of VMware VSAN, EMC ScaleIO, Nutanix, Simplivity and others as early examples of more and more to come."
… well, if you want to understand why, look at these three recent steps (which all occurred over the last couple of weeks)...
Step 1:
- An EMC SE (Matt Cowger - @mcowger) decides to put ScaleIO's "scales to hundreds of nodes" claim to the test - as a skeptic, he wants to try it for himself.
- He only has enough gear in his EMC lab to drive maybe 50 hosts, so he decides to try Amazon Web Services EC2 instances.
- He builds some quick and dirty tools, fires up 200 EC2 t1.micro instances (the smallest ones) plus some EBS volumes, and BAM - drives more than 10Gbps of aggregate performance - far, far more than you could get from EBS alone (a rough sketch of what such a launcher might look like follows this list).
- The crazy net is that he does the whole workload test for $7.41 :-)
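For flavor, here's a minimal sketch of what "fire up a pile of t1.micro instances plus EBS volumes" might look like with today's boto3 SDK. This is NOT Matt's actual tooling (his code is on his github) - the AMI ID, key name, and volume size below are placeholders.

```python
# Hypothetical sketch (not Matt's code): launch a batch of t1.micro instances,
# each with a small EBS volume attached, using the boto3 SDK.
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

instances = ec2.create_instances(
    ImageId="ami-xxxxxxxx",        # placeholder AMI
    InstanceType="t1.micro",
    MinCount=200,
    MaxCount=200,
    KeyName="scaleio-test",        # placeholder key pair
    BlockDeviceMappings=[{
        "DeviceName": "/dev/sdf",
        "Ebs": {"VolumeSize": 10, "DeleteOnTermination": True},   # 10 GB per node
    }],
)
print(f"launched {len(instances)} instances")
```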
Step 2:
- Like any great group of engineers, the ScaleIO engineers were not satisfied (Matt ran into issues/strangeness at the 200-node count). When engineers are unsatisfied, they want to solve problems….
- …So they tackle the problem. They take Matt's load-driving code, tweak little bits based on their expertise, and BAM - the node count goes up to 1,000 m1.medium EC2 instances, subdivided into 5 protection domains with 400 clients (yikes, that's scale!), driving almost 1,000,000 IOps and almost 30Gbps of bandwidth (the protection-domain carve-up is sketched below).
- Again, if you tried to do that on EBS alone, it would never get to that scale of performance.
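If you're wondering what "subdivided into 5 protection domains" means mechanically, here's a trivial, hypothetical sketch of carving a flat list of storage (SDS) nodes into domains round-robin. The IP scheme is made up, and the registration step is left as a commented placeholder rather than real command syntax.

```python
# Hypothetical sketch: split 1,000 SDS node IPs across 5 protection domains,
# round-robin, and show where each registration step would happen.
def assign_protection_domains(node_ips, num_domains=5):
    domains = {f"pd{i + 1}": [] for i in range(num_domains)}
    for idx, ip in enumerate(node_ips):
        domains[f"pd{(idx % num_domains) + 1}"].append(ip)
    return domains

nodes = [f"10.0.{i // 200}.{(i % 200) + 1}" for i in range(1000)]   # placeholder addressing
domains = assign_protection_domains(nodes)

for name, members in domains.items():
    print(f"{name}: {len(members)} SDS nodes")        # 200 per protection domain
    # for ip in members:
    #     register_with_mdm(ip, name)                  # placeholder registration call
```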
Step 3:
- Matt realizes that there's another test to do… and doesn't want to be outdone :-)
- In the real world, ScaleIO tends to get deployed on more beefy hardware than t1.micro instances (which are teensy)… So…
- Matt takes his tools and tries 10 hi1.4xlarge EC2 instances. These instances don't use EBS - they use local SSDs and can be interconnected on a local 10GbE network - and BAM - he drives almost 2,000,000 IOps - and THIS IS FREAKY - with an average of 650 microseconds of client latency per IO (though the standard deviation was high). BTW - this echoes what we typically see in physical ScaleIO environments, so it's not an artifact of using AWS as a quick and dirty way to test.
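One aside on that "average of 650 microseconds, but high standard deviation" point: the mean alone hides the tail. Here's a tiny sketch, with completely made-up sample numbers, of how mean vs. standard deviation vs. a tail percentile get summarized:

```python
# Hypothetical sketch with made-up latency samples (microseconds) - the point is
# how mean, standard deviation, and a tail percentile are reported, not the values.
import statistics

latencies_us = [480, 520, 540, 610, 630, 650, 700, 710, 820, 1900]   # fake samples

mean_us = statistics.mean(latencies_us)
stdev_us = statistics.stdev(latencies_us)
p99_us = sorted(latencies_us)[int(0.99 * (len(latencies_us) - 1))]

print(f"mean latency : {mean_us:.0f} us")
print(f"std deviation: {stdev_us:.0f} us  (high = long tail)")
print(f"~p99 latency : {p99_us} us")
```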
One thing I will note is that you don't HAVE to follow the python-based install process.
In fact, if you look carefully at the code on my github, you'll see it's pretty easy (because my code doesn't use the python method either).
1) Install MDM (the manager) on a node. 2 command lines give it a protection domain and a license key.
2) Install the SDS (the storage node) on some number of nodes. For each node you add, run a single command on the MDM to register it.
3) Install the SDC (the client) on some number of nodes (could be the same nodes as the storage if you like), and point them at the MDM (single command).
That's all it takes to install, even without the python script. So it's pretty easy to automate. In Big O notation terms, we'd call it O(N).
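Roughly, the automation loop looks like this - a sketch only, not the code from the github repo, and the remote command strings are placeholders rather than the real MDM/SDS/SDC syntax (which varies by version):

```python
# Sketch of that O(N) install loop - the remote commands below are placeholders.
import subprocess

MDM_IP = "10.0.0.10"                                   # placeholder MDM address
NODES = [f"10.0.0.{i}" for i in range(11, 21)]         # placeholder SDS/SDC nodes

def ssh(host, command):
    """Run one command on a remote host (assumes key-based SSH is already set up)."""
    subprocess.run(["ssh", host, command], check=True)

# 1) Set up the MDM once: protection domain + license key (placeholder command).
ssh(MDM_IP, "setup_mdm --protection-domain pd1 --license $LICENSE_KEY")

# 2) One registration command per storage node - this is the O(N) part.
for node in NODES:
    ssh(MDM_IP, f"register_sds --ip {node} --protection-domain pd1")

# 3) Point each client (SDC) at the MDM - can be the same hosts as the storage nodes.
for node in NODES:
    ssh(node, f"setup_sdc --mdm-ip {MDM_IP}")
```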
Posted by: Mcowger | November 19, 2013 at 07:14 PM
Step 4:
Take six servers stuffed full of PCIe flash and put them on a 40Gb network.
Add ScaleIO.
Deploy 1000 virtual desktops in 1 hour and 28 minutes.
Boot all 1000 in 12 minutes.
...and so on.
Freaky elasticity like this belongs in an episode of the X-Files. Or something similar...
Posted by: David Nicholson | December 02, 2013 at 01:11 PM
Chad, that original article is offline but I found it in the wayback machine:
https://web.archive.org/web/20140919083108/http://blog.cowger.us/?p=440
Posted by: Mutant_Tractor | August 06, 2015 at 06:17 AM