[UPDATED March 13th, 9:45pm ET] – a little clarification in the XtremIO Content Addressing bit….
Last week, EMC had a major Flash.Next product launch – a barrage of stuff:
- XtremSF (Server Flash) is a series of PCIe-based server Flash cards (both PCIe card for any rack-mount server which you can get from EMC or EMC partners, and also a Cisco UCS Mezzanine card for B-series blades, which is acquired through Cisco and Cisco partners). These come in a variety of sizes and types (eMLC and SLC). The ones I expect to be most popular are the new larger eMLC cards, as well as the ones in the ~500GB capacities that are a great bang for the buck.
- XtremSW (software family) – software that can be coupled to the cards. This is a rebranding of the VFCache software, but there will also be more things in this family (we’ve telegraphed distributed cache, active/active cluster support, native DRS, and array-awareness for tagging and promotion of info, all as examples of things customers should expect).
- XtremIO (All-Flash Array) – now in directed availability. XtremIO arrays are now leaving manufacturing and shipping to customers. This DA period is a “warm up” for processes and systems. It also lets the XtremIO team provide the extra care and caution that our customers expect from a new persistent IO stack.
People have really picked up on the fact that, yes indeed – Flash is very disruptive to the storage market at large. Read on past the break for more details, and also my 2 cents on how everyone in the industry better be scrambling.
We had a customer and a great partner join us for Chad’s World Episode 17 where they shared their experiences with XtremSF (and we have a little fun with it)…
Now that the silliness is out of the way (though I’ll save another silly thing for the end), think of these fundamental engineering effects – each of which is a big disruption, when put together are massive:
- PCIe-based flash on a server means the IO never hits a shared array – in effect “subtracting” from the amount of IOps that arrays from EMC and others need to “sink”. This isn’t going to happen overnight, and isn’t going to replace shared storage (why, oh why do humans always want things to be black and white :-). But it is already slowing (by a tiny, tiny amount) the growth in “shared storage IOps sinks” that we would otherwise see.
- TODAY we are in the era where the sweet spot for most customers and most use cases is hybrid arrays that use a lot of magnetic media, a relatively small amount of Flash, and a tiny amount of DRAM. This is intrinsic, not marketing or positioning. It’s a result of the economics, the global Flash memory supply (the majority of which goes to iPhones, iPads, and Samsung Galaxy devices), and the nature of general-purpose information (there's a lot of pretty cold stuff out there, which is a fit for magnetic media). This nets out to this:
For most customers that need something that does it all – a VNX with Flash is a great choice, and if they are at larger scale (where management at scale dominates), a VMAX coupled with Isilon is a great choice. In either case (“something does it all” VNX or VMAX/Isilon) – adding a little well-placed server PCIe Flash in selective use cases is right. BUT even today, there are some use cases (VDI, hyper-transactional systems that aren’t in memory, OLTP that is cache hostile, test and dev farms) where all-flash arrays can just flat out crush traditional architectures.
TOMORROW is also interesting – Flash prices will continue to drop (broadening applicability), and eventually (many years from now), something like Phase Change Memory will also come into the market as a new higher tier.
- While there is a trend toward using high-performance transactional (small IO size, low latency) NAS for some use cases (think VMware on NFS, Oracle dNFS), the All-Flash Array startups are block arrays. Why? Even very high-performance transactional NAS systems (think of EMC VNX, NetApp FAS, Nexenta and the like) have much longer “code paths” than the all-flash block arrays. For a transactional NAS system, getting into the 1-10ms IO latency band is really good, and doing it at a low cost per IO ($/IOps) is even better (as an example, Isilon with Mavericks aka OneFS 7.0 got into that low latency band, but is still much higher $/IOps than VNX – which means “sprinkling some transactional load on the Isilon that you buy for normal Isilon use cases” is a good idea, but I wouldn’t call Isilon “transactional NAS”). BUT the bar to enter the all-flash array market is well below 1ms. Again, pointing out that XtremIO (and most of the all-flash arrays) serve write and read IOs almost always under 500 uS vs. 1-10ms may seem, well, “extreme” and perhaps a ridiculously fine-grained difference. After all, humans don’t tend to think in microseconds (uS) :-) BUT when stated as “would you want something at least 2 times faster?” the answer is always “yes”, at least for SOME workloads. I’m sure some folks will bristle (and perhaps comment!) at this – and I’m not trying to be hyperbolic and claim the “death of transactional NAS” – rather that at this point the performance bands of all-flash arrays and NAS are, well, orthogonal. I have a great deal of respect for NetApp – I believe that the above is why their early all-flash array efforts have mostly centered around Engenio rather than ONTAP, and, just like us, they make their general-purpose arrays leverage some Flash and a lot of magnetic media the best they can.
- Today, use of server-based PCIe Flash is very use-case driven, and has lots of dependencies on OS, app, etc. With things in a “future release” of vSphere like vFlash, where server-based flash can be shared and abstracted to generic VMs, I expect use to broaden widely. You can count on the fact that this is an area of a lot of collaborative work between the XtremSF/SW team and VMware.
There’s a lot more than just these 4 bullets, but they provide some of the insider thinking that grounds me in my belief that everyone better be scrambling, innovating, and looking at organic (R&D) and in-organic (investment/acquisitions) activities.
If you look at last week through that lens, you can see that EMC is investing furiously down all the paths:
- Organic efforts with XtremSF/SW, with VMware working furiously on vFlash and EMC working cooperatively to integrate with it. To be honest, I certainly expected the PCIe Flash hardware to commoditize faster – but this does indeed seem to be starting to happen now.
- Organic efforts with R&D like Project Thunder. Thunder was an advanced R&D project to look at “shared PCIe Flash” platforms – connecting via RDMA over Infiniband or Ethernet. Some people noted that Thunder was absent from the launch last week. The reason isn’t esoteric, but pretty basic. We talked to a lot of customers – and they were pretty clear. If they wanted raw performance, the delta between the 10-50 uS of XtremSF, the 200-300 uS of Thunder (shared, but relatively light on “features”), and the 300-500 uS of XtremIO (shared, feature-rich) left relatively narrow use cases for Thunder. One example where it does resonate is some HPC use cases. But other HPC use cases are best served with XtremSF. In general, customers told us: “if every uS matters – I’m going XtremSF. If I want it shared, the extra 100 uS of XtremIO compared with Project Thunder is worth it for all the rich features I get from a full-blown array”… Ok, great! Thunder IP will be used in the EMC portfolio in a couple of places.
- Organic efforts to make sure we furiously work to make our Hybrid Arrays (VNX, VMAX, Isilon) leverage a small amount of Flash to the fullest. This is a different battleground than the “All-Flash Array” market, and has other newcomers (like Nimble). The name of the game here is to figure out how to mix Flash and magnetic media to hit a broad sweet spot in the market. Customer feedback on FAST Cache and FAST VP is overwhelmingly positive. Expect that in future software and hardware platform releases, we’ll continue to expand on this capability, and over time, we need to ensure the arrays can actually drive more and more flash as a percentage of persistent storage. HINT HINT.
- Inorganic investment, and then organic efforts with XtremIO. Now that we’re actually shipping, I want to call out something I didn’t think it was fair to point out before (I don’t believe in talking smack in general, and you have no right to when you aren’t shipping). EMC (and certainly I) watch all the startup folks furiously. There is a very healthy paranoia about, and respect for, the fact that startups can innovate and move faster than the big guys. We’ve been involved in this industry transition for the last 5 years – including looking at all the All-Flash startups. I’ll put it bluntly now:
I really think that from a core engineering standpoint, XtremIO is the pick of the litter of all the All-Flash startups.
Now, I’m sure that I’m biased, but here’s my rationale (and was the core acquisition rationale):
- XtremIO has taken a radical design departure. It is actually a content-addressed storage system internally (update – I want to make this VERY clear – XtremIO is not “CAS” externally – it uses hash values to uniquely store info and to determine placement, but access works the way it does on normal arrays). What do I mean? They create a unique hash (long story on how) for every incoming IO. If the content is non-unique, they don’t write it again. That’s why you can’t “turn off” the inline dedupe. It’s intrinsic. This also increases bandwidth, makes all sorts of things like snapshots and VAAI operations just amazingly effective, and more.
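To make the internal content-addressing idea concrete, here’s a toy Python sketch of content-addressed block storage with inline dedupe in general – NOT XtremIO’s actual implementation (the class, the SHA-256 choice, and the dictionaries are all my own illustration):

```python
import hashlib


class ContentAddressedStore:
    """Toy sketch: blocks are stored and found by a fingerprint of their content."""

    def __init__(self):
        self.blocks = {}       # fingerprint -> block data (each unique block stored once)
        self.address_map = {}  # logical address -> fingerprint (how "normal" access still works)

    def write(self, address, block):
        # Fingerprint every incoming block; identical content hashes identically.
        fp = hashlib.sha256(block).hexdigest()
        if fp not in self.blocks:
            # Unique content: physically write it.
            self.blocks[fp] = block
        # Non-unique content: no data write, only the metadata pointer is updated.
        # This is why the dedupe can't be "turned off" - it's how addressing works.
        self.address_map[address] = fp
        return fp

    def read(self, address):
        # Externally, reads look like any array: address in, data out.
        return self.blocks[self.address_map[address]]
```

Writing the same content to two different logical addresses lands on the same fingerprint, so only one physical copy ever exists – which is also why snapshots and copy-style VAAI operations reduce to metadata updates in such a design.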
- XtremIO was built for scale-out from the get-go. People are starting to get that scale-out isn’t a feature, it’s a core design attribute. It’s a REALLY hard problem to add “after the fact” from an engineering standpoint, and when it's “layered on top,” all sorts of weird artifacts from the underlying “non-scale-out” architecture tend to surface. The hash function on every IO is used to address and place the data itself within and across X-Bricks. This hashing function and hash table are themselves distributed, which is part of the magic. You can see immediately what this does – it means that you never think about “layout” or “distribution” of data (a lot like Isilon, as another example). There’s something to note about this – I don’t believe that any of the other All-Flash Array startups have this as a core design principle (they are mostly “two-brain clusters”), but some of the “converged” players (Nutanix/SimpliVity) do.
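The “hash determines placement” idea can be sketched in a few lines – again, a hypothetical illustration (the function name and the simple modulo scheme are mine; the real system distributes the hash table itself and is far more sophisticated):

```python
import hashlib


def owner_brick(block: bytes, num_bricks: int) -> int:
    """Map a block to a brick purely as a function of its content fingerprint.

    Because placement is derived from the hash, no node needs a central
    "layout map" and no administrator decides where data goes.
    """
    fingerprint = hashlib.sha256(block).hexdigest()
    # A cryptographic hash spreads fingerprints uniformly, so a simple
    # modulo gives an even spread of blocks across bricks.
    return int(fingerprint, 16) % num_bricks
```

Any node can compute the same answer independently, and the uniformity of the hash spreads data evenly across bricks without anyone ever thinking about “layout”.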
- XtremIO doesn’t have intrinsic hardware dependencies. I think the other All-Flash Array startups (with the exception of Violin) all run on commodity hardware, so I'm not claiming this is radically different, but rather simply very important. Yes, XtremIO uses IB as its X-Brick interconnect, but that isn’t an intrinsic hardware dependency. Within EMC, this idea of “can you run as software” is a “sacred principle” these days – to ensure that we’re not building in hardware dependencies, but rather taking them out. This means we can “refactor” the IP into different forms, while still innovating around the hardware where we can.
- It works. It really does :-) XtremIO comes from a strong, small, focused engineering team that takes great pride in its product.
Look, it’s still early days for XtremIO – and this is still just “directed availability” (GA comes later). That said, you can see why we’re pursuing all 3 paths (PCIe Server Flash + Software, maximize use of Flash in Hybrid arrays for the mass market, ensure best of breed in the All-Flash Array space).
It makes for fun times! And hey, in some of these cases, EMC hasn’t been first (hence the references to “Brand F” in the XtremSF launch). Expect lots of zany, fun, perhaps over-the-top things we’ll do in the process – here’s one of the first – a “Data Crushers” (clear satire of a given TV show) episode of XtremSF vs. “Brand F”. We hooked up two servers with a workload to the engines of two Chevy Volts and used that to make for a visual comparison of XtremSF to “Brand F”.
BTW – the aggressive competitive posture can be very good for customers – we’re being clear – we’re targeting 30% better performance than “Brand F” at a 20% lower cost.
Ultimately, customers will decide in every space (PCIe on server, Hybrid Arrays, and All-Flash Arrays) – and I’m sure I **am** biased. Nevertheless – we live in interesting times! Are YOU using Flash (on server, in array, in all-flash array) – and if so, what is your experience? If not, why not?