There were two pieces of Isilon news today at EMC World – both big, but one being REALLY BIG.
First – the simple news… Isilon is releasing the update to their 4U node – the new X400 node. This represents a new record-breaker when it comes to Isilon scale and performance in the center of the Isilon use cases. Bigger, faster, stronger.
The bigger news (IMO) was the outing of Mavericks – Isilon’s upcoming next major software update. Why is it big news?
The architecture of Isilon makes it ideal for extreme scale – Big Data NAS. Read on for more after the break!
Put it this way…
If you need 10s or 100s of TB of NAS – you have lots of choices. Often in these categories, being a “swiss army knife” is needed – including great block performance. VNX is very popular in this category (growing great!), and very, very cost effective, high performance, and feature rich. Beyond the “general purpose shared storage” – VNX is an IOps/MBps machine – the best choice EMC offers for the extreme HPC use cases – often backing Lustre.
Conversely if you’re using, or thinking you might use (someday) PB-levels of NAS scale, there are fewer choices. At that scale – core scaling behavior is the dominant customer requirement – and when you need NAS at large scale – there’s nothing like Isilon. Its win rate when it’s a big data use case (genomics, media, etc) is enormous (and is growing at 100% constantly). It will be a billion-dollar part of EMC’s business soon.
BUT (and there is always a but) – there are use cases where Isilon’s architecture switches from strength to weakness – or at least hard engineering problems:
- The fact that it spreads IO across many nodes (resulting in HUGE bandwidth, and simple scaling = good) means that the time to service a single random write IO is higher (bad) – a natural effect of a truly distributed model.
- Similarly, distributed filesystem mechanics (as opposed to layering indirection on top of “classic” tightly coupled filesystem models) mean that multi-threaded writes are a harder problem to solve (this makes sense: a file is “owned” by many nodes at once with Isilon, and conversely with VNX or NetApp configurations – both c-mode and 7-mode – a filesystem/file/inode is controlled by a “single brain” at a time). Distributed problems can be hard to solve.
- Snapshots are also not the “primary design center” for Isilon – mostly because it wasn’t a main driver in the use cases they were servicing initially and also because, once again, there’s the distributed nature of the redirection model needed.
These 3 things (and some others) are requirements for “classic” Enterprise IT use cases of NAS. For example, Oracle on NFS and VMware on NFS need relatively fast small block random write IO. They are highly dependent on multiple writers against a single file. They need file-level snapshots. No surprise, however, that many customers who choose Isilon for a use case that demands Isilon’s “Big Data” scaling model want to “sprinkle on” some of these enterprise use cases, and would rather not need “traditional Enterprise NAS” as well if they could get away with it.
That’s why Mavericks is BIG NEWS. It’s the Isilon software release that has Isilon starting to blend Big Data and Enterprise use cases, and become Enterprise Scale-Out NAS.
It’s interesting - when we were evaluating whether to try to “bolt on” scale-out to VNX (hard!!! and would distract the VNX team from their design point – simplicity, performance, efficiency), and started to look at Isilon as a possibility to add to the portfolio – it was the fact that they were already looking at bridging these big architectural issues that was critical.
Think:
- 50% lower latency on “enterprise IT” use cases through Endurant Cache. This changes the behavior of the IO destage from a synchronous write model (don’t ack to host until it’s written on all nodes determined by the protection model) to one where the IO is protected by mirroring to log files on two nodes – and then written to the whole cluster using Reed-Solomon Forward Error Correction (FEC). The distributed write is why Isilon scales so well – but was also traditionally why random write IOs were slow relative to “traditional NAS” models. This change in the IO path means the goodness of scale (distributing across nodes) is not in the latency path (time it takes to protect and ack the write). That’s big. It also means…
- Isilon can now do multi-threaded IO for the “big file” use cases of VMDKs, large Database files, iSCSI LUNs, HPC checkpoint files
- They’re adding rich file-level snapshots – which also lays the groundwork for other rich data services that depend on block level indirection (those of you close to this can read between the lines to what that means).
- Support for the VAAI and VASA stuff vSphere customers expect if they are going to use NFS datastores.
- Dramatically improved protection via NAS replication with SyncIQ – which now drives way more bandwidth per node.
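To make the Endurant Cache point concrete, here is a toy sketch of the two acknowledgement paths described above. It is only a model of the idea, not how OneFS is implemented: the function names, the per-node latencies, and the assumption that the log mirrors land on the first two nodes in the list are all made up for illustration. In the synchronous model the slowest node in the protection stripe gates the ack; in the journaled model only the log mirrors do, and the FEC striping happens after the host has its ack.

```python
def sync_ack_latency(node_latencies_ms, stripe_width):
    """Synchronous model: the host is acked only after every node in the
    protection stripe has written, so the slowest node sets the latency."""
    if not 1 <= stripe_width <= len(node_latencies_ms):
        raise ValueError("stripe_width must be between 1 and the node count")
    return max(node_latencies_ms[:stripe_width])


def journaled_ack_latency(node_latencies_ms, mirrors=2):
    """Journaled model: the host is acked once the write is mirrored to log
    files on `mirrors` nodes (hypothetically the first ones in the list).
    Striping across the cluster happens later, outside the latency path."""
    if not 1 <= mirrors <= len(node_latencies_ms):
        raise ValueError("mirrors must be between 1 and the node count")
    return max(node_latencies_ms[:mirrors])


# Illustrative per-node write latencies in milliseconds (invented numbers).
latencies = [1.0, 1.2, 3.5, 0.9, 2.0, 4.1]

print("sync ack, 6-node stripe:", sync_ack_latency(latencies, 6), "ms")
print("journaled ack, 2 mirrors:", journaled_ack_latency(latencies), "ms")
```

The wider the stripe, the more likely one slow node drags the synchronous ack up, while the journaled ack stays tied to just the mirror nodes. That is the sense in which the scale-out part moves out of the latency path.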
So… VNX or Isilon? It’s really simple at its core:
- Choose Isilon if you would have chosen Isilon before (ergo you need NAS that scales like a mofo and don’t really care too much about block), but now you can “sprinkle on” some IOps, lower latency workloads along with capacity-gated and bandwidth-gated workloads.
- Choose VNX if you aren’t struggling (yet?) with the scale limits of “traditional” unified storage models. VNX is the crème-de-la-crème of the “swiss army knife”, and is a feature rich IOps/MBps machine. It has a TON of integration with Oracle, Microsoft and VMware. If you don’t need a big NAS filesystem (like bigger than 16TB) it’s a no-brainer.
Why “Mavericks” as the code name? The biggest, best place for surfing BIG waves – makes all the sense in the world to me :-)
Hi Chad,
This looks very interesting, but can you quantify the relative single stream read/write performance for latency and throughput for Isilon and VNX?
My understanding, even with these new updates, is that there is still a huge difference and we are way off using Isilon for anything but tier 2 and 3 applications.
I concluded that with the previous version, unless your big data was made up of very big files (video, gene sequencing), Isilon was not a good fit - what additional use cases will this support (e.g. file sharing)?
Many thanks
Mark
Posted by: Mark Burgess | May 22, 2012 at 08:09 AM