I’ll keep saying it – I think potentially the most disruptive technology in the whole EMC Federation when it comes to the world of “persistence” is ScaleIO. We are busily disrupting ourselves on many fronts:
- There are some technologies that are disruptive because they enable new use cases around analytics and HDFS (think of ViPR HDFS/Object or DSSD)
- There are some technologies that are disruptive because they change the way people pool, abstract and automate persistence layers of all types (NAS, SAN, Object, HDFS, VM storage) in IaaS models, and do it across all use cases - physical, VMware, Openstack and more (Think of the ViPR controller).
- There are some technologies that are disruptive because they enable massive “pools” for enterprises filled with unstructured data of all types including oceans of NAS (Isilon).
- There are some technologies that are disruptive because they enable a new operational model for storage (VSAN)
- There are some technologies that are disruptive because they leverage flash to the fullest (XtremIO)
- … And of course, we have to keep innovating around the “best Swiss Army knife” for customers who need one thing to do many things moderately well across broad bands (VNX), and support the most mission-critical workloads with very specific replication, host types, scale, and blended workloads (VMAX).
But… all that said – IMO the most disruptive of all things in the world of “persistence” is ScaleIO. Why? Well – because it can disrupt huge swaths of the whole “persistence ecosystem” (i.e., our own profit pool, and that of others). Here’s why:
- It is open. So – if you love VMware and see the disruptive impact of VSAN, you “get it”. Multiply that by all the stuff going on outside VMware, and all the momentum of Openstack – and bam, ScaleIO is hugely disruptive. Some customers LOVE the “coupled to vmkernel/vCenter and can only store VMs” model of VSAN. Some hate it, and love the “open” model of ScaleIO. Choice is good! :-)
- It’s very flexible and elastic. Depending on the host configuration – you get everything from “blended storage using public cloud players for mobility/elasticity” to “IOps powerhouse” to “capacity workhorse”.
- It’s surprisingly robust already. Snapshots, QoS, and a lot more. Replication is coming, and with ViPR 2.0 it’s integrated into the ViPR controller for management and programmability (which means it bolts into a ton of other tooling).
There’s a back-and-forth about Ceph and ViPR that I’m having with Mauricio Rojas which may make for interesting reading here. I still stick to my guns (the market decides whether I’m right or wrong!) that Ceph’s original design point is well-suited for “Type 4” use cases – and that “layering on transactional” workloads doesn’t scale as well as something designed for that purpose.
I missed this in March, but ESG did a study on ScaleIO. While I assume we asked ESG to do the study (I don’t know for certain), in my experience they do a good job of putting things through their paces and really finding the good and the bad.
The testing was a series of configurations based on Cisco C-Series servers, run against different workloads.
The results posted were huge (and echo the “this is really disruptive” observation).
- Think 1M IOps from an 8 node cluster, and 11M IOps from a 53 node cluster.
- Think of hundreds of GBps (yes, GB not Gb) of system bandwidth.
- Think of rebalancing across the cluster as nodes are added and removed – it starts immediately and completes in minutes, even for huge volumes of data (a rough sketch of why only a fraction of the data needs to move follows right after this list).
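ScaleIO’s actual data layout and rebalance logic are its own, but the general reason shared-nothing scale-out systems rebalance quickly is that adding or removing a node only forces a small fraction of the data to relocate. Here’s a minimal consistent-hashing sketch in Python that illustrates that property – purely illustrative, all names are mine, and it is not how ScaleIO itself places data:

```python
# Purely illustrative consistent-hashing sketch (NOT ScaleIO's actual layout
# algorithm): it shows why, in a shared-nothing scale-out design, adding a
# node only forces a small fraction of the data to relocate.
import bisect
import hashlib


def chunk_hash(key: str) -> int:
    # Stable hash so placement is deterministic across runs.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes=100):
        self._ring = []          # sorted list of (hash, node) "virtual nodes"
        for node in nodes:
            self.add_node(node, vnodes)

    def add_node(self, node, vnodes=100):
        for i in range(vnodes):
            self._ring.append((chunk_hash(f"{node}:{i}"), node))
        self._ring.sort()

    def node_for(self, chunk_id: str) -> str:
        # A chunk lands on the first virtual node clockwise from its hash.
        idx = bisect.bisect(self._ring, (chunk_hash(chunk_id), "")) % len(self._ring)
        return self._ring[idx][1]


# How many of 100,000 chunks move when a 9th node joins an 8-node cluster?
chunks = [f"chunk-{i}" for i in range(100_000)]
ring = Ring([f"sds-{i}" for i in range(8)])
before = {c: ring.node_for(c) for c in chunks}
ring.add_node("sds-8")
moved = sum(1 for c in chunks if ring.node_for(c) != before[c])
print(f"moved {moved} of {len(chunks)} chunks (~{moved / len(chunks):.0%})")
```

On a run like this you’d expect roughly a ninth of the chunks to relocate – the point being that the work is proportional to the change, not to the total capacity, which is why a rebalance can finish in minutes rather than hours.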
They posted a summary video (click on the diagram below), and there’s a longer written write-up here.
For what it’s worth – I think that if people are not looking at software-based data services stacks (aka “SDS Data Planes”) and starting to use them (to harden them and build experience), they are missing out.
The data service implementations that are the most impactful here are the “Type 3” (loosely coupled scale-out) and “Type 4” (shared-nothing distributed) ones – because of their complete hardware independence, and their scaling and protection methodologies. There are “Type 1s” that are software-only, but generally those are used in narrow bands (because their clustering is in pairs, with narrower hardware restrictions).
On top of these “base layers” of “Type 3” and “Type 4” – you can expect us to keep making things that provide cool use cases like Project Liberty (software-only VNX), expanding VPLEX/VE (software-only VPLEX), and Project Mercury (RecoverPoint that is 100% software only) – which I opened up for beta here.
There are choices in the industry, from inside the EMC Federation (VSAN, ViPR including ScaleIO, HDFS and Object) and outside it (like Red Hat) – so start putting the shivers into your hardware-centric vendors and start using these new SDS Data Services stacks.
Better yet, pick how you want to abstract those software data services – either in a use-case-centric way like VSAN (VMware only, VM images only) or Cinder with Openstack (Openstack only, Nova volumes only) – or in an open way like the ViPR Controller (VMware, Openstack, Microsoft, Physical, and for VM images, Nova volumes, general-purpose NAS, HDFS, Object – whatever).
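To make the “use-case centric” Cinder path a bit more concrete, here’s roughly what programmatic block provisioning looks like with the python-cinderclient library. This is a minimal sketch only: the credentials, auth URL and the “scaleio” volume-type name are placeholders, the exact client arguments vary by OpenStack release, and whatever SDS data plane sits behind that volume type is up to the cloud admin:

```python
# Minimal sketch: provisioning a block volume through OpenStack Cinder with
# python-cinderclient. Credentials, the auth URL and the "scaleio" volume-type
# name are placeholders; the SDS data plane behind that volume type is
# whatever the cloud admin configured.
from cinderclient import client

cinder = client.Client(
    "1",                                      # Cinder API version
    "demo",                                   # username (placeholder)
    "secret",                                 # password (placeholder)
    "demo",                                   # tenant / project (placeholder)
    "http://keystone.example.com:5000/v2.0",  # Keystone auth URL (placeholder)
)

# Ask for a 100 GB volume backed by the "scaleio" volume type.
vol = cinder.volumes.create(
    100,
    display_name="app-data-01",
    volume_type="scaleio",
)
print(vol.id, vol.status)  # usually "creating" until the backend finishes
```

The consumer of the volume never needs to know which data plane is underneath – which is the whole abstraction argument; the ViPR controller takes the same idea beyond Nova volumes to NAS, HDFS and Object.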
It’s a changing world!!!!