[updated Sept 2nd 9:17pm – fixed links]
This was the joint NetApp and EMC session on Software Defined Storage – what are we doing, and how are we adapting and evolving?
While Vaughn and I agree on far more than we disagree on, there are two things from this session where I want to point out that Vaughn and I diverge:
- While there are superficial architectural similarities between NetApp vSeries and EMC ViPR (they both sit “in front” of storage and can provide a common API to a broader set of storage), and we demonstrated similar OpenStack/vCAC workflows, there are a TON of very material differences:
- While I feel NetApp and EMC agree on the need to support broad vendor hardware in this new abstracted model, the EMC view is that the control plane needs to be PURE software, and must not require a hardware platform. EMC ViPR is pure software (simple, but also a cloud-scale distributed vApp with a Cassandra distributed metadata model). vSeries requires NetApp hardware. Some will point to ONTAP Edge as an example of software only (and it’s A-OK), but it’s intrinsically non-HA (at least without a lot of other engineering). Not saying right/wrong – customers will decide.
- If the first step of “fronting” a data plane (sw/hw appliance or software + commodity) is “erase all the storage on the existing thing”, the second step is “hand it all over to us so we can transmogrify it”, and the third step is “now the core architecture of the data plane is the front thing”, then that is traditional storage virtualization (which has a place). It is a “storage data plane overlay”, not control plane abstraction. Our EMC view is that while MANY customers can do it all on a single “swiss army knife” (for us, that’s EMC VNX), the reason there is architectural diversity is that NOT ALL WORKLOADS ARE THE SAME. Therefore, different architectures will continue to propagate. THEREFORE (perhaps most importantly), the control plane abstraction for storage must pool, abstract, automate, virtualize – without homogenizing. That’s how EMC ViPR works. NetApp vSeries in essence just uses external storage as “spindles”. There IS a place for “classic storage virtualization”: using older resources, and making data migrations and platform refreshes simpler (EMC does it with VMAX, and adds “active-active” with VPLEX) – but it’s not a “control plane” that is decoupled from “data planes”.
- Interestingly, Vaughn said that he didn’t think there was a place in the enterprise for distributed software transactional storage stacks. Subsequently on Twitter, he explained that he thinks this is due to the inherent need to maintain multiple copies across nodes, and the relative immaturity of these ways of building storage data planes. I don’t agree with this. Vaughn is (IMO) right that distributed storage stacks like VSAN:
- are less efficient from a capacity perspective than shared storage (they must treat a server node as a “failure unit”, and generally use more copies of data)
- today have a strange latency curve: great if the data is local, but if it is neither local nor locally cached, the request must transit the network to get it
- will likely need time to mature, fully solidify, and broaden their use cases
- will not be as dense as purpose-built platforms until server vendors start to rearchitect
- are only a great fit if your workload’s CPU/memory needs scale linearly with its storage needs (if not, the TCO will not be good)
…In spite of all that, I think distributed will prove very popular in SMB, SME, and in various use cases (that will start limited, and expand over time). When customers are all VMware, they will dig VSAN. There are less obvious, but still tangible “hey I control it all” benefits from the VMware admin standpoint. When they have multiple hypervisors (including vSphere) or physical hosts, they will dig EMC ScaleIO. But, there’s no question in my mind that this new architecture will compete with “external shared storage”. The pendulum was all the way at DAS before shared SAN/NAS, and I think it’s swinging back the other way now. No one knows how fast, or whether it will steady in the “middle”, or continue on back to overwhelming use of local storage. BTW – this emerging architecture is, IMO, good for customers. The only people who should be worried are those that don’t embrace it. Speaking for us, EMC is absolutely embracing this model, while ALSO continuing to work to make sure our primary shared storage platforms (Isilon, VMAX, VNX) remain the best choice in their sweet spots.
BTW – and I want to reinforce this – that’s MY opinion. Vaughn may turn out to be right. It’s ultimately up to the customer to decide.
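To put rough numbers on the capacity point in the list above (distributed stacks treat a server node as the failure unit and generally keep more copies of data), here is a minimal Python sketch. The copy counts and the 8+2 parity layout are illustrative assumptions, not figures from the session or from any product, and the function names are hypothetical.

```python
def usable_fraction_replicated(copies: int) -> float:
    """Fraction of raw capacity that is usable when every block is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be >= 1")
    return 1.0 / copies


def usable_fraction_parity(data_drives: int, parity_drives: int) -> float:
    """Fraction of raw capacity that is usable in a parity group (e.g. 8 data + 2 parity)."""
    if data_drives < 1 or parity_drives < 0:
        raise ValueError("need at least one data drive and a non-negative parity count")
    return data_drives / (data_drives + parity_drives)


def raw_needed_tb(usable_tb: float, usable_fraction: float) -> float:
    """Raw capacity (TB) required to deliver `usable_tb` at a given usable fraction."""
    if not 0.0 < usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in (0, 1]")
    return usable_tb / usable_fraction


if __name__ == "__main__":
    target_tb = 100.0  # assumed usable capacity the workload needs

    # Assumed layouts, chosen only to illustrate the comparison.
    layouts = {
        "distributed, 2 copies across nodes": usable_fraction_replicated(2),
        "distributed, 3 copies across nodes": usable_fraction_replicated(3),
        "shared array, 8+2 parity group": usable_fraction_parity(8, 2),
    }

    for name, fraction in layouts.items():
        raw = raw_needed_tb(target_tb, fraction)
        print(f"{name}: {fraction:.0%} usable, {raw:.0f} TB raw for {target_tb:.0f} TB usable")
```

This only models protection overhead. It ignores spares, metadata, compression, dedupe, and the CPU/memory side of the TCO question, so treat it as a way to frame the trade-off rather than a sizing tool.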
Topics in this session.
- “Yesterday” (still important, but not the center of the action – do the fundamentals right)
- “Today” (focus on pool, abstract, and automate).
- “Tomorrow” (evolving world of sea of storage and data lakes, maturing software-only distributed stacks, vVols and more!)
Click on the below to download!
To download the version with the embedded demos – WARNING – LARGE 200MB download – click here.
Enjoy, and share!