[UPDATE – Sept 10th, 8:54ET – many have asked for the softcopy deck that has some of these taxonomies – I’ve posted it here]
Before I go much farther in talking about EVO:RAIL, I want to quickly make a black and white statement – based on how I expect some analysts/press may misunderstand today’s announcements (we’ll see if they do): VMware is NOT getting into the hardware business :-)
Looking at the PR, and even the way EVO:RAIL is positioned as a “product” (to me, it is an OEM program, plus a VMware software product – EVO:RAIL manager – that helps OEMs build hyper-converged appliance products), I can see why people may be confused. Let me try to make this clear:
- VMware is doing what they have always done – make great software (the “product”) that the hardware ecosystem partners around to build the solution (in this case, hardware appliances)
- … and BTW, EMC is doing what we have always done – working to be there first (and, more important than first – best) alongside VMware, while also recognizing that we need to continue to be open (in the same way VMware must be open and partner with others).
Ok – let’s put aside what the analysts/market read into it, and let me put my own thesis on it…
I’ll make a statement that some may find controversial (but I don’t think is controversial at all): I think almost every customer should go the converged infrastructure route. Frankly, I think this is basic: there is little value in assembling and hyper-optimizing infrastructure yourself unless you are a hyper-scale player. It’s also equally basic on a more important level: the more you can do to simplify infrastructure and free up time and resources for the more important parts of getting to IaaS (Management/Automation/Self-Service/Business Management layers) and, even more importantly, the PaaS layer, the better – so long as the CI architecture you pick meets your particular app requirements.
The growth rates we’re seeing for the various converged infrastructure offers in the marketplace (not just ours) suggest people are voting with their dollars, and they’re voting in the direction of my statement above.
So – what IS controversial? That there’s no ONE “right way” to “do converged infrastructure”. Gasp! Yes, it’s true :-)
First, here’s MY definition of converged infrastructure (“CI” for short):
“CI is always a method of using infrastructure where compute, network and persistence are treated as a SYSTEM rather than as COMPONENTS”.
- Beyond that “always” definition – there are “sometimes” variations:
- Sometimes, CI includes business value things like “single order, single warranty, single support” – and sometimes not (I would argue, based on customer feedback, that this is a HUGE part of the CI value proposition – and there is a huge gulf between things that do this and things that don’t)
- Sometimes, CI means hardware and software are packaged together (in “appliance form”) – and sometimes not.
- Sometimes, CI is based on architectures around “building blocks” and distributed SDS stacks, and sometimes around integrated systems components – and sometimes other things entirely.
This is – at its root – the same story as always:
- People’s brains are wired for “simple” answers and polarizing statements, particularly when the clear “right answer” has a more nuanced nature.
- The “all things look like nails when you only have a hammer” position is hard-hitting (because it’s simple) – and it tends to come from single-product companies, or from people/press/customers/analysts whose thinking has been constrained to a subset of use cases vs. the “market as a whole”.
One of the most unexpectedly popular ideas I put out there was a taxonomy (a way of grouping and ordering the seeming chaos) for the whole storage ecosystem – the “Four Phylum of Storage Architectures”, or the “Four Branches of the Storage Tree of Life” (that blog post is here).
I think there IS likewise a “converged infrastructure phylum”, and it looks like this… READ ON!
Here’s the taxonomy of the world of CI that I’ve been floating with customers, partnering on with our Office of the CTO and with some of my SE brothers and sisters. This is the first time the world at large has seen it – so consider it a “straw-man” to contribute to/tear down:
Like any taxonomy, I’m looking forward to people trying it on for size – seeing if it works for them. Inevitably, there are things that look blurry (“is it a ___ or a ____?”). On reflection and analysis, I have yet to find something that doesn’t slot into these buckets. The grouping is (like the storage taxonomy) defined not by marketing, or positioning – but by system architecture.
Against this backdrop – EVO:RAIL is an OEM program and software developed by VMware that helps OEMs build “Common Modular Building Block Appliances”. Conversely, if VMware or EMC were to build a product/program that supports a broader degree of hardware (and software) variation, it would be something to help customers build Integrated Rack Scale Architectures – for example on top of OpenCompute hardware. This is the target of EVO:RACK, being tech-previewed today (more on that here – and I’ll talk about it further).
It’s notable that there are a couple things that all the “Hyper-Converged Architectures” (the four in blue) have in common:
- They use commodity server components.
- They use an SDS data plane (or several!) that is almost always a Type 3 (loosely coupled, transactional) storage model, sometimes in conjunction with a Type 4 (shared-nothing, non-transactional, geo-distributed and geo-scale) model – using SSDs/other forms of non-volatile storage coupled with some amount (including zero percent – ergo the all-flash variations) of magnetic media in the servers.
Because of those two things – at first blush – people lump all 4 “hyperconverged” together into one “thing” – but they are WILDLY different.
I’ll point this out with a simple example:
- When building a product targeting “small entry price point”, “simplicity to the point of ‘minutes out of box’” and “simplified management” as the primary design centers (think EVO:RAIL – but I would extend the comparison to others with the same target design) – even IF its storage layer is designed to “scale” – CAN you hit the same “rack scale” goals as a complete product?
- While the storage software stacks for these things (which are “common modular building blocks” in my taxonomy) are invariably technically POSSIBLE to deliver as software-only (see recent OEM announcements of SimpliVity, Nutanix, or even EVO:RAIL as examples) – they are inevitably integrated hardware/software appliances. These have varying degrees of variation (differences in “node” hardware) – but there’s usually a relatively small set compared to the world of “all the choices that are possible”.
- Likewise, while some are hypervisor independent – in theory – in practice, they all seem to simply embed vSphere.
- Why? It’s because delivering an appliance-like experience (complete with appliance-like support) is “inversely correlated” with variation. The super-awesome super-simple wizard, for example, doesn’t let you configure an SDN. Or an object store. Or… (and you get to a long list).
or – put another way…
- Products targeting the hyper-scale use cases have different requirements than “start small and simple, fixed configuration” – they:
- are often dependent on Open Source management and orchestration
- “start small” is good, but far less important than “really scale up”
- are almost MORE likely to be deployed on one of the mainstream OpenStack distributions and KVM than on vSphere (more on the newly announced VIO in another post)
- … and perhaps maybe even on physical hardware using only Linux containers that are orchestrated by a PaaS layer
Well… can you make that hyper-scale product the SAME WAY as you build the “common modular building block”?
Some will answer the questions/statements above in the affirmative – I would bet they are coming from the same type of thinking that thinks “all storage stacks should run one way”.
When you have that train of thinking, what you end up with is the same thing that VNX and NetApp FAS platforms are today (and many “Type 1” storage architectures generally) – GOOD at many things, not GREAT in any given category. That’s NOT A BAD THING. There are swaths of customers for which that is the PERFECT answer. I would wager that the Converged Infrastructure analog to “Type 1 storage architectures” (good at many things, but with those tradeoffs) is the “Common Modular Building Block”.
This isn’t to say that some of the IP that makes up the CI products can’t cross over (for example, you can easily imagine how ScaleIO could be used as an ingredient in a “common modular building block thing” and also as an ingredient in a “rack scale thing” – or NDFS/components of the management layer from Nutanix, for that matter) – but is the end product the same thing? No.
In fact, here’s my thinking on the relative strengths/weaknesses of each of the “CI architectural types”:
So – if EVO:RAIL as an OEM program (and EVO:RAIL manager as a software product from VMware) falls into the taxonomy of “Common Modular Building Block Appliances” – what should customers expect from VMware?
Well – I would expect some degree of ecosystem. EVO:RAIL has one specific requirement (minimum of a 2U platform with 4 server modules – such that a single node can be deployed as the “base building block”) – so anyone with a platform like that can play (I suspect that this is why SuperMicro is one of the launch partners, but others that don’t have this form factor like Cisco may not be).
What should customers expect from EMC when it comes to EVO:RAIL?
Astute Virtual Geek readers will recognize the platform on the left. It’s an EMC Global Hardware Engineering (the same team that makes VNX/VMAX storage engines) platform code-named “Phoenix” – a general purpose server designed for density. EMC is not (and has no intention of being!) a general server vendor – so we don’t “sell” Phoenix, but we use it in all forms of appliances.
For example, Phoenix is used in our EMC Elastic Cloud Storage appliance to run the ViPR and ScaleIO stacks (which incidentally is deployed and managed using a set of open-source software and Docker containers).
Phoenix platforms will be in EMC’s EVO:RAIL based appliances.
They are sweet – dense, latest IvyBridge hardware – with a little EMC “GHE love” in things like power supplies, field FRU design for serviceability and more. But while being a bit better than “off the shelf commodity servers” from folks like SuperMicro, they are intended to be priced at commodity server costs (ultimately the ‘common modular building block’ appliance cost is the economic measure of all the ingredients – of which the hardware is one part).
Now – what ELSE will be unique in EMC’s EVO:RAIL appliance? I think people have an expectation from EMC that if it’s coming from a partner (or us) with an EMC logo on it:
- It will need to plug into our global support systems (so we need to integrate ESRS) and support models (sparing, global remote support) – we don’t think it’s enough to just bolt software/hardware together and send it out the door; to be a true appliance, it needs to have all the things an appliance needs in terms of support.
- It will need to absolutely snap right into our channel partner program. These common modular building blocks are priced to “start small”, which means the channel is critical. This isn’t a technology thing – but is a critical “Go-to-market” thing.
- You can also bet your bottom dollar that we’ll naturally integrate technologies like (but not limited to!) RecoverPoint (more news coming soon!) to deliver the industry’s best DR in this product category – not what you get with vSphere Replication (which would max out with just one Phoenix node’s worth of VMs!). You can imagine VDP-A linkage to DD, and you can imagine much more.
So… here’s the question for the thoughtful, now knowing everything in the blog post above (and I’m pretty excited about the EMC EVO:RAIL appliance!): if EVO:RAIL and the EMC appliance are great (and they are!), why do “Common Modular Building Blocks” (as a category) have “Poor economic scaling past ___ VMs” as a weakness?
It’s been borne out with many, many customers at this point – appliances in these categories have as their strength simplicity, but at larger scale points (think 4-8 of these “modules”), they become less cost-effective. It’s mostly because their hardware is fixed (for “simple appliance experiences”).
Even those with more variation than EVO:RAIL from EMC (and we will have multiple configurations of the Phoenix server) struggle with “but my compute/storage scale totally differently”, or “I need a very compute-dense configuration”, or “I need a very persistence-dense configuration”… particularly at moderate to larger scale.
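To make that concrete, here’s a minimal back-of-the-envelope sketch. The node specs, core counts and TB figures are all hypothetical – I’m not quoting EVO:RAIL or Phoenix numbers – the point is just that when compute and persistence only come in a fixed ratio, whichever resource you exhaust first strands the other:

```python
import math

# Hypothetical "common modular building block" node - illustrative numbers only,
# NOT EVO:RAIL or Phoenix specs.
NODE_CORES = 24   # compute per node
NODE_TB = 14      # usable persistence per node (TB)

def nodes_needed(cores_required, tb_required):
    """Nodes needed when compute and persistence only scale together, in a fixed ratio."""
    by_compute = math.ceil(cores_required / NODE_CORES)
    by_storage = math.ceil(tb_required / NODE_TB)
    return max(by_compute, by_storage)

# A storage-heavy workload: modest compute, lots of persistence.
cores, tb = 96, 400
n = nodes_needed(cores, tb)
stranded_cores = n * NODE_CORES - cores

print(f"nodes required: {n}")                       # 29 - driven entirely by storage
print(f"stranded compute: {stranded_cores} cores")  # 600 cores bought, but idle
```

With building blocks whose compute and persistence scale independently, you’d buy roughly 4 nodes’ worth of compute plus the persistence separately – which is exactly the “my compute/storage scale totally differently” complaint above.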
So… what about that larger scaling point? Will I say “Vblock”? Not really.
When the target is infrastructure for applications that have lower “infrastructure-level resilience” expectations (Platform 3 apps)… you get to what “decomposed” and “redesigned” refer to in the “Rack Scale” category of the taxonomy – and that is what Rack Scale architectures are targeted at. Again – for example (but not limited to) OpenCompute (EMC has been part of OpenCompute since 2013) has various server, storage, and rack designs that could be selected from. You need to build a very strong hardware abstraction layer if you want to have a lot of mix and match.
As we look not at “common modular building block” but at “rack scale architectures”, I will scratch my head if VMware links EVO:RACK too much to other parts of their stack in the same way that they “hard bound” VSAN to the vmkernel (more on this also when I do my ScaleIO update post).
In my travels with customers in this space, their expectation is that they are likely to be picking a mainstream OpenStack distribution (Mirantis, Canonical/Ubuntu, Red Hat, etc. – heck, I know of a customer that rolls their own off the main trunk, but that’s rare) and running KVM or bare metal… Why?
I suspect that there is a need for true converged appliance offers here – not in the “Hyper-Scale/SP” space, but more for the “customer who is looking to build a ‘Platform 3’ PaaS/IaaS layer which has low expectations of hardware resilience”.
They would need to be less rigid than EVO:RAIL (which is VERY fixed hardware, all aiming at “start small and simple above all”) – more open on hardware (OpenCompute hardware – as EVO:RACK is seemingly targeting) and also more open on the software stack (as opposed to “forcing” ESX/vCenter/VSAN/NSX).
Again, the VMware team may prove me wrong if they take their rack-scale software (EVO:RACK) in directions that deploy any OpenStack distro on the hardware – not just VIO (again, more on that here – damn, these topics are all interwoven!)… in other words, if the VMware team “aims” for hardware abstraction of OpenCompute supporting all sorts of upper-level virtualization/software-defined IaaS models vs. “aiming” for “how to deploy vSphere at rack scale on OpenCompute hardware”.
Now, if you wanted to do a Rack Scale architecture hyper-converged appliance – you would need some ingredients.
Let’s see what’s in the tool box!
Well, it would need to take Phoenix compute platforms (and other compute platforms with different compute densities), and mix in other things:
Dense capacity storage configurations – and I mean really dense. Like 2.9PB per rack. This is the Voyager enclosure that EMC GHE produces today (used in all sorts of platforms). You would use this for CI use cases like many Hadoop clusters, or giant content depots.
60 x 3.5" drive slots per 4U. Perhaps not the DENSEST hardware out there, but really, really darn close.
Dense performance storage configurations – and I mean REALLY dense performance. You would use this in cases where it’s compute intensive, but you need more IOps/compute than you can get in most rackmount servers (via SSDs or HDDs).
This is the Viking enclosure that EMC GHE produces today (used in all sorts of platforms).
120 x 2.5" drive slots per 3U. This is the DENSEST “performance enclosure” I know of.
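As a quick back-of-the-envelope on the “2.9PB per rack” figure – and to be clear, the drive size and enclosures-per-rack values below are my own assumptions for illustration, not a published Voyager configuration:

```python
# Rough density math - assumed values, NOT official specs.
voyager_slots_per_4u = 60   # 3.5" drive slots per Voyager enclosure
drive_tb = 6                # assume 6TB 3.5" NL drives
enclosures_per_rack = 8     # assume 8 x 4U enclosures, leaving rack space for compute/switching

raw_tb = voyager_slots_per_4u * enclosures_per_rack * drive_tb
print(f"raw capacity per rack: ~{raw_tb / 1000:.1f}PB")  # ~2.9PB
```

The Viking version of the same math trades capacity for device count – 120 x 2.5" slots per 3U is about IOps density, not PB density.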
Next, you would need a distributed, massively scalable object/HDFS software storage stack that could run on commodity hardware – and with any abstraction model (kernel mode virtualization, physical, Linux containers). Oh – we have one of those (on the left)! That’s the ViPR Object/HDFS store (now underpinning the vCloud Air Object Store beta!)
You would need a distributed, massively scalable transactional store that could run on commodity hardware – and with any abstraction model (kernel mode virtualization, physical, Linux containers). Oh – we have one of those (on the right)! That’s the ViPR block store (ScaleIO).
For SOME use cases, adding in additional ingredients like DSSD are very useful (hyper-performance centric in-memory database use cases as an example).
Clearly – we would need more than just these “ingredients” to come together into a Rack-Scale CI offer. The hardest thing with these things is the management layer – and it’s harder for “loosely coupled” designs like Rack-Scale with more hardware variation than for “common modular building blocks”.
Along with the software storage stacks, you need another critical software ingredient… You need another layer of hardware abstraction, and how you manifest failures at node and rack levels is non-trivial.
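To give a feel for why that’s non-trivial, here’s a minimal sketch of the kind of failure-domain awareness such an abstraction layer has to bake in. This is my own illustration (hypothetical node/rack names and a deliberately naive placement function) – not any shipping EMC or VMware code:

```python
from collections import defaultdict

# Illustrative topology (hypothetical node/rack names): node -> rack.
topology = {
    "node-01": "rack-A", "node-02": "rack-A",
    "node-03": "rack-B", "node-04": "rack-B",
    "node-05": "rack-C", "node-06": "rack-C",
}

def place_replicas(topology, replicas=3):
    """Place each replica in a different rack, so losing a whole rack costs at most one copy."""
    by_rack = defaultdict(list)
    for node, rack in topology.items():
        by_rack[rack].append(node)
    if len(by_rack) < replicas:
        raise ValueError("not enough racks to keep every replica in its own failure domain")
    # Naive choice: the first node in each of the first `replicas` racks.
    return [nodes[0] for _, nodes in sorted(by_rack.items())[:replicas]]

print(place_replicas(topology))  # e.g. ['node-01', 'node-03', 'node-05']
```

A real rack-scale stack has to do this across storage replicas, management services and network paths at the same time – and re-converge correctly when a rack actually does fail – which is why the hardware abstraction and management layer is the hard part.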
Certainly EMC is exploring these challenges. It’s not the only effort going on in the federation (as you can put together from my hints above). Why is there more than one effort?
Because of the topic that governs the model of the federation at every level:
- PARTNER + INTEGRATE (EVO:RAIL as an example, vCloud Object store as an example, VVOL as an example)
- ALSO ALWAYS KEEP OPEN (VMware’s VIO as ONE vehicle for OpenStack, their EVO:RACK tech preview – and EMC’s partnerships with other OpenStack distros as another, as well as the path EMC is pursuing for Rack Scale CI)
Stay tuned – lots of interesting stuff in this space, and it’s moving FAST!
One last – and CRITICAL point… Where are Vblocks in this picture? They are converged infrastructure – “Integrated Systems” in my taxonomy.
Vblocks are the answer when someone says “my application stack has these rich protection/replication and availability SLAs – because it’s how it was built before we started thinking about CI models”, or “I have an existing network topology and networking services that needs careful integration – I need a heck of a lot more than the simple ‘give me an IP and a VLAN’ that you see in common modular building blocks”.
In other words – Vblocks are great at the things you see at every enterprise for “brownfield” and “Platform 2” stacks that have a heavy expectation of infrastructure services around resiliency. This, BTW, is the largest market right now in CI, and it’s growing like crazy :-)
So – one more nail in the coffin of the position of “one architecture, all the time”.
I’m curious to hear people’s thoughts – as this space is HOT, FAST, and FURIOUS!
Lots to think about here Chad. Excellent and thought-provoking post. I was struck by mention of ViPR in its latter parts.
Chris
Posted by: Chris Mellor | August 25, 2014 at 03:36 PM
Hey Chad, what is your vision on ScaleIO-powered appliances for VMWare-centric VDI?
Now that EVO:RAIL is announced, what is the future for the architectures like this one, based on ScaleIO and Supermicro: http://www.vmware.com/files/pdf/partners/emc/ecs_reference_architecture_v11_feb_14.pdf
Posted by: Ivan Levendyan | August 25, 2014 at 08:34 PM
Great info, thanks for the details and commentary. EVO:RAIL is a great concept, and I am interested to see where it goes from here. Not a new destination, but a new path that looks compelling.
Posted by: Mark Vaughn | August 28, 2014 at 02:15 PM
Excellent post Chad - couldn't agree more that the majority of customers should be going the converged route. Also, thank you for calling out the importance of eliminating complexity, especially around the PaaS layer with this type of deployment.
Posted by: vmrick | August 29, 2014 at 09:36 AM
Here's my read on the EVO thing...
In my mind, this program was put in place so the more traditional hardware providers (Dell, HP, IBM, etc.) can get a piece of the pie. These slower moving companies missed the disruptive start to hyper-converged and would each have to innovate a solution on their own.
In steps VMware with a framework, and suddenly all these other providers can jump on the hyper-converged path, and save a lot of R&D costs.
Meanwhile, VMware gets something out of it too. They get to maintain control of the software/managements side, while leveraging the manufacturing prowess of these big metal vendors.
Thoughts?
Posted by: Steve | September 08, 2014 at 12:47 PM