By now, it should be obvious to everyone (but surprisingly, it's not) that VMware is driving four large-scale IT trends from an infrastructure standpoint.
- A raison d'être for multi-core CPUs - if you look, it's not uncommon to see 20:1 consolidation ratios with today's dual-core and quad-core processors in dual-socket platforms (blade or rack-mount - that's for another post :-) ). Heck, with some workloads (VDI as an example), 8 cores, lots of RAM, and ESX Server's memory dedupe, much higher numbers are possible even today - 100:1 even. Look at the very near future: Nehalem will intro with 4-core dies, but will scale to 8 cores, and each core will run two hardware threads (AnandTech does a great job of covering this, as always: http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3326&p=4 )
- Consolidated I/O workloads - even before point 1, the moment you move from 1:1 to 20:1 consolidation, I/O becomes a core design bottleneck. It's always shocking to me that people don't realize this. Let's say you're a Dell shop and standardize on PowerEdge 2970s. How many NICs do you have in a standard physical server? It's usually 2-4 (let's say 3 for the sake of being even-handed). When you consolidate that onto a honkin' R800 with tons of RAM running ESXi, how many NICs do you have? Perhaps 12 (4 LOM, 2 x quad-ported PCIe NICs on the riser card). Two go for service console redundancy (you ARE using a redundant service console, right?!?!), then you lose two (maybe one) for VMotion - so you have 8 left. If you use vmkernel IP-based storage (iSCSI or NFS), some of those go to IP storage, and the rest are pNICs sitting off vSwitches supporting virtual machine vNICs. So, quick math: 60 old physical NICs are now consolidated onto roughly 10 NICs (potentially fewer for VM traffic if you're using some for IP storage) - see the quick sketch after this list.
- Shared storage becomes critical - this is true in a good way and a bad way. Good, in that you're bringing workloads that lived on lower-availability infrastructure onto something more solid and consolidated, and you're gaining the massive business flexibility benefits of VMotion/DRS, Storage VMotion, VM HA, Site Recovery Manager, etc. Bad, in the sense that shared storage becomes a cost prerequisite - many things that used to be a "C:\ on physicaldisk0" are now happily living as a VMDK on a VMFS or NFS datastore. This is forcing all the storage vendors (EMC included!) to think about our infrastructure in a new way.
- Management tools need to adapt - it just takes one example to make this sink in. Let's say you set up an SRM recovery plan. Life is good. Then you svmotion one of the VMs out of the container being replicated - how valid is that recovery plan now? Management tools need to integrate with VirtualCenter and have as a core premise that the app container is a virtual entity. This has impacts that aren't immediately obvious - but it's perhaps more important than all the other stuff put together (topic for another post).
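To make the consolidated I/O point concrete, here's a minimal back-of-the-envelope sketch of the NIC math from the second bullet. The server count, NICs per server, and uplink roles are just the illustrative assumptions from above - not measurements, and not a recommendation.

```python
# Back-of-the-envelope NIC consolidation math (illustrative assumptions only).
physical_servers = 20      # 20:1 consolidation onto one ESX host
nics_per_physical = 3      # a typical PowerEdge 2970 has 2-4 NICs; call it 3

host_nics = 12             # 4 LOM + 2 x quad-port PCIe NICs on the riser
service_console = 2        # redundant service console uplinks
vmotion = 2                # dedicated VMotion uplinks (maybe 1)
ip_storage = 2             # optional vmkernel iSCSI/NFS uplinks

nics_before = physical_servers * nics_per_physical
left_for_vms = host_nics - service_console - vmotion

print(f"physical NICs before consolidation:    {nics_before}")                 # 60
print(f"NICs on the consolidated ESX host:      {host_nics}")                  # 12
print(f"uplinks left for VM traffic:            {left_for_vms}")               # 8
print(f"... with IP storage carved out as well: {left_for_vms - ip_storage}")  # 6
```

Whatever the exact numbers, the workload that used to spread across 60 physical ports now funnels through a dozen - that's the consolidation bottleneck in a nutshell.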
Today, I'm focusing on #2 - and sharing why I think 2009 will go down as the year when 10GbE takes off, and why VMware will be the thing that makes it happen. Interested? Read on.....
Back when I was in the Valley (so this was 4 years ago), a buddy of mine worked at a tiny IP (not Internet Protocol - Intellectual Property) company focused on high-end IP blocks for networking and storage (going after Hi/fn and the others in that space). He showed me their A0 spin and told me, "This chip will do 10GBase-T over Cat6 cables, full TCP offload including segmentation offload, and full iSCSI offload." COOL. "Oh, and we think we can mass-produce it for $25 per chip." DROOL.
Well, fast forward 4 years, and they are out of business :-)
The beautiful thing about Silicon Valley is that people aren't afraid of failing - they are afraid of failing to try. And you know what - they were right. They were just early.
At the time, I was at an iSCSI storage startup, so of course it was only natural that I thought it was cool.
OK - so - what's the point? VMware drives 10 Gigabit Ethernet demand, and the reason is the simple point of #2: consolidated network workloads (also why our general recommended backup solution for customers very focused on VMware is Avamar, which does deduplication before the data leaves the ESX server).
In ESX 3.5, VMware added support for a series of 10GbE NICs (NetXen, Neterion, Intel 10G XR) - http://www.vmware.com/pdf/vi35_io_guide.pdf (check out starting on page 24). ESX performance with 10GbE is fantastic (very efficient networking stack). This is covered at every VMworld - I'm sure this year will be no different - here's the graph from VMworld 2007 (a great session: "TA43 High Performance Virtualized I/O in 10 Gigabit Ethernet Era," presented by Howie Xu from VMware)
Funny story: we were on a call with the VMware team as we were working on qualifying the ESX 3.5 release, and asked:
EMC: "iSCSI over the 10GbE interfaces - is it supported?"
VMware: "No - why would anyone do that?"
EMC: "Oh, trust us, they'll do that. What about NFS datastores, then?"
VMware: "No - no vmknic (BTW - that's what carries VMotion and IP storage) is supported over the 10GbE interfaces at the 3.5 release."
Now, a few quick things:
- Don't take this to mean that VMware isn't an IP storage supporter - they absolutely are. They are just resource-constrained, like we all are. They are also some of the smartest engineering folks I've ever met.
- iSCSI and NFS absolutely work with the 10GbE interfaces. The support model is clear (at least to me): if the SW iSCSI initiator is on the HCL with an iSCSI target, or if the NFS server is on the HCL, then you should be good with any network interface that's on the HCL.
So why did VMware say that? Answer below.
This is the data from this year's IDC "Server Virtualization 2007 Study." It's an annual study (published at the end of each year - so this is Dec 2007) of every topic in server virtualization; IDC surveys a broad set of customers across a broad, broad set of questions (it has great info like what people are virtualizing, on what platforms, uptake rates of Hyper-V and VI3, how customers are justifying it, and what they're seeing). IDC also tends (at least to my eyes) to be very independent.
One thing - it's not market share, but rather a market study - a study of 410 customers, no more, no less. There is FASCINATING info in there - one day I might find time to dig out other nuggets. It's also done annually, so you can see year-over-year changes too, which is nice.
"Chad what's your point?" - there are 30 percent of the customers that have I/O consolidation issues with IP storage, but 100% of customers has an I/O consolidation issue with straight up networking - that's why VMware focused there first.
So - what's it going to be on the storage side? There's no disagreement that the future is an Ethernet-connected future.
Note: for customers currently investing in FC - it ain't going away anytime fast, and you're basically investing in something that solves the consolidated I/O workload for you today. IT isn't about the latest shiny toy, it's about things working - if FC works for you, FANTASTIC.
I am so not into protocol and transport wars - BUT that still doesn't change the fact that the future is Ethernet-connected. So then, what about protocol: iSCSI, NFS, or FCoE? Well - NFS will continue to do well - it works well, there's nothing wrong with it, and it will always have the strengths it has in the VMware context (it's so easy to create massive datastores that span ESX clusters or even sites). iSCSI will continue to grow wildly (it is the fastest-growing in the market at large, and in EMC's portfolio) and is (IMHO - I'm still in love) the future of the block storage market en masse. BUT, I'm starting to come around on FCoE. There are three reasons:
- In working with the largest customers, there are some workloads that demand "lossless" transport (look up per-priority pause) and ultra-low latency (where literally a few ms is make-or-break). I'm not claiming that they are everywhere - that's the iSCSI market - but where they exist they are very specific, so those customers need an answer. BTW - when people try to apply that "lossless," "ultra-low latency" DCE (Data Center Ethernet) argument to storage workloads as a whole (i.e. claim that iSCSI is the wrong way and FCoE is for everyone), my answer is simple:
"iSCSI works great for many customers today. It does that as is. Don't underestimate the power of being able to ping your storage target"
- The vendors aren't introducing "HBAs that happen to have Ethernet" - they are all releasing "converged network adapters": a single device that is a NIC (with all the fancy offloads) and an FCoE HBA at the same time. If you can have both, and the incremental cost is zero - why not? You can always run an iSCSI stack on top of the NIC!
- Here's the Qlogic example: http://www.qlogic.com/Products/Datnetworking_products_landingpage.aspx
- Here's the Emulex example: http://www.emulex.com/products/fcoe/index.jsp
- Here's the Intel example: http://download.intel.com/design/network/prodbrf/317796.pdf (I'm assuming Intel will go the way they generally do, which is a software stack - seems crazy at 10Gbps, but it's not - most customers I talk to are using the ESX native SW initiator or the Microsoft iSCSI initiator and getting great results at 1Gbps, which everyone said would be crazy. In many cases it performs the same as the hardware implementations.)
- All the players are supporting it - there is some writing-on-the-wall factor - and no, it's not a conspiracy (that's a Chad theme - don't trust conspiracy theories; the simple, obvious answer is usually the right one - Occam's razor applies). It's simple - it's the only way you hit all the use cases at once.
EMC's been selling a 10GbE target for a while (the NS X-Blade 65), but there are very, very few customers. The customers that exist are in very specialized vertical markets - there hasn't been a broad-based reason for 10GbE, particularly at the historical price points. Now, 10GbE LOM is close, and there is a new, compelling reason. VMware is that reason. We've done performance testing in our RTP facility with ESX 3.5 and the NS X-Blade and got killer performance results.
10GbE also solves the consolidated I/O workload in one fell swoop - important today, CRITICAL in the massively multicore future of 100+:1 consolidation.
One thing I wonder about is whether it HAS to be Cat6 (or some other form of twisted pair) to get mass acceptance. Historically, this has been the case - I mean, 1GbE didn't get adopted until 1000Base-T, and the next thing you know, your laptop has it built in. I'm not sure that's going to happen this time. Twisted pair is getting hard at these really high frequencies (man - looking back, my university thesis was a free-space optical 10Mbps link using hot-off-the-presses laser diodes that cost a fortune - amazing how fast things move). The other issue is power - very high frequencies with high loss mean very high transmit/receive power.
This is a big question for me - what's the "how much do I need to change?" factor? As has been well covered in "The Innovator's Dilemma," disruption comes bottom-up, not top-down, and eventually good enough is good enough.
It's not going to be InfiniBand, that's for sure (again, there are notable exceptions - and EMC will support every protocol - trust me), and I don't think it's optical. But whether it can be twisted pair is yet to be determined - and it's taking way too long for that to be a good sign. I dunno. I think Cisco might be on to something with their new SFP+ and Twinax: http://www.cisco.com/en/US/prod/collateral/modules/ps5455/data_sheet_c78-455693.html. It feels more like twisted pair, and it passes my Chad test: "I like grey, cheap, flexible cables, not orange, expensive cables."
So - if you could have a converged network supporting your ESX servers, with a truckload of bandwidth to each host, with the ability to carve the pipe up for networking and IP storage (regardless of whether it's NFS, iSCSI, or FCoE - and in some cases several at once), applying QoS to VM-specific channels, and have that carry all the way through the host, the adapter, and the fabric to the array - why wouldn't you do it? A rough sketch of what carving up that pipe might look like is below.
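To make "carving the pipe up" a bit more concrete, here's a minimal, hypothetical bandwidth-budget sketch for a pair of converged 10GbE uplinks. The traffic classes and percentage shares are purely illustrative assumptions on my part - not a VMware or EMC recommendation.

```python
# Hypothetical bandwidth budget for two converged 10GbE uplinks per ESX host.
# The traffic classes and shares below are illustrative assumptions only.
uplinks_gbps = 2 * 10  # two 10GbE CNAs per host, for redundancy

shares = {
    "service console / management":   0.05,
    "VMotion":                        0.15,
    "IP storage (iSCSI/NFS) or FCoE": 0.30,
    "virtual machine networking":     0.50,
}

assert abs(sum(shares.values()) - 1.0) < 1e-9  # the shares must cover the whole pipe

for traffic_class, share in shares.items():
    print(f"{traffic_class:<32} {share * uplinks_gbps:5.1f} Gbps")
```

The point isn't the specific percentages - it's that one big pipe with QoS replaces a dozen small, statically dedicated ones.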
Now - here are my questions for the intrepid readers:
- Do you agree with my core premise: i) VMware's consolidated I/O demands a converged, but virtualized, I/O fabric; and ii) that fabric will be 10GbE, with 2009 as the inflection-point year for 10GbE?
- What are your thoughts on the cable plant question?
- What do YOU run today - and what will you be using in 2009, 2010 and beyond?
I couldn't agree with you more on these questions. Although iSCSI seems to have a bad rep when it comes to VMware, I've never witnessed a slow setup. iSCSI is easily expandable and definitely the future, and with 10GbE the so-called restraints are all gone.
How do you feel about jumbo frames, and did you test the performance gains when using jumbo frames with the iSCSI initiator? Although it's not supported yet, I guess it could be really beneficial!
Posted by: Duncan | June 20, 2008 at 03:01 AM
Today's shipping IB solution is 4x (40Gbit) the speed of 10GigE, which will likely not ship FCoE until at least late this year. Tell me again why a 4x performance solution is not compelling.
Posted by: Michael Anderson | June 21, 2008 at 12:08 PM
Don't weaken on your FCoE stance! The last 30 years are rife with stories of the highest-end customers finding the new disruptive technology unacceptable. They always wind up adapting to the lower-cost solution. Single-network convergence is such a powerful idea that it will be irresistible, and that means the whole IP stack - not just the media. FCoE is just a last gasp.
Posted by: Charlie Dellacona | June 22, 2008 at 07:53 PM
Great post on a topic which has interested me for quite some time.
Personally, I've always found the idea of a block-level storage protocol running on top of Ethernet a very compelling one. At the same time, I've always been skeptical about using TCP + IP as a storage protocol (ie iSCSI), due to the extra processing (which equals latency) required by the added layers of indirection.
I've never quite been able to make sense of why iSCSI is such a great invention. I mean, sure, IP is a familiar idiom to most IT pros, but I wouldn't say that FC is all that hard to "get" either. And the argument that iSCSI is a routable protocol simply makes no sense at all to me... if you intend to stick a router in between your storage clients and the storage system, I must only assume that your performance requirements are so low that you would probably be better off by hosting your data in a NAS type of solution, whereas if you are at all concerned about latency you are probably going to want to stick your clients and storage on the same logical IP network anyway. In which case, TCP + IP helps you... how exactly?
FCoE on the other hand is a storage-specific protocol with low latency, and also uses a familiar idiom, ie Ethernet. It does not route (out of the box, but ethernets can easily be extended across WAN links), but imho this is a good thing.
I wish I had access to a storage system with both FC and iSCSI heads, so I could benchmark how the theoretical differences between the two protocols appear in practice, but I would be very surprised if EMC for instance didn't have a whitepaper or ten on the subject.
What I am primarily interested in is how, all other things being equal, iSCSI affects the IO/s metric in random IO for large datasets - a worst-case workload for any storage idiom, which tends to strip away all the boosts that caching, prefetching and other clever things you might stick in a storage system to improve performance, leaving you with a view of the raw average spindle seek times, and protocol processing overhead as the primary bottlenecks.
Maybe I'm completely off the mark here? Curious, anyway..
--
Ole
Posted by: Ole André Schistad | June 23, 2008 at 04:49 AM
On the cable plant question, I think single-mode fiber will rapidly displace copper, especially as we move to higher speeds (10 gig and higher). Existing Cat 5 cable needs to be replaced anyway, and fiber is future-proof up to 1 terabit. The Blazar optical active cable from Luxtera, with CMOS silicon photonic transceivers in both ends, is already being trialed.
Posted by: Brian | June 24, 2008 at 12:46 AM
Great Post.
Posted by: David | June 29, 2008 at 03:46 AM
"I like grey, cheap, flexible cables, not orange, expensive, cables"
I can't resist - orange cables are just so yesterday - aqua is the new orange :-).
Seriously, orange is the standard sheath color for multi-mode OM2 optical fiber. If one is going to spend the money on new multi-mode optical fiber for a data center today, it should be on OM3 laser-optimized fiber (needed for 300m runs @ 10Gig), and its standard sheath color is aqua.
Posted by: David Black | July 10, 2008 at 09:23 AM
Good article - but all the arguments you make work even better with true virtual I/O. 10 x 10GigE NICs on a server? Well, maybe if you really like 3U servers with 5 NIC cards in them! We're seeing that Xsigo's virtual I/O (multiple 10Gig/1Gig/FC links over InfiniBand) is the way to go. Two connections per server (for reliability), and VMware can have all-you-can-eat network connections to the outside world.
Posted by: Enkiguy | August 06, 2009 at 03:52 PM
Take a look at the best practices white paper that I wrote regarding 10G and VMware vSphere 4.
Simplifying Networking using 10G Ethernet -- http://download.intel.com/support/network/sb/10gbe_vsphere_wp_final.pdf
Brian Johnson
Intel Corp -- LAN Access Division
PME - 10G and Virtualization Technologies
Posted by: Brian Johnson | March 09, 2010 at 11:35 AM
AnandTech's article - 10Gbit Ethernet: Killing Another Bottleneck?
http://it.anandtech.com/IT/showdoc.aspx?i=3759
Posted by: Brian Johnson | March 10, 2010 at 11:34 AM
like it
Posted by: a | May 11, 2010 at 08:49 AM