
June 24, 2008



Nick Triantos


Love iSCSI but I don't think it's in the cards... FCoE with I/O convergence and support for lossless Ethernet will kill it. I'd be willing to make a bet on this. I look at Cisco, Brocade, QLogic, Emulex and see the heavy investments made there, and it looks bleak for iSCSI.


Re: Optical plant

The slow displacement of copper by fibre is probably a number of things:

- Cost. There is still a difference and we all have accountants to answer to.
- Handling. Fibre has a higher "handle with care" threshold.
- Inertia. Installed base of copper plant and ports.

Everything seems to ship these days with built-in GigE RJ45. Fibre will reign supreme when the same can be said for SFPs.

Chad Sakac

Nick - I'm furiously working the FCoE angle myself as well. I'm totally a "touch it/see it" kinda guy. I like the story, and I see the logic. As I said in the original post: "there are some workloads at our largest customers that demand "lossless" (look up per-session pause) and ultra-low latency (where literally a few ms is make/break)."

What I'm saying isn't that FCoE won't be huge - I'm eagerly awaiting my new Cisco/Brocade/Emulex/QLogic toys for the lab for joint solutioneering, and expect hands-on perspective in weeks.

Rather - what I'm saying is that I don't think it's an either/or. I think iSCSI is more natural in the entry-to-mid market. This is particularly true if you can't get the parts at your local Fry's (like SFP+ Twinax). IMHO (and everyone, take a breath, we can discuss this like gentlemen/ladies) FCoE will rule the high end, and the mid-market and entry will be iSCSI. The latter two markets are larger than the first. We won't lose sight of the first (where EMC was successful first), but need to invest heavily in the others as well.

Now, I'm ALWAYS game for a nice bet with a respected colleague - so what will it be :-)

You can join in the bet here: http://chucksblog.typepad.com/chucks_blog/2007/10/yes-we-occasion.html


Chad Sakac

Marty - thanks for the comment. I'm really interested in reader perspective here, because I think the physical/link layer aka cable plant question is the crux.

I hear you. So, the SFP+Twinax is the only thing that has the cost/handling equation even close. But, your point on the RJ-45 vs. SFP is right on.

So here's the question - if that (using RJ-45 connectors with twisted pair at the super-high frequencies demanded) can't be done within the inherent power/distance/cost limits, does that translate into 10GbE being in the same limbo-ish state as IB? I'd have to imagine that if people could ship 10GBASE-T with Cat 6A or Cat 7, they would be doing it now - right?

Nick Triantos

There's a great FCoE book that goes into the weeds from Silvano Gai titled "Data Networks and Fibre Channel over Ethernet". Silvano used to be with Nuova Systems, now part of Cisco, and is one of the brains behind FCoE. For anyone who doesn't have it and is interested in getting a serious FCoE brain dump, I highly recommend it.

As with every new technology, initially, FCoE adoption won't be big. In fact I don't suspect we'll see arrays out natively doing FCoE for another 18-24 months. The currently available CNAs (both initiator and target) need some work. However, I agree, in the long term (i.e., 5-6 years), it'll become the dominant interconnect on both mid and high-end. At the low end, it'll be a matter of economics, which means that if the pricing is competitive, people will go for it.

Because history tends to repeat itself in technology, I suspect we'll end up seeing a little of what is starting to happen with SATA/SAS/FC drives, where SAS is starting to expand into both the FC and SATA sides. It's more cost effective than FC and just as fast, and more reliable and faster than SATA...

Jon Blazquez

Great post! First of all, I'll post this here, though maybe the right place was http://virtualgeek.typepad.com/virtual_geek/2008/06/10-gigabit-ethe.html#comments. I'm arriving late (heavy workload!!!!! ufff).

First of all, a question, Chad: I think the picture of the 10GbE performance w/ 3 VMs (in the original post: http://virtualgeek.typepad.com/virtual_geek/2008/06/10-gigabit-ethe.html#comments) was achieved using VMDq, wasn't it? (I've seen it somewhere else but I don't remember where...). Take a look at VMDq, it's great!

Regarding your questions, I agree with your core premise (VMware's consolidated I/O demands a converged, but virtualized I/O fabric) and I also think that 2009 is the inflection point year for 10GbE (**IN THE US**).

I think that InfiniBand and 10GbE will coexist for a long period of time, just like FC and iSCSI.

Let me explain my point:
One thing I think is crucial in this discussion is the SEGMENTATION issue. What are we trying to solve with I/O virtualization? Performance? Too many NICs?
In a typical server virtualization environment we normally use 6-8 NICs plus 2 adapters for storage (2 HBAs: FC or iSCSI), and that makes a lot of NICs.
One of the problems that I/O virtualization can address is bandwidth, but it is not the major issue, IMHO.
If we treat 10GbE's performance as the major advantage, we will see in three years that 10GbE *is not enough* and that servers begin to have 6-8 10GbE NICs, just like now! (but 10 times faster).

I think the MAJOR issues that I/O virtualization has to solve are SEGMENTATION and over-subscription. And the MINOR issue is bandwidth. It's important too of course, but if we really need BW, we don't virtualize the I/O. If we need CPU we don't use server virtualization, and if we need I/O we don't use I/O virtualization. "Virtualization is about over-subscription"

I am sure you'll agree with me that the requirement for a high number of network adapters is *primarily* due to segmentation issues rather than raw performance (BW) issues. (An ESX server with 12 NICs doesn't normally need 12 Gbps!!!)

Normally you need 12 NICs because you need to segment your network layout in order to have RJ45 physical copper ports to plug somewhere.

Segmentation is not a TECHNICAL limitation (there are a number of technologies, such as VLANs and VMware Port Groups, that could allow you to logically segment all of the networks above).

It is typically a design decision based on best practices (bad BP) and the customer's internal POLITICS (powerful forces!!!!!!!!!!). Usually the reason is that people don't understand or don't trust Ethernet virtualization techniques (VLANs).
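To put rough numbers on the segmentation-versus-bandwidth argument above, here is a minimal Python sketch. Only the 12 x 1GbE port count comes from the comment; the segment names and the peak-traffic figures are hypothetical and purely illustrative, so treat this as a way to frame the arithmetic rather than as measured data.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    nics: int          # dedicated 1GbE ports used for this segment today
    peak_gbps: float   # ASSUMED peak traffic in Gb/s, illustrative only


# Hypothetical layout: 6 segments x 2 ports = the 12 NICs mentioned above.
SEGMENTS = [
    Segment("management", 2, 0.1),
    Segment("vmotion", 2, 0.9),
    Segment("vm-network-a", 2, 0.6),
    Segment("vm-network-b", 2, 0.4),
    Segment("dmz", 2, 0.2),
    Segment("ip-storage", 2, 1.2),
]


def nominal_gbps(segments, link_gbps=1.0):
    """Bandwidth provisioned by dedicating physical ports to each segment."""
    return sum(s.nics for s in segments) * link_gbps


def peak_gbps(segments):
    """Bandwidth needed if every segment hits its assumed peak at once."""
    return sum(s.peak_gbps for s in segments)


def fits_on_uplink(segments, uplink_gbps=10.0):
    """True if all segments, carried as VLANs, fit on one converged link."""
    return peak_gbps(segments) <= uplink_gbps


if __name__ == "__main__":
    print(f"provisioned: {nominal_gbps(SEGMENTS):.1f} Gb/s")
    print(f"assumed peak: {peak_gbps(SEGMENTS):.1f} Gb/s")
    print(f"fits on one 10GbE link: {fits_on_uplink(SEGMENTS)}")
```

With these made-up figures the ports are there to separate the networks, not to carry traffic, which is the point being argued: the same segments expressed as VLANs or Port Groups would fit on far fewer physical links.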

What happens is that:

ON THE ONE HAND: the bigger the project is (the bigger the customer is), the more stringent the politics are.

ON THE OTHER HAND: we know that the bigger the customer is, the more FC is preferred.

ON THE THIRD HAND :-) The big, big enterprises that started with FC have almost no adoption of iSCSI in large data centers.

I think this will be a possible market for InfiniBand: big, big customers.

In my opinion, this technology is appealing because:

a) VMware is supporting it in the VI3.5

b) InfiniBand technology can be "bridged" into legacy datacenter I/O architectures such as standard Ethernet and Fibre Channel devices. (No one would want to replace their datacenter network infrastructure, neither with 10G nor with IB.)

Bridge: Basically, by installing a single InfiniBand (IB) host adapter into each server, you can create a number of "virtual ports" that map into the IB switches and in turn into the IB-Bridges to connect to your legacy Ethernet infrastructure.

This technology allows you to "expose" the same networks you plug into the IB-Bridges all the way into the ESX Servers using a mix of virtual IB Ethernet adapters and VMware Port Groups.

Having said this, I think that both technologies are great: 10GbE(check VMDq and Jumbo Frames!!!!, I really like iSCSI) and InfiniBand. I think both will coexist and we will not see a massive adoption of IB (But not for a tech reason).

Chad, thnx for the graphs and info!
What is your opinion?

Ole André Schistad

Many thanks for the great answer to my question re: iSCSI and performance.

First of all, I realize now that I should have phrased myself differently. My post was written in the context of FCoE versus iSCSI (but this was not clear in my post, sorry).

I also read, and agree 100% with, your wish to avoid religious wars on protocol X versus protocol Y. So please read the following as being written tongue-in-cheek, with no inflammatory intent whatsoever :-)

Now; the compelling reason for iSCSI (as I see it) is the familiarity and low cost of the equipment, and not the fact that it runs on IP (since IP only really matters if you need to route, which you definitely do not want to with storage anyhow).

On seeing your benchmark results I have to admit that my initial argument regarding overhead is moot - easily solved by throwing hardware at the problem - but I'm still slightly puzzled as to why iSCSI was formalized and embraced by the storage industry whereas FCoE still isn't for sale.

If we agree that being routable is not an argument, I'm really curious as to why the industry, way back then, chose to pursue SCSI over IP rather than SCSI over Ethernet (or FCoE, as it turned out).

Disregarding the separate problem of bandwidth for the moment, I would have thought that it would be cheaper and just plain easier to implement a "SCSIoE" protocol, and reuse the framework of FibreChannel to handle multipathing, than to basically write a whole new stack of protocols, including a new discovery mechanism, if the main purpose of the exercise was to replace expensive and unfamiliar equipment with cheap, off-the-shelf, components.

But maybe this is only obvious in hindsight? Or maybe the implementation of SCSI (okay, FibreChannel) over Ethernet is a harder problem to solve than doing the same over IP?

Again, I am extremely curious about the whole subject, and definitely not trying to ride a hobby horse here :-)


You are right that the cost of optical/fiber has been the gating factor to its adoption in data centers etc. But contrary to common perception, it is not the fiber that is expensive but rather the optical transceivers. Traditionally, optical transceivers are made of esoteric materials such as lithium niobate, indium phosphide, gallium arsenide and other III-V class semiconductor materials. They are also manufactured in discrete parts and hand assembled, and so their cost is very high... in the range of $350-700 per port. This has made them appropriate for the longest haul segments of the network, and as costs have come down fiber has moved to the local, metro, and now to the home. If you look at the gross margins of an optical transceiver company (20-30%) you can see how expensive they are and how electronic transceivers in the data center have a significant advantage. The change now is that silicon photonics is coming of age and a company like Luxtera is now manufacturing optical transceivers in standard CMOS semiconductor processes. Optical transceivers in CMOS have cost parity with electronic transceivers, which makes fiber cabling in the data center inevitable. When you consider that Cat 5 cabling needs to be replaced anyway to support 10gig transfer rates, the decision to adopt Single Mode Fiber is obvious, I think. Very good white paper on this topic at the Luxtera website.


I personally think that iSCSI has all the performance capabilities of a fibre channel system, it just depends on how you customize it. They are also far more affordable and accessible.

Chad Sakac

Quick comments to the questions/comments (and wow - love the dialog - keep it coming!)

Jon - I think that you're right that traffic segmentation and QoS mechanisms become critical with the consolidated network - there are emerging IO virtualization technologies for both the network and FC side. These ideas are equally critical regardless of the physical and link layers. BUT - the question is which will be the big winner.

Ole - the SCSIoE (aka FCoE) idea makes sense using your logic, but if you were fundamentally a networking person (no offense), you would say: "storage is an application, and therefore must exist above the transport layer in the stack". That's certainly the purist view, and you do get some good things from it. Also, without lossless Ethernet, you really DO need transport-level retransmits.

Brian - you had me, you had me, and then you lost me with the vendor plug. At least, if you're going to do it, make it subtle. Your point on the cost of the transceivers is true.

TSS - I agree, up to a point. When you need high throughput - can you do it with iSCSI? Of course. BUT you are either doing massive numbers of 1GbE links (eventually this becomes a non-trivial problem), or you go to 10GbE, which of course makes this moot - but then we're back to my main question (less iSCSI vs. FCoE) - what will be the cable plant?

Stuart Miniman

Chad - On the cabling plant, it still looks like we're trying to predict another market that is not there yet.
As you mention, your choices are optical, 10GBASE-T and Twinax.
Twinax has good price and power, but will be limited to only those environments where you are really staying in the rack or row - it does not plug into structured cable environments (which almost all data centers have today). Cisco's FCoE solutions require 10GbE today, and optical and 10GBASE-T are too expensive for 2008 deployments.
For some reading on cabling infrastructure - try http://www.ethernetalliance.org/attachments/127_10GBASE_T2.PDF for details on 10GBASE-T (good analysis of data center distances, and it also shows the shipments of Cat5e, 6, 6a - while 6a is "preferred", 6 should be OK for many, and that means that many customers will be able to use their existing cabling). Many vendors (including Cisco) are working hard at bringing down the power and price of these 10GBASE-T solutions, and they should start showing up in the next year.
As for those trying to push for all optical, customers aren't ready to rip the existing cable plants if they can be reused. 40G/100G may require new infrastructure, but that will be many years away. It looks like 10GBASE-T will be able to extend the existing install base. Copper in the racks and optical to the core is pretty common.
Looks like customers will have options...


Chad - "I'd have to imagine that if people could ship 10GBASE-T with CAT 6A or CAT7, they would be doing it now - right?"

Having just read that PDF that Stuart linked to, the most interesting info that answers your question is the installed plant percentages over time graphic (page 25).

It looks like people have, or will have in the near future, installed plant and structured cabling of high enough quality to run 10GBASE-T if and when the price is right.

The best comment in relation to price is on page 11:
"Historically, the common wisdom for justifying widespread adoption of the next speed of Ethernet has been achieving 10 times the throughput for three to four times the cost. In that respect, by 2007, 10 Gigabit Ethernet had not yet reached its full potential. The cost of deploying 10 Gigabit Ethernet is still much higher than the desired three times to four times the cost of Gigabit Ethernet. Also, at least on the end-node side, the throughput is much lower than true 10 Gb/s in many cases."

So I think it's going to be a case of continuing to install high quality copper running 1000BASE-T until such time as 10GBASE-T hits the price/performance sweet spot.
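The rule of thumb quoted above (10 times the throughput for three to four times the cost) can be written down as a small Python sketch. The function names and the example cost ratios are hypothetical, chosen only to illustrate the arithmetic; the white paper gives no specific price figures in the passage quoted here. The `achieved_fraction` parameter is an assumption that models the quote's remark that end-node throughput is often well below a true 10 Gb/s.

```python
def meets_rule_of_thumb(cost_ratio, max_cost_ratio=4.0):
    """True if the new speed costs at most ~4x the previous generation."""
    return cost_ratio <= max_cost_ratio


def relative_cost_per_gbps(cost_ratio, nominal_speedup=10.0,
                           achieved_fraction=1.0):
    """Cost per delivered Gb/s relative to the previous generation.

    A result below 1.0 means each delivered Gb/s is cheaper than before.
    achieved_fraction is the share of nominal throughput the end node
    actually reaches (an illustrative assumption, not a measured value).
    """
    if not 0.0 < achieved_fraction <= 1.0:
        raise ValueError("achieved_fraction must be in (0, 1]")
    return cost_ratio / (nominal_speedup * achieved_fraction)


if __name__ == "__main__":
    # Hypothetical cost ratios of a 10GbE port versus a GigE port.
    for ratio in (3.0, 4.0, 8.0):
        print(ratio, meets_rule_of_thumb(ratio),
              round(relative_cost_per_gbps(ratio, achieved_fraction=0.5), 2))
```

The design choice is to keep the two effects separate: the port can satisfy the price rule on paper and still be a poor deal per delivered Gb/s if the end node cannot fill the link.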

Chad Sakac

Guys, this dialog is fantastic... Thank you. So, I read the Ethernet Alliance doc (thank you, Stuart). They highlighted one thing (not called out specifically, but noted throughout), which was that the PHY for 10GBASE-T over UTP requires high, high power - in the 7-10W range. This means that it's not a fit for LOM. Couple this with the point that Marty calls out (3x cost for 10x higher throughput as the breakpoint) - I wonder if maybe I'm wrong, and it won't be 2009, but perhaps 2010 that is the year of 10G Ethernet...
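For a sense of scale on the PHY power point, here is a back-of-the-envelope Python sketch. The 7-10W per-port range is the figure from the comment above; the port counts (a dual-port LOM and a 48-port top-of-rack switch) are assumptions picked for illustration, and the totals cover the PHYs only, not the rest of the board or switch.

```python
# Per-port 10GBASE-T PHY power range in watts, as cited in the comment above.
PHY_WATTS_RANGE = (7.0, 10.0)


def phy_power_range(ports, watts_range=PHY_WATTS_RANGE):
    """Return (low, high) total PHY power in watts for a number of ports."""
    low, high = watts_range
    return ports * low, ports * high


if __name__ == "__main__":
    # Port counts below are hypothetical examples, not from the source.
    for label, ports in (("dual-port LOM", 2), ("48-port switch", 48)):
        low, high = phy_power_range(ports)
        print(f"{label}: {low:.0f}-{high:.0f} W for the PHYs alone")
```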

The point of building any new cable plant with Cat 6A or 7 is a darn good one.

Still - I would never bet against innovation - can't wait to see. This is the fun of the IT industry....

Chad Sakac

Oh, one more thing I forgot to mention: the power thing... It was interesting to see how they planned to shift to lower transmit power (dB) levels on shorter runs, making power consumption distance-based.

The VMware use case we've been discussing is a "top of rack" aggregation model - so short-distance transmit/receive options are a possibility.

Lastly - it's telling to look at the date of the article - Aug 2007. That's a long time ago - yet we haven't seen this 10G LOM that they've been talking about.

David Black

As one of the people who was involved in iSCSI from the beginning, I thought I'd comment on some of Ole's questions around iSCSI.

"If we agree that being routable is not an argument," We *definitely* don't agree - that approach locks the storage access topology to the network LAN/VLAN configuration; requiring the initiator and target to always be on the same LAN or VLAN is a serious restriction. I completely agree that iSCSI routing should not be done via a stereotypical IP router that adds many milliseconds of delay - try layer 3 support in a layer 2/3 Ethernet switch. Also, FCoE is "routable" in that an FCF can in principle forward FCoE traffic across different LANs or VLANs. See p.75 of my good friend Silvano's book if you don't believe me ;-).

"I'm really curious as to why the industry, way back then, chose to pursue SCSI over IP rather than SCSI over Ethernet (or FCoE, as it turned out)." Ethernet was definitely not lossless at the time, and not having to reinvent TCP was a significant advantage. Without TCP, a dropped Ethernet packet usually costs a SCSI timeout and a SCSI I/O redrive; if that redrive times out, the SCSI I/O is usually failed. Another thing that made a significant practical difference at the time was that iSCSI could be (and was) implemented entirely in software on existing hardware. In contrast, FCoE in practice requires new hardware to support lossless Ethernet, starting with switches.

"Disregarding the separate problem of bandwidth for the moment, I would have thought that it would be cheaper and just plain easier to implement a "SCSIoE" protocol," That'll be a $5 fine (a nickel's just not worth what it used to be) for trivializing what it takes to reinvent TCP. It's easy to design a reliable transport protocol; designing one that is robust to "stupid network tricks" that happen in real networks is a lot harder. I did see at least one attempt at a SCSIoE design at the time the iSCSI work was being done; a lot of time was spent (wasted, IMHO) there in reinventing TCP.

"and reuse the framework of FibreChannel to handle multipathing," That was actually tried; the result was called iFCP - see RFC 4172. On the technical front, iFCP ran into problems caused by Fibre Channel's dynamic address (FCID) assignment and use of FCIDs in ELS payloads.

"than to basically write a whole new stack of protocols, including a new discovery mechanism," What new stack??? iSCSI and iSNS are the only two protocols that are widely used in practice, and ESX does not support iSNS.

"if the main purpose of the exercise was to replace expensive and unfamiliar equipment with cheap, off-the-shelf, components." That purpose was a subject of debate at the time. Things have turned out as Chad described - iSCSI has thrived in markets below Fibre Channel's enterprise market. If you don't think this matters, I suggest "The Innovator's Dilemma" (Christensen) as interesting reading.

Werner Ladders

I agree that iSCSI has all the performance capabilities of a fibre channel system, it just depends on how you customize it. They are also far more affordable and accessible.


  • The opinions expressed here are my personal opinions. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC. This is my blog, it is not an EMC blog.
