UPDATED June 10th, 2009 to incorporate more on lossless Ethernet (was in “read more” section, pulled to front)
Congratulations to everyone who worked on the standard.
The FCoE standard celebrates its one week anniversary today - the T11 standards body ratified FCoE as a standard in FC-BB-5 on Wednesday June 3rd.
I’m not so much an FCoE true believer as I am an Ethernet true believer. Coming to EMC from an iSCSI startup – I guess it was in the water cooler :-) As I’ve interacted with more customers, I’ve gained a better understanding of why FCoE is important. It offers a chance for a unified interconnect covering a gamut of use cases. Since all FCoE adapters are by definition Converged Network Adapters, a customer can use one adapter for 10GbE LAN, 10GbE NAS, 10GbE iSCSI, and Fibre Channel via FCoE.
I still stick by the bet I made with Chuck (come on iSCSI!) – he and I disagree often, but boy, the conversation is always fun :-)
http://chucksblog.typepad.com/chucks_blog/2007/10/yes-we-occasion.html
And you can see I’ve been interested in this for some time:
http://virtualgeek.typepad.com/virtual_geek/2008/06/10-gigabit-ethe.html
(BTW – this was June 2008, after a couple of months of being “deep dive” introduced to what was at that point the very confidential Cisco project code-named California)
So why is it important? On its own, iSCSI does not offer what FCoE brings to the table: a real chance to consolidate the networks for all current use cases/workloads (by eliminating the “but what about this host here”… excuse), which in turn enables reductions in cable and port count, lower space/power requirements, and a “cable once” (at least within major generational changes) model.
This doesn’t mean I’m saying iSCSI is bad, or even “not as good” as FCoE. iSCSI is undoubtedly less expensive, and runs on a much, much broader set of equipment. iSCSI is the fastest growing storage segment by a long shot for all these reasons, and is often the protocol of choice for customers with no existing shared storage infrastructure. EMC is (and certainly I count myself in that) a huge iSCSI supporter. However, iSCSI does fundamentally have lossy characteristics (relying on TCP retransmits for data integrity), longer TCP/IP timeout characteristics, and other attributes that keep it from covering every FC use case. But without being able to cover the remaining use cases for FC, there wouldn’t be an opportunity for convergence of transport. FCoE is that opportunity.
UPDATE:
Now, there’s still more work to be done – as St0ragebear pointed out in the comments, and I think it’s worth pulling up to the front of this article. First of all, now that the T11 standard work is complete, there are other steps for it to become an ANSI standard – including a public review period. The bigger point was originally in the “read more” section, but I’ll pull it up here: Lossless Ethernet (CEE, DCE, IEEE Data Center Bridging) is still not a standard, and will take more time. This is an important part of the FCoE idea. The IEEE 802.1Qbb (Priority Flow Control) project is approved, but the standard is not done. You can find more about that here: http://www.ieee802.org/1/pages/dcbridges.html … Now back to the original text….
I’m a glutton for punishment and love this stuff – it’s fascinating to look back at the minutes, see the progress, who’s driving what, and watch company names change (Nuova Systems). You can see it all here: http://www.t11.org/t11/docreg.nsf/gfcoemi?OpenView. It’s an easy way to see who’s driving, who’s participating (and how actively), and who’s following – at least from an engineering and standards standpoint.
Stuart Miniman from our office of the CTO (who was part of the standard process) does a very good video update on FCoE (including examples of Qlogic Gen 1 and Gen 2 Converged Network Adapters or CNAs):
What does this have to do with VMware? VMware is one of the earliest, most potent use cases for FCoE and 10GbE generally. I said it back in June 2008 (and have said as much at various VMworld sessions with Cisco and VMware) and I’ll say it again – massive consolidation, coupled with massive multicore and huge, cheap RAM, moves the bottleneck to the server I/O layer. It’s fun to look back a year later and be borne out by what’s happened since.
EMC has been supporting FCoE for some time, and now with the standard ratified, it’s exciting times!
Read on for more gory details!
While FCoE devices (CNAs, switches, targets) have been shipping for some time, up till now everything has been pre-standard. And in this case, pre-standard was missing important pieces – not the least of which was the FCoE Initialization Protocol (FIP), which is really important in cases where there is more than one switch involved – pretty important for any real network. These pre-standard devices were colloquially called “pre-FIP” (think “802.11 pre-N” wireless router). Every FC-BB-5 compliant device must support FIP.
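If you’re curious what “FIP vs. pre-FIP” looks like on the wire, the quickest tell is the EtherType: FIP control traffic (discovery, FLOGI, keep-alives) uses EtherType 0x8914, while the encapsulated FC data frames use 0x8906. Here’s a minimal, illustrative Python sketch (not production code) that classifies raw Ethernet frames on that basis:

import struct

# EtherType values used by FCoE (encapsulated FC frames) and FIP (the
# FCoE Initialization Protocol control frames).
ETHERTYPE_FCOE = 0x8906
ETHERTYPE_FIP = 0x8914
ETHERTYPE_VLAN = 0x8100  # 802.1Q tag

def classify_frame(frame):
    """Roughly classify a raw Ethernet frame by its EtherType."""
    # The EtherType sits right after the 6-byte destination and source MACs.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_VLAN:
        # Skip the 4-byte 802.1Q tag and read the inner EtherType.
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    if ethertype == ETHERTYPE_FIP:
        return "FIP control frame (discovery, FLOGI, keep-alive)"
    if ethertype == ETHERTYPE_FCOE:
        return "FCoE data frame (encapsulated FC frame)"
    return "other Ethernet traffic"

Run a packet capture through something like this and a pre-FIP environment shows only 0x8906 data frames, with none of the FIP login/keep-alive exchanges a standard multi-switch fabric relies on.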
To date, the existing FCoE array targets have used CNA adapters configured in target mode with custom driver stacks constructed to work together pre-FIP (as were the initiators). Also, the early gen 1 CNAs (which were the initiators and targets out there) were massive, with totally separate network and FC ASICs (the early FCoE array targets used these devices). These gen 1 CNAs won’t be upgradable to the FC-BB-5 standard (at least I know for sure for one of them), which isn’t terrible on the host side (heck, put the VMware host in maintenance mode, vacate the VMs, replace, bring it up – rinse and repeat), but it is much more difficult on the target side.
When I first started digging deep here, I was frustrated we weren’t using these Gen 1 CNAs in target mode (so we could show “hey look, we’re actively supporting this!” in a 30-second sound-bite), but as I dug deeper and deeper I understood the rationale, and over time really came to appreciate the upside of the fact that we’ve moved to a common, modular I/O interface across our product families.
These UltraFlex I/O modules are hot-pluggable, non-disruptively upgradable PCIe modules (which can also be ultra dense). This means customers have an upgrade path. We ship 1GbE, 10GbE, 1GbE hardware-based iSCSI, 4Gb FC, and 8Gb FC modules, and now that the FCoE standard is ratified, we’re beavering away on an FCoE UltraFlex I/O module.
While this means that by definition EMC won’t be first to ship a product (since we have to engineer a module), it means that no matter what, there is a non-disruptive upgrade path POSSIBLE.
Put another way – while EMC may be disadvantaged from a market-positioning standpoint by not being first (the easiest route is to take an off-the-shelf PCIe CNA and configure it in target mode), the customer advantage is that there is no risk. A hardware upgrade on one of those off-the-shelf CNAs in a server is one thing (just VMotion the VMs off), but for an array target it’s a little more complex. My 2 cents – I think it’s better for the vendor to suffer a little on marketing positioning than it is to have a customer need to deal with any of the “new standard” mumbo jumbo. The flip side, I suppose, is that shipping early allows “banging out the early kinks.” My two cents if I were a customer – this is what EMC e-Lab is designed to do: we do it, and we take the responsibility end-to-end on your behalf (and they have indeed been going to town, and are now banging away at the FC-BB-5 based gear).
This is not to disparage the choice of others – only that as I dug deeper, I understood the choices our engineering folks made.
The decision was also fundamentally a pragmatic one. The bulk of the benefits of FCoE come from massive cable/port count, power, and management reductions at the host-switch part of the network. Over time, this will extend throughout the network, including the target.
A few common questions I get….
Q: Is 10GbE a standard?
A: Yes, there are several 10GbE standards. The primary differences are the physical layer and the corresponding distances and cable types (a quick lookup sketch follows the list). The major standards include:
- 10GBase-SR (fiber optic cables – the most common being the same orange (OM2) or aqua (OM3) multimode 850nm fiber optic cables used by Fibre Channel), which has moderate distances that vary depending on the cable (26-82m) and uses relatively expensive SFP+ modules
- 10GBase-CX4 (InfiniBand-style cables with large but lower-cost connectors), which has a short distance limit (15m)
- 10GBase-T (10 Gigabit Ethernet over unshielded twisted pair), which uses Cat 6 UTP over short distances (55m) and reaches the more traditional 100m with Cat 6a UTP
- 10GbE SFP+ Direct Attach (10GSFP+Cu), which runs twinaxial copper cable directly into very small SFP+ sockets, eliminating the optical elements, but is very short-haul only (10m)
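As a quick illustration of how those distance limits drive the media choice, here is a small, hypothetical Python lookup based only on the rough limits quoted above (real planning should check the exact cable grade, optics, and vendor specs):

# Hypothetical helper based only on the rough distance limits quoted above.
MEDIA_LIMITS_M = {
    "10GBase-SR (multimode fiber, SFP+ optics)": 82,   # 26-82m depending on cable
    "10GBase-CX4 (InfiniBand-style copper)": 15,
    "10GBase-T over Cat 6 UTP": 55,
    "10GBase-T over Cat 6a UTP": 100,
    "SFP+ Direct Attach twinax (10GSFP+Cu)": 10,
}

def candidate_media(run_length_m):
    """Return the media whose quoted distance limit covers the cable run."""
    return [name for name, limit in MEDIA_LIMITS_M.items()
            if run_length_m <= limit]

# Example: a 20m run inside a row rules out twinax and CX4.
print(candidate_media(20))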
Q: Is “Lossless Ethernet” a standard?
A: Not quite yet. IEEE 802.1Qbb is an approved project, but not currently an approved standard. This is important to deliver truly lossless Ethernet (meaning no Ethernet frame is lost). Remember that Ethernet was originally designed to be lossy, with it being “ok” for Ethernet frames to be dropped. Lossless behavior means that network interface cards and 10 Gigabit Ethernet switches must treat Ethernet frames carefully and communicate with each other – buffering traffic, applying priority controls, etc. The IEEE is still working on the applicable standards. You can find more here: http://www.ieee802.org/1/pages/dcbridges.html
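To make “priority controls” a little more concrete: the 802.1Qbb drafts describe Priority Flow Control as a MAC Control frame (EtherType 0x8808, opcode 0x0101) carrying a priority-enable vector and eight per-priority pause timers, so a congested switch can pause just the storage class rather than the whole link. Here is a simplified, illustrative Python sketch of such a frame (no minimum-frame padding or FCS, and the draft details could still change):

import struct

PAUSE_DEST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control address
ETHERTYPE_MAC_CONTROL = 0x8808
PFC_OPCODE = 0x0101  # classic link-wide PAUSE uses opcode 0x0001

def build_pfc_frame(src_mac, pause_quanta):
    """Build a simplified Priority Flow Control frame.

    pause_quanta is a list of 8 values, one per priority class; a
    non-zero value asks the link partner to pause that priority.
    """
    assert len(pause_quanta) == 8
    # Bit i of the priority-enable vector marks priority i as paused.
    enable_vector = sum(1 << i for i, q in enumerate(pause_quanta) if q)
    payload = struct.pack("!HH8H", PFC_OPCODE, enable_vector, *pause_quanta)
    header = PAUSE_DEST_MAC + src_mac + struct.pack("!H", ETHERTYPE_MAC_CONTROL)
    return header + payload

# Example: pause only priority 3 (a common choice for FCoE traffic).
frame = build_pfc_frame(bytes(6), [0, 0, 0, 0xFFFF, 0, 0, 0, 0])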
Q: Where can I go to learn more about FCoE?
A: As a new technology, one of the best places to find information is the standards body – in this case T11. You can see by looking at the minutes that Cisco is front and center here. EMC’s David Black (EMC Office of the CTO) is a loud voice there in the standards body (David – you are Mr. Keen – I think you had a near 100% attendance record :-) ).
Also, Cisco maintains a popular site:
Stuart Miniman (EMC Office of the CTO) maintains a great deal of great FCoE content here:
http://nohype.tumblr.com/tagged/fcoe
EMC whitepaper on FCoE: http://www.emc.com/collateral/hardware/white-papers/h5916-intro-to-fcoe-wp.pdf
Sorry to be nit-picky, but it would be more accurate to say that the FCoE *draft* standard is complete. The FC-BB-5 draft has been forwarded to INCITS for Public Review. The Public Review process is a 45-day period that leads to the adoption of the draft as an ANSI standard. So it's not quite soup yet.
Besides, FCoE was the easy part - the underlying CEE is still a year out. The IEEE and IETF still need to finalize CEE, DCBX, and TRILL. The reality is that using Ethernet PAUSE or Per-Priority Pause to recreate the flow control that's built into Fibre Channel is going to be a lot of work from a standards perspective.
Posted by: st0ragebear | June 10, 2009 at 09:46 AM
St0ragebear - that's not being nit-picky - that's VERY IMPORTANT. I originally pointed out the state of the lossless ethernet elements in the "read more" section, but on reflection after your comment - I decided to pull it up front more prominently.
It's up there - let me know if you think it's more clear.
Thanks again - and corrections to anything I write are ALWAYS welcome!
Posted by: Chad Sakac | June 10, 2009 at 11:35 AM
Well, we are in the process of building a new DC, and after EMC World and getting a good idea of where FCoE is, I am strongly pushing that we adopt it for our vSphere infrastructure, which will be primarily on blades - which I think is where FCoE will have a huge impact: more bandwidth to blades, with fewer cables.
Posted by: David Robertson | June 10, 2009 at 10:37 PM
Hi Chad,
Your post raised some questions for me.
What are the use cases that FCoE covers and iSCSI with 10Ge doesn't?
What does FCoE "bring to the table" that iSCSI with 10Ge doesn't?
TCP guarantees delivery, so how is iSCSI "lossy?"
In the FCoE/DCE stack is there *ever* a lost frame? Is there a re-transmit capability in that stack to handle lost or corrupt frames? That would sound "lossy" to me as well.
Is the sole advantage of FCoE quick recovery from transmit failures?
Thanks.
Posted by: Charlie Dellacona | June 11, 2009 at 09:44 AM
Hi Charlie,
I've worked on both the iSCSI and FCoE standards, so I'll try to answer your questions:
FCoE Use case: FC SANs are already installed, a new server rack arrives cabled only with 10Gig Ethernet, but the administrator doesn't want to configure or manage an iSCSI/FC gateway in order to access the existing storage systems.
What FCoE brings to the table: Transparent connectivity to existing FC SANs, including extension & reuse of existing FC-based management software and practices.
Of course, not every facility has FC SANs installed or wants to access existing FC-based storage from new servers. If one is starting from the proverbial "blank sheet of paper," the longer initial discussion should be about appropriateness of iSCSI and/or FC for the workload and facility (there's no single answer that covers all cases). For that sort of discussion, FCoE is a version of FC that runs over Ethernet. Management complexity and technology familiarity are considerations here (as they are in the use case above).
"lossy": Chad's wording about "lossy" is not the best ;-). What I think (hope) he meant to say is that by virtue of TCP, iSCSI copes much better with losses than FCoE. Losses are inevitable - sooner or later there will be a bad CRC (good links have very small bit error rates, but they're not zero).
Courtesy of TCP, iSCSI rides through an occasional loss nicely (e.g., TCP with SACK retransmits quickly without changing the window size or backing off). FCoE doesn't do anything at the FC level for disk I/O - eventually something at a higher level (e.g., a SCSI multipathing driver such as PowerPath) notices that the I/O didn't complete (i.e., times it out) and redrives the entire I/O. Before I get dinged for not mentioning it, tape (FC-TAPE) is different in that it can retransmit only a lost portion of an I/O at the FC level rather than having to redrive the entire SCSI I/O.
Because FCoE has no retransmission counterpart to TCP, it really has to run over CEE Ethernet that is designed and engineered (both links and network topology) to not drop packets for congestion or flow control reasons. iSCSI is much more tolerant of "stupid network tricks", and will run over "lossy" networks that one should never try to use FCoE on.
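If it helps to see the contrast spelled out, here's a toy sketch (illustrative Python, not real initiator or driver code, with made-up loss rates) of the two recovery behaviors described above:

import random

def send_with_iscsi(scsi_io, drop=lambda: random.random() < 0.01):
    """iSCSI-style recovery: a dropped segment is retransmitted by TCP
    (quickly, with SACK) below the SCSI layer, so the I/O simply
    completes a little later and is never re-issued."""
    tcp_retransmits = 0
    while drop():
        tcp_retransmits += 1  # handled entirely by TCP; SCSI never notices
    return {"status": "complete", "tcp_retransmits": tcp_retransmits}

def send_with_fcoe(scsi_io, drop=lambda: random.random() < 0.01,
                   max_redrives=3):
    """FCoE-style recovery for disk I/O: no FC-level retransmission, so a
    dropped frame only surfaces when a higher layer (e.g. a multipathing
    driver) times the I/O out and redrives the *entire* I/O."""
    for redrives in range(max_redrives + 1):
        if not drop():
            return {"status": "complete", "redrives": redrives}
        # (stand-in for waiting out a long SCSI I/O timeout here)
    return {"status": "failed", "redrives": max_redrives}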
I hope this helps.
Posted by: David Black | June 11, 2009 at 10:05 PM
David,
Thank you for your reply, it's nice to hear from an expert.
Re the use case, I was looking for something that FCoE could do that iSCSI could not that would be of value in the data center. I am not aware of any.
Overall your reply suggests that FCoE is an intermediary technology between FC and iSCSI to lessen disruption in transition. If that's the case, I'd have to wonder whether it was worth the effort that network and HBA vendors put into it and the expense customers will be hit with to adopt it. Intermediate steps to the future just bring extra cost and delay the benefit of adoption, particularly so when there is no re-use in the intermediate technology. Usually it's better to just bite the bullet.
Posted by: Charlie Dellacona | June 15, 2009 at 08:23 AM
Charlie,
Since FCoE (via FCP) and iSCSI are both SCSI transports, it's not surprising that they have similar functionality. Since you asked, one thing that FCoE can do in principle that iSCSI cannot do is mainframe storage (i.e., FICON over FCP). Whether that'll happen is up to the mainframe folks, but it is a distinct possibility.
There are a variety of views on the relative roles of FC and iSCSI; yours seem to run towards the "iSCSI will make FC obsolete" end of the spectrum. I believe that the two technologies will co-exist for the foreseeable future (3-5 years). When FC is the storage networking technology of choice, FCoE offers significant benefits for rack-scale server integration, which is a clear trend.
Posted by: David Black | June 15, 2009 at 11:39 AM
Chad, I have a question for you, since you have experience with iSCSI: what is your opinion concerning isolating iSCSI traffic? The reason I am asking is we recently encountered a serious latency issue with one of our ESX clusters that's connected to a Dell MD3000i JBOD. The interconnects are (2) CAT3750G switches that are stacked, serving other Windows clients along with the ESX hardware initiators and the MD3000i targets, though they are in a dedicated VLAN. However, I found the VLAN tagging was being processed at our core (CAT6509), which is located across the street, and the uplinks traverse approx 900 feet of dark fiber.
Now the problems started to really manifest shortly after introducing the second ESX node, because initially we thought our performance issues were due to a lack of server compute resources in trying to serve 150 VMs on one host. I know that sounds like a lot, but a majority of those VMs are delta images and Linux virtual routers from our VM Lab Manager environment. However, the performance worsened (the ESXTOP command queue latencies were 10-20K ms) even while VMs were turned off, so we disconnected the ESX hosts from the switches and went direct to the MD3000i, and the difference was immediate as the command queue latencies dropped to an avg of 4-10ms.
Now we think the combination of the VLAN overhead and the fact that the gateway was in another building was no doubt contributing to this. However, our network guy, as usual, doesn't buy it and doesn't think we need to have a parallel dedicated network.
I would really like your opinion and/or what your recommendation would be in our case. Also, we engaged Dell on their PowerConnect switches, as they are a cheap alternative to Cisco.
Thank you
Posted by: Matthew Reed | July 06, 2009 at 08:34 PM
@Matthew - uggh. Whenever I say "make sure your switches have enough port buffers"... What I am kinda saying is "be careful if using 3750s" (the Catalyst 6500s are much, much beefier).
Don't underestimate the performance needed by the Lab Manager use case - although they are deltas (Lab Manager uses the linked-clone functionality that has been hidden in ESX for some time, and is now used by VMware View Composer as well), it doesn't change the IO (MBps/IOps) they need, only the capacity they need.
The distance itself isn't necessarily the issue - I would be surprised if it were. Latency of a link (particularly a dark fiber connection) is microseconds. So long as it's not routed, it's not deadly.
BUT a model where you don't have good control of the network makes it very tough to follow the "when doing iSCSI/NFS for VMware, follow 'bet the business' Ethernet infrastructure" guideline.
The long and short of it: when iSCSI moves from "playing with it" to "building my storage network", you should build an IP storage network that looks, topologically, a lot like an FC network.
I would either use dedicated switches or basic port-based VLANs. Sounds to me (more information is needed) like you've got a private VLAN.
It's funny, and I am a SUPER iSCSI fan, but many, many times the initial "oh, if we go iSCSI instead of FC, we could save $500 per port!" benefit quickly evaporates when it gets hairy.
Personally, if it were me, I would bring the data to the networking guy, and highlight that taking the network out of the loop resolved the issue. If he didn't want to buy it (this will be an interesting thing as the Ethernet and FC networks converge), then I would say "screw it", and just build an isolated iSCSI network with dedicated switches.
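One quick sanity check you can do yourself (a toy Python sketch, with hypothetical addresses - substitute your real initiator and target IPs/masks) is to verify the iSCSI traffic at least stays on one subnet, rather than being routed through that gateway across the street:

from ipaddress import ip_interface

def same_l2_segment(initiator, target):
    """True if both addresses fall in the same subnet, i.e. the iSCSI
    traffic should stay on the local segment instead of hitting a
    gateway (possibly one across the street)."""
    return ip_interface(initiator).network == ip_interface(target).network

# Hypothetical addresses for illustration only.
print(same_l2_segment("10.10.10.11/24", "10.10.10.50/24"))  # True: stays local
print(same_l2_segment("10.10.10.11/24", "10.20.30.50/24"))  # False: gets routed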
Posted by: Chad Sakac | July 06, 2009 at 10:46 PM
It's been almost 18 months - have there been any glaring vulnerabilities with FCoE? I'm about to do a major upgrade and wanted to see if there were any huge problems.
Posted by: Dedicated Server | November 11, 2010 at 01:46 PM
@Dedicated Server - I think you're spam, but not sure :-)
I'll give the benefit of the doubt.
I know of many customers who are happily now deploying FCoE. Many of the early bumps are out.
Posted by: Chad Sakac | November 12, 2010 at 11:07 AM