Welcome back to the next article in this series. This one took a little longer to come out thanks to our "friend" the coronavirus, which in the last couple of weeks has confined most of Europe and America to our homes. With that said, I hope you and all your loved ones are safe and healthy, and please do stay home and only go out if strictly necessary! Without further ado, let's get on with it, and I hope this article proves an interesting read that can take your mind off the pandemic for a few minutes.
As I mentioned at the end of the last article, one of the typical arguments you’ll see Fibre Channel proponents make (and obviously Brocade makes this argument too) is that Fibre Channel is a technology that was purpose-built for storage. But is that exactly so? I’m not old enough, although not by much, to have been there for the original discussions behind the creation of Fibre Channel as a new networking protocol. Companies like Brocade, Gadzoox, Ancor, Vixel and other startups certainly did focus primarily on the storage use case for the products they were developing in the mid-’90s. Fibre Channel itself, however, was designed as a ‘transport’ protocol (although not in the Layer 4 sense defined in the 7-layer OSI model) that could, well, transport other protocols on top of itself.
Fibre Channel protocol stack
In fact, the highest layer of the Fibre Channel protocol stack, designated FC-4, defines the mapping of different Upper-Level Protocols (ULPs). As you can see from the diagram, taken from this tutorial, multiple ULPs were defined, including now-obsolete ones like HIPPI or IPI, networking ones like ATM or IP, and some that are still in use today like SCSI or SBCCS (FICON). Some of Brocade’s earliest customers were media companies running video streaming applications on IP over Fibre Channel (IPoFC), which at the time outperformed Ethernet by a significant margin: Ethernet was struggling to transition to gigabit speeds with inefficient TCP/IP software stacks, while Fibre Channel had highly efficient hardware-based stacks and was transitioning from 1 Gbps to 2 Gbps. Brocade switches supported some specific features for IPoFC, and all Fibre Channel HBA vendors had IP drivers in addition to their SCSI drivers for all major operating systems. However, we all know that it was ultimately the storage use case that propelled Fibre Channel to the position it is in today, and companies like Brocade to where we are, while other use cases slowly faded away. HBA vendors no longer have IP drivers, and Brocade Fabric OS no longer supports the handful of IPoFC-specific features it once did, even if IPoFC can still be used today, mainly for in-band management of Brocade switches. So even if Fibre Channel wasn’t designed exclusively for storage, it might as well have been, since it has received decades and millions of dollars of R&D pretty much exclusively for the storage use case, in both open systems and mainframe environments.
That led to generation after generation of ASICs, operating system releases and management software versions that, time and again, delivered more than just the speed bump that came with each generation: dozens of features designed to deliver thousands of concurrent mission-critical, high-performance, low-latency storage flows from source to destination more efficiently and reliably, while monitoring every single frame of every single flow in real time. This gives administrators the ability to measure I/O performance down to the individual storage LUN (or namespace ID, NSID, in the case of NVMe), and even down to the VM level in virtualized environments, measuring not only throughput but also latencies, IOPS, first-response times, pending I/Os and many other storage-specific metrics that are incredibly valuable to the storage administrator.
Fibre Channel implements a buffer-to-buffer flow control mechanism, by which every transmitter knows, upon link initialization, exactly how many buffers the receiver has to hold frames before it processes them. The receiver grants the transmitter as many buffer ‘credits’ as it has buffers. The transmitter then keeps track of how many of those buffers are still free by decrementing its credit counter by one every time it transmits a frame and incrementing it by one every time it receives a signal called Receiver Ready (R_RDY) from the receiver. When the counter reaches zero, the transmitter stops sending until a credit is returned. This simple, proactive mechanism ensures that the receiver is never overrun by more frames than it can hold in its buffers, which it would otherwise be forced to discard. It is the mechanism the NVMe-oF specification deems ‘ideal’ for transporting NVMe traffic over a network, as it is the same mechanism that PCIe implements for internal NVMe storage inside a server. Buffer-to-buffer flow control has proven over decades to be a very good and reliable flow control mechanism for handling storage flows in a network.
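To make the credit accounting concrete, here is a minimal Python sketch of the mechanism described above. It is a toy model, not an implementation of the Fibre Channel standard: the `Receiver` and `Transmitter` classes, their method names, and the two-buffer link are all assumptions made for illustration.

```python
from collections import deque


class Receiver:
    """Toy receiver with a fixed number of frame buffers (hypothetical model)."""

    def __init__(self, num_buffers):
        self.num_buffers = num_buffers
        self.buffers = deque()

    def accept(self, frame):
        # With correct credit accounting on the transmitter, this never fires.
        if len(self.buffers) >= self.num_buffers:
            raise OverflowError("receiver overrun: frame would be discarded")
        self.buffers.append(frame)

    def process_one(self):
        """Free one buffer; True means an R_RDY should go back to the sender."""
        if not self.buffers:
            return False
        self.buffers.popleft()
        return True


class Transmitter:
    """Toy transmitter that only sends while it holds at least one credit."""

    def __init__(self, receiver):
        self.receiver = receiver
        # Credits granted at link initialization: one per receiver buffer.
        self.credits = receiver.num_buffers

    def try_send(self, frame):
        if self.credits == 0:
            return False  # out of credits: hold the frame instead of sending
        self.credits -= 1
        self.receiver.accept(frame)
        return True

    def on_r_rdy(self):
        # One R_RDY returns exactly one credit.
        self.credits += 1


rx = Receiver(num_buffers=2)
tx = Transmitter(rx)

# Two frames fit; the third is held back because no credits remain.
sent = [tx.try_send(f"frame-{i}") for i in range(3)]
print(sent)  # [True, True, False]

# The receiver frees a buffer and signals R_RDY, so the held frame can go.
if rx.process_one():
    tx.on_r_rdy()
resent = tx.try_send("frame-2")
print(resent)  # True
```

The point of the sketch is that the transmitter never needs to guess: the frame that would have overrun the receiver is simply held at the sender until a credit comes back, so nothing is dropped.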
Since our first-generation ASIC, Brocade Fibre Channel ASICs have included a feature called Virtual Channels (VCs). VCs automatically segment every ISL between two Brocade switches into a number of ‘lanes’, each with its own dedicated buffer pool, to provide independent flow control for each one. In addition, a small number of VCs are dedicated to the special traffic that runs the distributed fabric services, known as ‘Class F’ traffic, so that this traffic always has a dedicated maximum-priority lane even under extreme congestion, and the fabric never becomes unstable because the switches cannot communicate with each other.
Brocade Virtual Channels inside an ISL
VCs enable the isolation, categorization, protection and prioritization of storage flows so that congestion events affecting one or some of them don’t affect all of them. If a storage device becomes unresponsive and turns into a slow-drain device — a source of congestion in the network — its traffic flows start to experience increased latencies. If not acted upon, this will create ‘back-pressure’ on the network — something that is inherent to any flow-controlled network that cannot allow for frames to be dropped when there is congestion — and could potentially affect multiple unrelated ‘victim’ flows, impacting their application performance significantly and potentially causing serious business consequences. Brocade Fibre Channel fabrics can automatically detect these increased latency conditions and ‘quarantine’ slow flows into low-priority VCs so that their performance degradation doesn’t affect other storage flows.
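The idea of isolating a slow flow into its own lane can be illustrated with a short Python sketch. This is not how Brocade ASICs or Fabric OS actually assign Virtual Channels: the lane numbers, the latency threshold, and the modulo-based spreading of healthy flows are all hypothetical choices made only to show the principle.

```python
from dataclasses import dataclass

QUARANTINE_VC = 0             # hypothetical low-priority lane for slow flows
DATA_VCS = [1, 2, 3, 4]       # hypothetical lanes for normal user traffic
LATENCY_THRESHOLD_US = 500.0  # illustrative threshold, not a product default


@dataclass
class Flow:
    name: str
    destination_id: int
    latency_us: float


def assign_vc(flow):
    """Pick a lane: slow flows are quarantined, healthy flows are spread out."""
    if flow.latency_us > LATENCY_THRESHOLD_US:
        return QUARANTINE_VC
    return DATA_VCS[flow.destination_id % len(DATA_VCS)]


flows = [
    Flow("db-to-array-a", destination_id=5, latency_us=120.0),
    Flow("vm-to-array-b", destination_id=6, latency_us=95.0),
    Flow("host-to-slow-device", destination_id=7, latency_us=4200.0),
]

lanes = {flow.name: assign_vc(flow) for flow in flows}
print(lanes)
# {'db-to-array-a': 2, 'vm-to-array-b': 3, 'host-to-slow-device': 0}
```

Because each lane has its own buffer pool and its own flow control, the back-pressure created by the quarantined flow stays in lane 0 instead of stalling the healthy flows in lanes 2 and 3.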
Virtual Channels technology has been available since our first-generation ‘Stitch’ ASIC that ran at 1 Gbps and will continue to be available in our next-generation ‘Condor 5’ ASIC that will support Gen 7 Fibre Channel running at 64 Gbps and 256 Gbps. What we have done over the years is increase the number of VCs that are available for end-user traffic as well as develop software features that better take advantage of this technology, like Slow-Drain Device Quarantine (SDDQ) or Quality of Service (QoS).
As soon as we started taking storage outside of the servers and it was no longer connected to the CPU via internal buses, we started to realize, as an industry, just how important it was to guarantee that an application’s or operating system’s storage resource never became unavailable. I’m pretty sure you know what happens when you disconnect a computer’s hard drive (or SSD these days) while it’s running. Now imagine the consequences when this happens to thousands of VMs running off a SAN-attached array, or to a mission-critical application whose database becomes unavailable during operation. Not only will the application or the entire operating system of the server or VM crash, but there could easily be data corruption leading to extended periods of downtime with disastrous business consequences, including going out of business.
For this reason, we started to develop technologies and best practices to ensure that there was never any single point of failure (SPoF) in a networked storage environment. In addition to redundant disk drives inside the storage arrays, with data mirrored or striped across multiple drives to withstand individual drive failures (a technology known as RAID), and redundant controllers on the storage arrays paired with redundant adapters on the servers and a multipathing driver on the operating system, organizations started to deploy what has come to be known as ‘dual fabrics’: two separate, completely air-gapped, no-single-cable-between-them storage fabrics, so that a failure event on one of them could not, under any circumstance, ever affect the other. This is only possible through complete physical isolation of the two fabrics, and that is why it is of utmost importance that these fabrics are physically air-gapped. Otherwise, no matter how much redundancy you build into it, you still have a single fabric, and therefore a potential single point of failure. Cynics would claim (and yes, I have heard this argument being made) that this was all a ploy by greedy Fibre Channel vendors to convince customers that they needed to buy double the equipment. Of course, this argument is quite ridiculous and can be easily refuted by showing that two different types of redundant networks can be built with exactly the same number of devices, as in the diagram below.
Redundancy in Fibre Channel (left) and in Ethernet/IP (right)
Does this mean that Ethernet/IP networks are not highly available? No. Does this mean that Ethernet/IP networks cannot be built with dual, air-gapped fabric redundancy, and therefore can never be as highly available as a dual, air-gapped Fibre Channel storage network? In a way it does. Remember that Ethernet/IP networks need to support a wide variety of use cases, the primary one being TCP/IP communications between applications, between clients and servers, and between devices inside the data center and the outside world, or between a company’s campus network and the internet. Redundant links between Ethernet switches are based on LAG (remember, there can be no loops), which at its inception could only work between a single source switch and a single destination switch. Over the years, technologies like MLAG (not really a single technology or standard, but rather a myriad of vendor-specific proprietary implementations) were developed to overcome this limitation. However, they are based on making two switches behave as a single one, which generally requires complex and laborious configuration steps, as well as dedicated links between the switches for heartbeat and synchronization purposes. Similarly, redundancy at the IP layer between an end device and the network requires two adapters on the device to ‘team up’ (this is in fact called ‘NIC teaming’), behave like one, and present a single IP interface with a single IP address to the network. Technically, you could run redundant air-gapped networks in Ethernet/IP if you had a dedicated network just for storage, but that is hardly ever the case. Remember that one of the arguments for using Ethernet/IP for storage is precisely that you’re not supposed to require dedicated infrastructure.
Redundancy in IP at the server level — NIC teaming
Similarly, specialized storage-focused performance monitoring, analytics, and troubleshooting tools don’t exist for Ethernet-based storage networks not because Ethernet is inherently inferior to Fibre Channel and these kinds of tools cannot possibly exist for it, but because there simply hasn’t been enough demand from customers or — as it usually follows — enough R&D time and money spent by vendors to develop said tools. Does this mean they can never exist? Of course not. But how likely is it for them to be developed, given that there is no single Ethernet switch vendor that is exclusively focused on storage? Once again, Ethernet — and Ethernet vendors — has to support an incredibly wide variety of applications and use cases, and storage is only one of them, and not precisely one of the most important ones when it comes to port shipments and revenue, so it is unlikely that anyone will invest the time and money to develop them. Instead, customers will be left to use general-purpose performance monitoring, analytics, and troubleshooting tools that aren’t specifically designed for storage — and therefore don’t provide the metrics that storage administrators really need — and that are typically based on sampling the network at rates which simply aren’t enough to understand the behavior of your storage flows or, more importantly, to be able to react fast enough when something is amiss.
Congestion in a network is not all that different from congestion on a road
Similar things could be argued about how well Ethernet-based storage networks deal with the coexistence of thousands of storage flows, and how they deal with congestion and back-pressure, depending on the flow control mechanism being used, whether it’s PFC alone, PFC in combination with ECN, or TCP with or without ECN. In other words, how reliably they can deliver storage flows from source to destination. Attempts at improving this are definitely made every now and then in the Ethernet space, like when DCB was developed to support FCoE, or the current DCTCP (Data Center TCP) initiative, which consists of new enhancements to ECN and TCP to improve exactly this in data center environments. Ethernet could even adopt buffer-to-buffer flow control if needed; if rumors are to be believed, this was proposed by a vendor (guess who?) when the industry was working on DCB and FCoE, but the proposal was rejected. But the reality is that little R&D time and money is spent on storage use cases for Ethernet, as they remain a drop in the bucket of the Ethernet market.
When it comes to performance, we have recently proven with a third-party validated report that Ethernet/IP-based storage technologies, iSCSI in particular, simply can’t take full advantage of the performance of modern all-flash storage arrays, or even fully utilize the network's nominal link bandwidth. If you are going to spend a significant amount of money on a fancy, high-performance all-flash array, I’m sure you’re going to want to take full advantage of that investment.
The question then becomes, how much high availability is enough? How much does a single percentage of application downtime cost me? Obviously, not all applications are the same. They don’t all require five-nines of availability, and there can be many use cases for which Ethernet/IP and the redundancy it is able to provide is good enough. How much performance is enough? Clearly, not every application you run is going to require the highest levels of performance, or microsecond-level response times. There will be many applications for which Ethernet/IP and the performance it is able to provide is just good enough. The same can be said about reliable delivery and congestion tolerance. In summary… how good is good enough?
There is no easy answer to this question, and in the vast majority of cases the answer is “it depends”. Each customer is going to have to decide what qualifies as ‘good enough’ for their applications, and there won’t be a one-size-fits-all answer anyway. Some of the services they offer will require the highest levels of availability, some will require the best performance possible, and for many others it will be all about cost-effectiveness and not necessarily performance. What is important is that customers understand what each technology can offer so they can make well-informed decisions and choose the right option for each of their applications or services, without half-truths. And it doesn’t have to be a single solution for everything; it is perfectly possible for more than one storage infrastructure solution to coexist in a customer’s data center, each offering what it is best at.
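To put the availability part of the ‘good enough’ question into numbers, the short Python sketch below converts an availability target into the downtime it allows per year. It is plain arithmetic; the function name and the choice of a 365-day year are assumptions made for illustration, and it says nothing about what any particular network technology actually achieves.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability):
    """Minutes of downtime per year permitted by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


targets = [("two nines", 0.99), ("three nines", 0.999), ("five nines", 0.99999)]
for label, availability in targets:
    minutes = downtime_minutes_per_year(availability)
    print(f"{label} ({availability:.3%}): {minutes:.2f} minutes/year")
```

Five nines allows a little over five minutes of downtime per year, while two nines allows more than three and a half days. Which of those an application can live with is exactly the per-application decision discussed above.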
I hope that, after reading these blog posts, you are better prepared to understand the unique features that Fibre Channel brings to the table and how they compare with what other storage networking protocols or alternative storage infrastructure technologies offer, so you are better prepared to make the right decision for your business. If I have helped a single customer make the right call, I will consider this effort successful.
In the final entry in this series I will talk about what I believe is still a bright future for Fibre Channel, as it continues to be positioned as the gold standard for storage for the most demanding applications in the most demanding data centers of the world.
Until then, if you want to learn more about some of the tools and capabilities that make Brocade Fibre Channel technology the best to transport storage flows, please check out the following documents:
If you missed the previous entries in this series, make sure to check them out here: