Brook Reams

VCS Technology and Equal Cost Multi-Path for Layer 2

by Community Manager on 12-21-2010 11:32 AM

In a previous blog about Brocade Trunks, I said I'd post more about how VCS Technology implements Equal Cost Multi-path (ECMP).  ECMP has traditionally been associated with Layer 3; with VCS Technology, it is now available for Layer 2 networks.  But what is ECMP, and why does it matter?  In this diagram I’m showing a simple example.

ECMP-Diagram-1.JPG

Let’s look at the two VDX switches labeled VDX-RB1 and VDX-RB2.  You will notice two paths are shown, one labeled “30G” and the other “10G”.  Multiple inter-switch links (ISLs) between two VDX switches automatically form a Brocade Trunk, as I described in this blog post.  All that’s required is that the ports be in the same hardware port group.  A Brocade Trunk can contain up to eight 10 GbE ISLs.

In the diagram, the thicker path between switches VDX-RB1 and VDX-RB2 is a Brocade Trunk, and the “30G” label indicates that this trunk has three ISL connections providing 30 Gbps of bandwidth.  The second path is a single ISL; its “10G” label indicates 10 Gbps of bandwidth.  There are other links from VDX-RB1 and VDX-RB2 to a third switch, VDX-RB3, and each of these is a single ISL connection, as indicated by the “10G” label.

From the standpoint of ECMP for traffic flowing between Server-1 and Server-2, there are two shortest paths available, since the Brocade Trunk is treated as a single logical path even though it contains multiple physical links.  In classic Ethernet with STP or LAG, only one of these two paths can be active at a time to prevent forwarding loops.  But with VCS Technology, both of these paths are available by default, because link-state routing designed to work at Layer 2 is used to eliminate forwarding loops.

In the VCS implementation of an Ethernet fabric, the path cost is determined not by the bandwidth of the path but by the number of switches in the path.  Each connection between switches is commonly referred to as a hop.  In the diagram, there are paths with one hop between VDX-RB1 and VDX-RB2 and paths with two hops by way of VDX-RB3.  The path cost of the two-hop paths is twice the path cost of the one-hop paths, so they are not the lowest-cost paths for traffic flowing between Server-1 and Server-2.
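
The hop-count rule can be sketched in a few lines of Python.  This is only an illustration under my own assumptions, not the VCS implementation: the function name `lowest_cost_paths` and the link names are made up for the example, and each Brocade Trunk is modeled as one logical link, as described above.

```python
from collections import deque

def lowest_cost_paths(links, src, dst):
    """Return every minimum-hop path from src to dst as a list of link names.

    links is a list of (link_name, switch_a, switch_b) tuples. A trunk is
    entered once, because it behaves as a single logical path.
    """
    adjacency = {}
    for name, a, b in links:
        adjacency.setdefault(a, []).append((name, b))
        adjacency.setdefault(b, []).append((name, a))

    # Breadth-first search records the hop count from src to each switch.
    hops = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for _, nxt in adjacency.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    if dst not in hops:
        return []  # dst is unreachable

    # Walk back from dst, stepping only to switches one hop closer to src.
    def walk(node):
        if node == src:
            return [[]]
        paths = []
        for name, prev in adjacency[node]:
            if hops.get(prev) == hops[node] - 1:
                paths.extend(p + [name] for p in walk(prev))
        return paths

    return walk(dst)

# Hypothetical names for the links in the first diagram.
links = [
    ("trunk-30G", "VDX-RB1", "VDX-RB2"),
    ("isl-10G", "VDX-RB1", "VDX-RB2"),
    ("isl-RB1-RB3", "VDX-RB1", "VDX-RB3"),
    ("isl-RB2-RB3", "VDX-RB2", "VDX-RB3"),
]
print(lowest_cost_paths(links, "VDX-RB1", "VDX-RB2"))
# [['trunk-30G'], ['isl-10G']]
```

Both one-hop paths come back as equal cost regardless of their bandwidth, and the two-hop route through VDX-RB3 is left out because it costs more.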

To decide which path to use for a particular flow, ECMP uses a seven-tuple hash to ensure uniform flow balancing across all the lowest-cost paths.  See Chip Cooper's summary of hashing to appreciate how IP and FCoE traffic are handled by the VCS Ethernet Fabric hashing algorithm.
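
Here is a rough sketch of the general idea of hashing a flow onto one of several equal-cost paths.  The seven fields in `Flow` are my assumption for illustration (this post does not list the actual fields), and CRC32 is only a stand-in for whatever hash the hardware uses.

```python
import zlib
from typing import NamedTuple

class Flow(NamedTuple):
    # Assumed seven-tuple; the real field list is not given in this post.
    src_mac: str
    dst_mac: str
    vlan: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int

def pick_path(flow, paths):
    """Map a flow to one of the equal-cost paths.

    The same flow always hashes to the same path, so all of its frames
    follow one route; different flows spread across the paths.
    """
    key = "|".join(str(field) for field in flow).encode()
    return paths[zlib.crc32(key) % len(paths)]

flow = Flow("00:05:1e:00:00:01", "00:05:1e:00:00:02", 10,
            "10.0.0.1", "10.0.0.2", 49152, 80)
print(pick_path(flow, ["trunk-30G", "isl-10G"]))
```

The point of hashing on flow fields instead of picking a path per frame is that every frame of a given flow takes the same path between switches.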

-------------Updated on 2013-10-24------------

NOTE: There has been some confusion about how the VCS Fabric hashing process works when equal-cost paths have different bandwidth, e.g., the 10 GbE and 30 GbE equal-cost paths between VDX-RB1 and VDX-RB2 above. Some have concluded (incorrectly) that traffic allocation to equal-cost paths with different bandwidths ignores the difference in bandwidth.  Well, no, that's not what happens.

Here is a nice summary of how the bandwidth on equal-cost paths affects traffic allocation in a VCS Fabric.

--> http://packetpushers.net/understanding-brocades-isls-and-ecmp-just-a-wee-bit-more/

In Summary:
The available bandwidth of each equal-cost path is considered so flows are intelligently allocated for full bandwidth utilization.
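
One way to picture bandwidth-aware allocation is a weighted hash, where each path gets a share of the hash space proportional to its bandwidth.  This is a sketch of that general technique and an assumption on my part, not a description of the actual VCS mechanism; `pick_weighted_path` and the flow keys are hypothetical.

```python
import hashlib

def pick_weighted_path(flow_key, paths):
    """Choose a path with probability proportional to its bandwidth.

    paths is a list of (name, bandwidth_gbps) with integer bandwidths.
    """
    total = sum(bandwidth for _, bandwidth in paths)
    if total <= 0:
        raise ValueError("no usable bandwidth")
    digest = hashlib.sha256(flow_key.encode()).digest()
    slot = int.from_bytes(digest[:8], "big") % total
    # Each path owns a contiguous range of slots as wide as its bandwidth.
    for name, bandwidth in paths:
        if slot < bandwidth:
            return name
        slot -= bandwidth

paths = [("trunk-30G", 30), ("isl-10G", 10)]
counts = {"trunk-30G": 0, "isl-10G": 0}
for i in range(4000):
    counts[pick_weighted_path(f"flow-{i}", paths)] += 1
print(counts)
```

With these weights, roughly three out of every four flows should land on the 30G trunk and the rest on the 10G ISL.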

----------------------------------------------------------

Once a flow has been allocated to a path, Brocade Trunking optimizes use of the physical links in that path.  It uses hardware to stripe frames across all the links in the trunk, so it achieves near 100% utilization of the links.  Classic Ethernet LAG relies on hashing to assign each flow to a single link, so a few large flows can keep one link busy while the others sit idle, and for this reason it cannot achieve the same utilization.
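
A toy comparison makes the difference concrete.  In this sketch, `lag_load` pins each flow to one link (using `flow_id % link_count` as a stand-in for a hash), while `striped_load` spreads frames round-robin across the links.  Both function names and the traffic numbers are invented for the example, and the model ignores frame sizes and ordering.

```python
def lag_load(flows, link_count):
    """Per-link frame counts when every flow is pinned to a single link."""
    load = [0] * link_count
    for flow_id, frames in flows:
        load[flow_id % link_count] += frames  # stand-in for a flow hash
    return load

def striped_load(flows, link_count):
    """Per-link frame counts when frames are striped across all links."""
    load = [0] * link_count
    next_link = 0
    for _, frames in flows:
        for _ in range(frames):
            load[next_link] += 1
            next_link = (next_link + 1) % link_count
    return load

# One large flow and two small ones, as (flow_id, frame_count).
flows = [(0, 900), (1, 50), (2, 50)]
print(lag_load(flows, 3))      # [900, 50, 50]
print(striped_load(flows, 3))  # [334, 333, 333]
```

With per-flow pinning, the large flow is limited to one link no matter how idle the others are; with striping, the same traffic is spread almost evenly.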

Let’s assume someone accidentally unplugs one of the cables in the 30G Brocade Trunk.  What happens now?  I show that in the following diagram where the trunk is now labeled 20G since the available bandwidth is 20 Gbps.

ECMP-Diagram-2.JPG

Well, any frames in flight on that ISL are lost, but any frames in flight on the remaining two ISLs are unaffected.  The Brocade Trunk is aware there are only two links now and stripes subsequent frames across the remaining links without having to halt traffic.  And when that cable is plugged in again, the Brocade Trunk immediately starts striping frames across all three links again.
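
The behavior can be modeled with a small class that stripes over whichever links are currently up.  This is a simplified sketch, not the hardware logic; the `Trunk` class, its method names, and the link names are all hypothetical.

```python
class Trunk:
    """Toy trunk that stripes frames round-robin over its active links."""

    def __init__(self, links):
        self.links = list(links)
        self._next = 0

    def send(self):
        """Return the link that carries the next frame."""
        link = self.links[self._next % len(self.links)]
        self._next += 1
        return link

    def link_down(self, link):
        # Later frames are striped over the remaining links only.
        self.links.remove(link)

    def link_up(self, link):
        # A restored link rejoins the stripe right away.
        self.links.append(link)

trunk = Trunk(["isl-1", "isl-2", "isl-3"])
print([trunk.send() for _ in range(3)])  # ['isl-1', 'isl-2', 'isl-3']
trunk.link_down("isl-2")                 # cable unplugged: 30G becomes 20G
print([trunk.send() for _ in range(4)])  # only isl-1 and isl-3 are used
trunk.link_up("isl-2")                   # cable plugged back in
print([trunk.send() for _ in range(3)])  # all three links are used again
```

Nothing outside the trunk has to change when a member link goes down or comes back; the path between the two switches stays the same and only its bandwidth changes.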

Okay, what will happen if the 30G and 10G paths between VDX-RB1 and VDX-RB2 are removed?  I show that in this diagram, where no direct connection between VDX-RB1 and VDX-RB2 is available.

ECMP-Diagram-3.JPG

In that case, ECMP determines that the least-cost path for this traffic is the two-hop path through VDX-RB3 and automatically forwards traffic on it.  Flows are allocated to the paths between switches using the seven-tuple hashing algorithm, so once again uniform flow balancing is achieved on the available paths.  Should one or more of the links between VDX-RB1 and VDX-RB2 be established again, ECMP automatically uses them for traffic between Server-1 and Server-2, since those links are the least-cost path for those traffic flows.

I want to repeat that the features of a Brocade Trunk and ECMP I have described are automatic and require no configuration by the administrator.  This really reduces management complexity.

I hope this helps explain how an Ethernet fabric using VCS Technology provides automatic path selection using ECMP and automatic frame striping across multiple links in a Brocade Trunk.