Seeing the Forest for the Trees: The Limitation of Spanning Trees in Emerging Data Center Environments

by Doug.Ingraham on 02-23-2010 11:50 AM

This is the first in a series of blogs I am writing to cover technical topics as the industry gears up to build next-generation data center architectures. These blogs are intended to provide a basic overview of the innovations that Brocade and the industry at large are developing.

In today's introductory blog, we take a step back with an overview of a technology that is very familiar in the Ethernet world: Spanning Tree Protocol (STP). If you know Ethernet networking, you know that STP has been the primary loop-prevention protocol in Layer 2 LANs since the early days of bridged Ethernet. Its primary goal is to create a loop-free topology that provides a single active path between any two network endpoints, which it does by blocking the redundant links between switches. While this approach has served the development of Ethernet well, STP has bandwidth and performance limitations, and ultimately cost and management limitations, in light of the advent of converged data center architectures.
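To make the "single active path" idea concrete, here is a minimal Python sketch. It is not how STP itself is implemented: real STP elects a root bridge by bridge ID and compares path costs exchanged between switches, whereas this sketch simply grows a breadth-first tree from a root chosen by hand. The four-switch topology, the equal link costs, and the function name `spanning_tree` are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical four-switch topology; every link is assumed to have equal cost.
LINKS = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]

def spanning_tree(links, root):
    """Return (forwarding, blocked) links for a breadth-first tree from root."""
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)

    visited = {root}
    tree = set()
    queue = deque([root])
    while queue:
        switch = queue.popleft()
        for peer in sorted(neighbors[switch]):
            if peer not in visited:
                # First time we reach this switch: keep the link that got us here.
                visited.add(peer)
                tree.add(frozenset((switch, peer)))
                queue.append(peer)

    forwarding = [link for link in links if frozenset(link) in tree]
    blocked = [link for link in links if frozenset(link) not in tree]
    return forwarding, blocked

forwarding, blocked = spanning_tree(LINKS, root="A")
print("forwarding:", forwarding)
print("blocked:   ", blocked)
```

In this toy topology, two of the five links (B-C and C-D) end up blocked, so that capacity sits idle until a failure forces the tree to be rebuilt.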

Various networking vendors have developed mostly proprietary enhancements to STP to address these shortcomings, but these are all stopgap measures that mask network-level inefficiencies. As we move toward virtual data center and cloud computing architectures, these vendor-specific solutions will take too long to reconverge when a link or switch goes down or an application migrates from one server to another. With STP, data can take a path through several switches because a shorter path is disabled. During the time it takes for links to reconfigure and reconverge, the network is effectively unavailable. This is unacceptable for mission-critical applications, hence the move toward "lossless" Ethernet technologies, namely Converged Enhanced Ethernet (CEE) or Data Center Bridging (DCB). (More on this in a later post.)
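The longer-path effect is easy to see with a small Python sketch. The topology below is hypothetical, as is the assumption that a spanning tree rooted at switch A has left links B-C and C-D blocked; the helper name `hop_count` is also just for illustration.

```python
from collections import deque

# Hypothetical topology: all physical links, and the subset that a spanning
# tree rooted at switch A might leave forwarding (B-C and C-D are blocked).
PHYSICAL = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]
FORWARDING = [("A", "B"), ("A", "C"), ("B", "D")]

def hop_count(links, src, dst):
    """Breadth-first search: number of links on the shortest src-dst path."""
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)

    distance = {src: 0}
    queue = deque([src])
    while queue:
        switch = queue.popleft()
        if switch == dst:
            return distance[switch]
        for peer in neighbors.get(switch, []):
            if peer not in distance:
                distance[peer] = distance[switch] + 1
                queue.append(peer)
    return None  # dst is unreachable over these links

direct = hop_count(PHYSICAL, "C", "D")
via_tree = hop_count(FORWARDING, "C", "D")
print(f"C to D: {direct} hop(s) physically, {via_tree} hop(s) over the tree")
```

Switches C and D are directly cabled, yet with that link blocked a frame from C has to travel C-A-B-D, three hops instead of one.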

What is needed is to re-architect the way traffic flows so that it makes better use of all the available paths. This requires building larger and flatter physical and logical networks to accommodate applications such as Virtual Machine (VM) mobility across interconnected hosts within a single network. Brocade is working with the Internet Engineering Task Force (IETF) on a standard called Transparent Interconnection of Lots of Links (TRILL), which provides multiple paths via load splitting. TRILL will allow us to reclaim network bandwidth and improve utilization by establishing the shortest path through Layer 2 networks and spreading traffic more evenly. Because alternate paths are already in use, the network can also respond faster to failures. By addressing the limitations that STP poses, the data center network will be able to scale to meet the demands of virtualized and cloud computing environments with on-demand Layer 2 network capacity.
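As a rough illustration of shortest paths plus load splitting, the Python sketch below finds every minimum-hop path between two switches and pins each flow to one of them with a hash. This is not the TRILL protocol: TRILL relies on a link-state routing protocol to learn the topology, and actual load-splitting behavior is up to the implementation. The topology, the flow names, the CRC32 hash, and the helpers `shortest_paths` and `pick_path` are all assumptions chosen to keep the example small.

```python
from collections import deque
import zlib

# Hypothetical topology with two equal-length paths between A and D.
LINKS = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]

def shortest_paths(links, src, dst):
    """Return every minimum-hop path from src to dst."""
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)

    # Breadth-first search from src gives each switch its hop distance.
    distance = {src: 0}
    queue = deque([src])
    while queue:
        switch = queue.popleft()
        for peer in neighbors[switch]:
            if peer not in distance:
                distance[peer] = distance[switch] + 1
                queue.append(peer)

    # Walk back from dst, stepping only to switches one hop closer to src.
    def walk(switch):
        if switch == src:
            return [[src]]
        return [path + [switch]
                for peer in sorted(neighbors[switch])
                if distance[peer] == distance[switch] - 1
                for path in walk(peer)]

    return walk(dst) if dst in distance else []

def pick_path(paths, flow_id):
    """Pin a flow to one path so that its frames stay in order."""
    return paths[zlib.crc32(flow_id.encode()) % len(paths)]

paths = shortest_paths(LINKS, "A", "D")
print("equal-cost paths:", paths)
for flow in ("vm1->vm7", "vm2->vm9", "vm3->vm4"):
    print(flow, "uses", pick_path(paths, flow))
```

Both A-B-D and A-C-D carry traffic here, and if one of them fails the remaining path is already in service for the flows that hash onto it.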

In my next blog, I will explore the benefits of flattening your Layer 2 networks.