Synopsis: An overview of high-availability stacking features in Brocade FCX and Brocade ICX Series of switches including best practices when deploying stacking in Brocade’s HyperEdge Architecture for campus networks.
Brocade sets the industry standard for resiliency and high availability by enabling non-stop networking in data center and campus environments. The Brocade FCX and Brocade ICX Series switches support switch stacking with features that deliver continuous uptime for campus networks. Hitless stack failover is a high availability (HA) feature used in networks that must reduce downtime. It provides the following benefits:
This document discusses Brocade’s hitless stacking feature and best practices for achieving high-availability with stacking in the Brocade HyperEdge Architecture for campus networks.
This document is intended for network architects and designers who want a better understanding of how to design and apply mixed stacking to wired and wireless campus networks.
The following documents are valuable resources for the designer. In addition, any Brocade release notes that have been published for the FastIron operating systems should be reviewed.
Brocade® (NASDAQ: BRCD) networking solutions help the world’s leading organizations transition smoothly to a world where applications and information reside anywhere. This vision is designed to deliver key business benefits such as unmatched simplicity, non-stop networking, application optimization, and investment protection.
Innovative Ethernet and storage networking solutions for data center, campus, and service provider networks help reduce complexity and cost while enabling virtualization and cloud computing to increase business agility.
To help ensure a complete solution, Brocade partners with world-class IT companies and provides comprehensive education, support, and professional services offerings. (www.brocade.com)
The content in this guide was developed by the following key contributors.
2013-04-08 1.0 Initial Release
2013-05-17 1.1 Corrected missing content
Enterprise campus network wiring closets typically contain stacks of Ethernet switches. Stacking functionality enables the linking of small-form-factor switches through short proprietary copper cables connected to dedicated stacking ports, or through copper or optical high-speed Ethernet links. All Brocade switches use Ethernet as the stacking medium. The stack of switches then appears and behaves as a single logical switch, simplifying management and increasing resiliency. When a new switch joins the stack, it automatically inherits the operating software and configuration of the stack without requiring manual intervention.
Stacking switches provides equal value at the edge of data center networks and in campus networks. The main difference is that the switches are not physically stacked on top of each other. Instead, longer cables logically unify the switches at the top of each server rack. For example, a row of Top-of-Rack (ToR) switches can appear as a single logical switch, significantly reducing management overhead of the data center access layer.
Brocade stackable switches are linked together using either shared or dedicated stacking ports depending on model.
Switches can be connected together in a variety of stack topologies. The most common are the “daisy-chained ring” and the “braided ring”, in which alternating switches are connected to each other. Brocade’s stacking technology also supports the use of 10 Gigabit Ethernet (GbE) XFP or SFP+ fiber optic ports, which allows the switches participating in a stack to be situated farther apart from each other. The following member types make up a stack:
The LEDs on the front of the switch make it easy to identify members of the stack. On FCX switches the LED is labeled AS and on ICX switches it is labeled MS. The stacking configuration is indicated as follows:
NOTE: If a Brocade switch is configured as a standalone unit, meaning the stacking protocol is disabled, it will not function as a member of a stack and will operate independently even if it is connected to other switches in a stack.
Common Stacking Topologies
For full details about stack topologies with recommended cable layouts, please refer to the hardware installation guide for each switch type.
Because Brocade switches use Ethernet for the inter-switch stack connections, the deployment options are greatly increased. If standard copper stacking cables are used, the inter-switch connections can be up to 5 meters long, which is usually sufficient for locally distributed stacks such as in ToR applications. For broader distribution, fiber-optic cables should be used. This allows a stack to be deployed across multiple physical locations, such as the wiring closets of an office building. The table below shows the approved optics and stacking distance combinations.
Connectivity Options for Stacking with Brocade FCX and ICX Series Switches
By using stack connections to link distributed switches together rather than standard inter-switch links with Layer 2 STP or Layer 3 routing, several significant advantages can be realized:
Hitless stacking is supported on Brocade FCX and ICX Series switches. It is a high-availability feature set that ensures sub-second or no loss of data traffic during the following events:
During such events, the Standby Controller takes over the active role and the system continues to forward traffic seamlessly, as if no failure or topology change has occurred.
The following hitless stacking features are supported:
Hardware or software failures can take a device offline and potentially disrupt the entire network until the issue is resolved. Hitless failover reduces device downtime by utilizing active and standby controllers (switches) within a switch stack. When an active (master) controller fails unexpectedly, the standby controller automatically takes over and becomes the active controller. This failover process is “hitless,” meaning that it occurs with zero downtime and no interruption of L2/L3 network services. Furthermore, in the event a switch needs to be taken offline for maintenance or repair, this process can be performed manually via hitless switchover.
Hitless recovery is also triggered in the event of a stack link failure, for example if a stack cable was removed or accidentally damaged.
To understand how hitless failover occurs within a Brocade switch stack it is important to consider:
Switchover is a planned change of assignment of the Active and Standby controllers in a stack. It is managed manually using the CLI, or automatically, without reloading the stack configuration and without any packet loss to the services and protocols within the stack.
Failover is the automatic, or forced, switchover between the Active and Standby controllers. The failure or abnormal termination of the Active controller triggers hitless failover. In the event of a failover, the Active controller abruptly leaves the stack and the Standby controller immediately assumes the active role. Unlike a switchover, a failover generally happens without warning and may cause sub-second packet loss, because packets traversing the stacking link at the time of failure may be lost.
The following events are supported with hitless stacking:
Ensuring that the Active and Standby controllers are synchronized is a critical component of hitless stacking. Synchronization is an integral part of Brocade’s stacking technology and is automated and transparent. For the Standby controller to take over immediately, the data and control planes must be synchronized with the Active controller. The Standby controller stores the necessary information for assuming control in its database, including spanning tree states, route information, Media Access Control (MAC) address tables, Virtual LANs (VLANs), etc.
When a stack is created and the stack member switches reboot, the Active controller assigns a Standby controller within 60 seconds. The Active controller configuration is then copied to the Standby controller through the baseline synchronization process, which is completed within 70 seconds.
After the baseline synchronization is complete, the Standby controller is ready for hitless failover. The Active and Standby controllers remain synchronized in real time through dynamic synchronization. As a result, switch stacks operating with synchronized Active and Standby controllers are able to maintain system integrity when a hitless failover occurs.
The following processes ensure synchronization of Active and Standby controllers:
After the controllers are synchronized, any failure of the Active controller triggers a dynamic failover to the Standby. Typical events that will trigger dynamic failover include:
When a hitless failover event occurs, management control is transferred from the Active controller to the Standby controller with zero downtime and no Layer 2 / Layer 3 network service interruption.
In a Brocade switch stack, the stack priority number influences the role and status of each switch in the stack: active, standby, or member. If the priority number is equal, stack status is determined by the lowest Unit ID number. Hitless failover uses these determining factors when assigning and reassigning stack status during the failover process.
In the following example, hitless failover is active in a three switch stack when a failover event occurs.
Note: In order to achieve hitless failover in a stack containing only two switches you must have the same stack priority set on both devices. If you want to assign the same priority to the Active and Standby Controllers, you must do so after the stack is formed. This prevents the intended Standby Controller from becoming the Active Controller during stack construction.
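The role assignment rule described above (stack priority first, lowest Unit ID as the tie-breaker) can be illustrated with a small model. This is a minimal sketch and not the FastIron election algorithm: it assumes that a higher priority value is preferred, it ignores any other factors the real election may consider, and the names `Unit`, `assign_roles`, and `fail_over`, as well as the priority values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    unit_id: int
    priority: int  # assumption: a higher value is preferred


def _rank(units):
    # Highest priority first; equal priorities fall back to the lowest Unit ID.
    return sorted(units, key=lambda u: (-u.priority, u.unit_id))


def assign_roles(units):
    """Return {unit_id: role} for a simplified stack election."""
    role_names = ["active", "standby"]
    return {
        unit.unit_id: role_names[i] if i < 2 else "member"
        for i, unit in enumerate(_rank(units))
    }


def fail_over(units, roles):
    """Return the roles after the Active controller leaves the stack."""
    standby = next(u for u in units if roles[u.unit_id] == "standby")
    others = [u for u in units if roles[u.unit_id] == "member"]
    new_roles = {standby.unit_id: "active"}  # the Standby takes over
    for i, unit in enumerate(_rank(others)):
        new_roles[unit.unit_id] = "standby" if i == 0 else "member"
    return new_roles


# Hypothetical three-switch stack: units 1 and 2 share the same priority.
stack = [Unit(1, 128), Unit(2, 128), Unit(3, 0)]
roles = assign_roles(stack)
after_failover = fail_over(stack, roles)
```

In this model, units 1 and 2 tie on priority, so unit 1 becomes the Active controller because it has the lower Unit ID. When unit 1 fails, unit 2 is promoted and unit 3 becomes the new Standby.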
The Active controller uses its MAC address as the MAC address for the entire stack. This ensures the stack is recognized by other network elements as a single logical switch, simplifying management and increasing resiliency. The stack MAC address is automatically generated and is the MAC address of the first port of the Active controller. This ensures a consistent MAC address across stack reboots and prevents topology changes that would result from protocol enable, disable, and configuration changes. The MAC address of the Active controller is the Bridge ID for Layer 2 protocols.
If the Active controller is disconnected from the rest of the stack, the MAC address of the stack changes based on the election of a new Active controller. This causes the forwarding database to be reset, creating a topology change event and a minor network outage.
Even a minor outage can be significant for critical hosts and applications such as IP phones and VDI clients. An outage can be avoided by using the “Stack persistent-mac” command to configure the stack to continue using the MAC address of the original Active Controller. The administrator can then decide when an outage is acceptable and reset the forwarding database manually to eliminate any impact on end-users.
An alternative recommendation is to use the “stack mac” command to manually set a MAC address for the stack that continues to be used regardless of which switch is currently selected as the Active controller. With a manually set MAC address, stack management and switch membership changes never trigger a reset of the forwarding database, avoiding the outage caused by topology changes.
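The three stack MAC behaviors described above (default, persistent, and manually configured) can be compared in a short model. This is a sketch of the behavior as described in this section, not of the switch implementation: the class `StackMac`, its parameters, and the MAC addresses are hypothetical, and the persistent option is modeled as a simple on/off flag.

```python
class StackMac:
    """Simplified model of which MAC address a stack presents to the network."""

    def __init__(self, active_mac, persistent=False, configured_mac=None):
        self.persistent = persistent          # keep the original Active's MAC
        self.configured_mac = configured_mac  # manually set stack MAC, if any
        # A manually set MAC wins; otherwise use the Active controller's MAC.
        self.stack_mac = configured_mac or active_mac

    def elect_new_active(self, new_active_mac):
        """Apply an Active controller change; return True if the stack MAC changed."""
        if self.configured_mac or self.persistent:
            return False  # MAC is retained, so no forwarding database reset
        changed = new_active_mac != self.stack_mac
        self.stack_mac = new_active_mac
        return changed


# Hypothetical addresses for the old and new Active controllers.
OLD, NEW, MANUAL = "0000.0000.0001", "0000.0000.0002", "0000.0000.00aa"

default = StackMac(OLD)
default_changed = default.elect_new_active(NEW)

persistent = StackMac(OLD, persistent=True)
persistent_changed = persistent.elect_new_active(NEW)

manual = StackMac(OLD, configured_mac=MANUAL)
manual_changed = manual.elect_new_active(NEW)
```

Only the default case reports a MAC change, which corresponds to the forwarding database reset and topology change event described above. The persistent and manual cases keep the same address across the change of Active controller.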
Configuring a Brocade switch stack using Brocade’s stacking technology is a simple process.
Refer to the FastIron Configuration Guide for more details about stacking and available configuration options.
The replacement and failback process of a failed controller is simple and hitless. The replacement controller must have the same model number as the failed controller, and the device must be running a clean configuration on the same version of code that the stack is running. After the replacement controller is added to the stack and brought online, it re-joins the stack in place of the previously failed controller.
This automated process works in the following manner:
To function correctly, every switch in a stack must use the same version of FastIron software. This ensures that all features and functions are consistent across all devices. Within a Brocade switch stack, the Auto Image Copy function is enabled by default and ensures that every stack member runs the same version of software. The master image is taken from the Active Controller and is automatically copied to any switch in the stack that is not loaded with the same version.
For maximum flexibility, Auto Image Copy can be disabled. If this is done, any switch added to a stack that is loaded with different software from the Active Controller will not function and will automatically have all its ports disabled. To bring the switch into the stack, its software must be updated manually.
The Auto Image Copy feature ensures that all units in a stack are running the same flash image following events such as the addition of a switch to a stack, the replacement of a failed device, or a stack merge. Brocade recommends that it be left in its default configuration.
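The Auto Image Copy behavior described above reduces to a simple decision when a switch joins a stack. The following sketch models only that decision. The function name `join_stack` and the version labels are hypothetical, and the model treats software versions as opaque strings that either match or do not.

```python
def join_stack(active_version, new_unit_version, auto_image_copy=True):
    """Return (version_after_join, ports_enabled) for a switch joining a stack."""
    if new_unit_version == active_version:
        # Versions already match: the unit joins as-is.
        return new_unit_version, True
    if auto_image_copy:
        # Default behavior: the Active Controller's image is copied to the unit.
        return active_version, True
    # Auto Image Copy disabled: the mismatched unit keeps its image and its
    # ports are disabled until the software is updated manually.
    return new_unit_version, False


matching = join_stack("version-A", "version-A")
copied = join_stack("version-A", "version-B")
blocked = join_stack("version-A", "version-B", auto_image_copy=False)
```

Here `copied` ends up on the Active Controller's version with its ports enabled, while `blocked` keeps its own version with ports disabled, which is why leaving the feature at its default is the simpler operational choice.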