Mainframe Solutions

A more reliable, available, scalable, simpler, higher performance (TS7700) grid

by Dr.Steve.Guendert on 07-10-2012 10:52 PM - last edited on 10-28-2013 09:09 PM by bcm1

Prior to yours truly taking a short vacation, the community in which I live (Gahanna, OH) was struck by a severe storm on June 29. The storm was quite destructive, leaving many without electricity for several days. Gahanna made the national news. Fortunately, there were very few deaths or injuries. However, the damage was extensive, particularly to the electric power transmission system/grid. Many were without electricity for over a week, and without air conditioning in the middle of a heat wave with temperatures of 100+ degrees F. The media started talking about how unreliable our existing electrical grid was. In hot weather, rolling brownouts are common because our power grid is built on outdated technology, with many old power poles and above-ground wiring that can't handle the workload, or the wind in a storm. Power outages are more common and of longer duration than they were 5 years ago. In short, our electrical grid, built on outdated transmission components, no longer had the reliability, availability, serviceability (RAS) and performance required by customers.

 

Kind of sounds like a mainframe customer's state-of-the-art TS7700 Grid business continuity solution that still uses now-outdated Catalyst 6500 series switches/routers for the IP transmission of the cross-site data replication. Let's talk about how Brocade can help you build a more reliable, available, scalable and higher performance TS7700 Grid solution using our MLXe router. And oh, by the way, I think you will like the simplicity of the solution, particularly in managing the hardware components.

 

Background: The IBM Virtualization Engine TS7700 family is the latest IBM virtual tape technology. It is a follow-on to the IBM Virtual Tape Server (VTS), which was initially introduced to the mainframe market in 1997. The IBM VTS also had peer-to-peer (PtP) VTS capabilities. PtP VTS was a multi-site capable business continuity/disaster recovery (BC/DR) solution. In a nutshell, PtP VTS was to tape what PPRC was to DASD. PtP VTS data transmission was originally via ESCON, then FICON, and finally TCP/IP. Today, the TS7700 offers similar functionality, known as a TS7700 Grid. A TS7700 Grid refers to two or more physically separate TS7700 clusters connected to one another by means of a customer-supplied TCP/IP network. The TCP/IP infrastructure connecting a TS7700 Grid is known as the Grid Network. The grid configuration is used to form a disaster recovery solution and provide remote logical volume replication. The clusters in a TS7700 Grid can be, but do not need to be, geographically dispersed. In a multiple-cluster grid configuration, two TS7700 clusters are often located within 100 km of one another, while the remaining clusters can be located more than 1,000 km away. This provides a highly available and redundant regional solution while also providing a remote disaster recovery solution outside of the region. For a more detailed, extensive discussion of the TS7700 and TS7700 Grid, please reference this IBM Redbook.

 

The TS7700 Virtualization Engine uses TCP/IP to move data between clusters. Bandwidth is a key factor that affects throughput for the TS7700 Virtualization Engine. Other key factors that can affect throughput include:

 

  1) Latency between the TS7700 Virtualization Engines

  2) Network switch capabilities

  3) Network efficiency (packet loss, packet sequencing, and bit error rates)

  4) Inter-switch link capabilities (flow control, buffering, and performance)

  5) Flow control to pace the data from the TS7700 Virtualization Engines

 

The TS7700 Virtualization Engine attempts to drive the network links at the full line rate, which may exceed the capabilities of the network infrastructure. The TS7700 Virtualization Engine supports IP flow control frames so that the network can pace the rate at which the TS7700 Virtualization Engine attempts to drive it. The best performance is achieved when the TS7700 Virtualization Engine is able to match the capabilities of the underlying network, resulting in fewer dropped packets. When the system exceeds the network capabilities, packets are lost. This causes TCP to stop, resync, and resend data, resulting in a much less efficient use of the network. In summary, latency between the sites is the primary factor. However, packet loss due to bit error rates or insufficient network capabilities can cause TCP to resend data, thus multiplying the effect of the latency.
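To get a rough feel for how latency and packet loss multiply one another, here is a minimal back-of-the-envelope sketch in Python. It is my own illustration, not anything taken from IBM or Brocade sizing tools. It assumes about 5 microseconds of one-way propagation delay per km of fiber, a 1460-byte TCP segment, and the well-known Mathis approximation for a single standard TCP connection (throughput is roughly MSS/RTT times sqrt(3/2)/sqrt(loss rate)). The function names, distances and loss rates are made up for the example. A real grid link typically carries multiple connections and may use a different TCP stack, so treat the numbers as directional only.

```python
import math

# Assumption: roughly 5 microseconds of one-way propagation delay per km of fiber.
FIBER_US_PER_KM = 5.0


def round_trip_ms(distance_km: float) -> float:
    """Propagation-only round-trip time in milliseconds (ignores equipment delay)."""
    return 2 * distance_km * FIBER_US_PER_KM / 1000.0


def mathis_throughput_mbps(rtt_ms: float, loss_rate: float, mss_bytes: int = 1460) -> float:
    """Approximate upper bound for ONE standard TCP connection (Mathis et al.).

    throughput ~= (MSS / RTT) * sqrt(3/2) / sqrt(loss_rate)
    """
    rtt_s = rtt_ms / 1000.0
    bytes_per_second = (mss_bytes / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)
    return bytes_per_second * 8 / 1e6


# Hypothetical distances (regional vs. remote cluster) and packet loss rates.
for distance_km in (100, 1000):
    rtt = round_trip_ms(distance_km)
    for loss in (1e-6, 1e-4):
        mbps = mathis_throughput_mbps(rtt, loss)
        print(f"{distance_km:>5} km  RTT {rtt:5.1f} ms  loss {loss:.0e}  ~{mbps:10.1f} Mbit/s per connection")
```

The bound scales with 1/RTT and with 1/sqrt(loss rate). Ten times the distance costs a factor of ten, and a hundred times the loss rate costs another factor of ten on top of that, which is the "multiplying the effect of the latency" point made above.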

 

Brocade's role in TS7700 Grid solutions

 

IBM customers who have implemented, or are considering implementing, a TS7700 Grid solution for their mainframe environment are typically very concerned about reliability, availability, serviceability (RAS) and performance. That is the primary reason why the vast majority of these same IBM customers have implemented Brocade FICON directors, such as the Brocade DCX 8510, for their mainframe storage connectivity. Also, many IBM TS7700 Grid customers previously used IBM PtP VTS. Before PtP VTS used IP-based replication, it used ESCON or FICON, and hence the end user required channel extension technology. This channel extension technology for PtP VTS was typically a Brocade device, such as the Brocade USD-X.

 

Fast forward to the present and the current TS7700 Grid solution. Most customers are utilizing a switch/router for the TCP/IP-based data replication that is just as old (and now just as outdated) as the USD-X: the Cisco Catalyst 6500 series. Fortunately, as many end users are quickly finding out, Brocade offers an IP switch/router, the Brocade MLXe, that offers better RAS and performance than these old Catalyst 6500s. The Brocade MLXe offers such a high level of performance that the most data-intensive organization in the world, CERN (European Organization for Nuclear Research), recently standardized on it. This has resonated with many mainframe customers who are starting to implement the MLXe for their TS7700 Grid solution. Let's take a look at an example below.

 

[Figure: 7.png, diagram of the "after" environment]

 

This diagram represents the "after" environment. The "before" environment was much more complicated, with lower performing, older technology hardware. The "before" environment consisted of IBM System z9 and System z10 mainframes, older DASD and VTS. The storage and extension network consisted of more, smaller Brocade M6140 FICON directors, standalone legacy CNT (Brocade) Edge 3000 FCIP extension switches, Cisco Catalyst 6509s, and DWDM hardware from yet another vendor.

 

What is not obvious from the above diagram is that the DCX 8510 FICON directors also have the Brocade FX8-24 FCIP extension blade in one or more slots. So in this solution, the customer greatly simplified their technical support in going from 3 vendors to 1 (Brocade). They consolidated hardware footprint and lowered operating costs by moving to the newer, more energy efficient hardware. They improved the end-to-end performance of the entire environment, and in the process improved the efficiency of cross-site network usage. As network bandwidth is the most expensive Total Cost of Ownership (TCO) component in a DR/BC solution, they can expect additional savings on their cross-site network bandwidth requirements.

 

Last, but certainly not least, this new solution allowed the customer to go from using four management platforms in the "before" environment to one management platform, as you can see noted in the diagram. Brocade Network Advisor (BNA) allows you to manage the Brocade FICON directors/switches, the FCIP extension blades/switches, and the Brocade IP switches and routers all from a single management tool, a "single pane of glass". This makes it far simpler to manage the day-to-day operations of this environment, not to mention coordinate management software upgrades.

 

All in all, a nice high performance, TCO saving, clean and simple solution to protect your data. 

 

Watch for my next blog post, when we'll take this solution a step further and discuss incorporating DASD replication and extension into it. As always, thanks for reading. Feel free to comment or ask questions, or follow me and my mainframe connectivity tweets on Twitter. My handle is @DrSteveGuendert.

 

Dr. Steve