
Mainframe Solutions

The Performance Information Gap Paradox

by Dr. Steve Guendert on 02-01-2012 02:53 PM (8,701 Views)

Happy February?  For a moment I thought perhaps I had missed two months; I never would have thought it would be 60 degrees (Fahrenheit) in Columbus, OH on February 1st. Anyway....

In my last blog post, on 11 January 2012, I wrote about "Performance, Performance, Performance."  Today I am going to continue on that theme.

System z I/O technology has made significant advances over the past five years, from the mainframe itself (processors, STIs, buses, channels), to the FICON directors, to the control unit.  Speeds and feeds get continually faster, and new technologies have emerged such as MIDAW, HyperPAV, and z High Performance FICON (zHPF). There are even more changes on the horizon.  Changes have also occurred in SMF/RMF, and understanding them has become crucial to managing the performance of your 2012 mainframe I/O and storage environment.

Now I could be wrong (please don't tell Mrs. Guendert I admitted that), but it seems to me that alongside the significant advances in mainframe and mainframe storage I/O technology over the past five or so years, we have seen a diminished understanding of that technology and how to use it.  More importantly, we have seen a diminished understanding of its performance and performance management.  I do not know whether this is due to the decreased emphasis on performance and capacity planning in Computer Science curricula, leaving new professionals entering the field without the training.  Perhaps less importance is being placed on these topics because of decreasing hardware prices.  Or maybe those of us in the vendor world are not doing as good a job as we could of educating customers on these topics as they pertain to our hardware and software.  More likely it is a combination of all of the above.

The reason I bring this up is that I am getting involved more frequently in performance troubleshooting scenarios.  As the component in the middle, a FICON director typically gets the blame for a performance problem in a mainframe environment.  If it's a cascaded FICON environment, the first thing blamed is typically a buffer credit issue with a director port.  Interestingly enough, 90% of the time the FICON director is not the root cause of the problem (and it usually is not buffer credits either).  Sometimes there is nothing wrong with any of the hardware.  Too often, good troubleshooting and problem solving simply does not occur.  Finally, end users and OEMs often lack an understanding of the tools at their disposal, or worse yet, of the concepts behind the tools.
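To see why buffer credits are rarely the culprit except over real distance, a back-of-the-envelope estimate helps.  The sketch below is my own simplification, not a sizing tool: it assumes full-size frames (~2112 bytes), roughly 5 microseconds per kilometer of one-way latency in fiber, and the usual 8b/10b rule of thumb that an N-Gbps link moves about N*100 bytes per microsecond.

```python
import math

def credits_needed(distance_km, link_gbps, frame_bytes=2112):
    """Rough estimate of buffer-to-buffer credits needed to keep a
    FICON/FC link fully utilized over a given distance.

    Assumptions (simplified): ~5 us/km one-way in fiber, full-size
    frames, 8b/10b encoding (N Gbps ~ N*100 bytes per microsecond).
    """
    rtt_us = 2 * 5.0 * distance_km            # round-trip light time
    bytes_per_us = link_gbps * 100.0          # usable payload rate
    frame_time_us = frame_bytes / bytes_per_us
    # One credit per frame "in flight" during the round trip, plus one
    # for the frame being serialized.
    return math.ceil(rtt_us / frame_time_us) + 1

# At 8 Gbps this works out to roughly 4 credits per km -- so within a
# data center (a km or two) the default port credits are usually ample.
```

On these assumptions, a 10 km cascaded ISL at 8 Gbps needs on the order of 40 credits, while a link inside the data center needs only a handful, which is why genuine credit starvation shows up almost exclusively in long-distance configurations.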

So you're saying, "Great, Steve, so what are you going to do about it?  How can we learn more?"  Glad you asked.

1) First, my good friend and IBMer Dennis Ng and I are putting together a two-day education seminar focused solely on FICON performance and performance management.  Some of you may remember I used to do a similar one-day workshop on FICON performance.  This will be a new and enhanced offering. We will be going to China later this month to roll out the first of these, and we're looking to do them worldwide.  If you are interested, please email me.

2) If you attend SHARE next month, make certain you attend my friend and Brocade colleague David Lytle's session on FICON performance.

3) Dave Lytle and I also teach the Brocade FICON certification course worldwide, at our customer and OEM partner locations.  We do this free of charge, and a good amount of the course is spent on performance-related topics. Again, email me if you are interested in learning more about the course.

4) Consider looking into joining the Computer Measurement Group (CMG).  They have regional meetings in many cities.

5) I am starting to work on a series of articles/papers that will briefly review the System z I/O technology advances of the past five years, discuss what you need to know about the SMF records listed below, and show how to use them together to successfully resolve performance issues in your I/O environment.  I will be developing presentations to accompany each article.

1) FICON Director Activity (SMF 74-7)

2) Channel Path Activity (SMF 73)

3) I/O Queuing Activity (SMF 78-3)

4) Enterprise Disk Systems Statistics (SMF 74-8)

5) Device Activity (SMF 74-1)
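If you are scripting against dumped SMF data, the record list above boils down to a small lookup keyed by record type and subtype.  A minimal sketch (names taken straight from the list; the helper function is my own, purely illustrative):

```python
# Lookup of the RMF/SMF records discussed in this series,
# keyed by (record type, subtype).
SMF_RECORDS = {
    (74, 7): "FICON Director Activity",
    (73, None): "Channel Path Activity",       # type 73 has no subtype here
    (78, 3): "I/O Queuing Activity",
    (74, 8): "Enterprise Disk Systems Statistics",
    (74, 1): "Device Activity",
}

def describe(rtype, subtype=None):
    """Return a human-readable name for an SMF record type/subtype."""
    return SMF_RECORDS.get((rtype, subtype), "unknown record")
```

Keeping a table like this next to your reduction jobs makes reports self-documenting when you correlate, say, 74-7 director statistics against 73 channel path activity for the same interval.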

That's all for this time.  Due to the phenomenal readership of our blog (thank you!), we are launching a full-blown Brocade Mainframe Solutions Community page to go with it.  That will be a great place to get more of this information; watch for it on February 17th.  Please also consider becoming a member of our Brocade Mainframe Solutions Facebook Page.  We launched it just two weeks ago and we're already having lots of great conversations there.  Go to the page and ask to become a member; yours truly is the admin. Finally, please consider following me on Twitter.


         Dr. Steve

         Twitter:  @DrSteveGuendert

on 02-01-2012 04:26 PM

Nice piece, Steve. Stuff I don't see much of. Thanks. --Alan

on 02-01-2012 11:16 PM

Hi Steve,

Great topic with some very valid points.

It's amusing how often BB credits get raised as an issue and how rarely (except in long-distance solutions) they are actually the problem.

Love your work on the blog, keep the articles coming.

on 02-03-2012 11:05 AM

Hey Dr. Steve,

Always nice to see someone highlighting the need for more understanding of z/OS I/O performance! Without access to the data, the rest of the IT infrastructure doesn't really matter.   Thank you for highlighting this need.

I think the education problem is a combination of people being so busy (staff levels have decreased a lot in the last 15 years) and the fact that the I/O bottlenecks have shifted locations during that timeframe while the reporting/measurements have not.   Queuing doesn't really happen on the host anymore thanks to FICON vs. ESCON, PAV, Multiple Allegiance, etc.

Queuing delays still happen, of course, but the cause is now often over-utilization of components inside the storage system - either the host adapters on the front end, or the actual physical disks on the back end not being able to keep up with the staging/destaging requests to/from cache on the storage array.

To become proactive in avoiding I/O problems, instead of reactive after a problem has already affected production applications, you have to monitor the utilization levels of the components where the bottlenecks originate.   That is what IntelliMagic does, either through software or now on a services basis as well.
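[Editorially adding an illustration of the point above.] The reason utilization of the bottleneck component is the thing to watch can be shown with textbook queuing theory.  The sketch below uses the simple M/M/1 relation R = S / (1 - U) purely as an illustration - real disk subsystems are far more complex - but it captures why response time explodes as a host adapter or back-end disk approaches saturation:

```python
def response_time(service_time_ms, utilization):
    """Textbook M/M/1 approximation: R = S / (1 - U).

    service_time_ms: time to serve one request with no queuing.
    utilization:     fraction of time the component is busy, 0 <= U < 1.
    Response time grows without bound as U approaches 1, which is why
    monitoring component utilization is the proactive signal.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)
```

For example, a component with a 2 ms service time delivers about 4 ms responses at 50% busy, but roughly 20 ms at 90% busy - a 5x degradation from the same hardware, visible in utilization long before users complain.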

So anyway, three other resources I'd like to mention for your readers that are interested in this area:

1.  A five-day class on z/OS I/O Architecture and Performance Analysis in April in Florida, taught by Dr. Gilbert Houtekamer, co-author of MVS I/O Subsystems.    The class has additional new content and is now five days.   Not free (but we can make a special alumni rate available).   More info, including a detailed agenda, here:

2.  You mention a session at SHARE in Atlanta in March 2012.  A new IntelliMagic employee whom some people will recognize, Lee LaFrese, will give a session there on Thursday entitled "Best Practices for Mainframe I/O SLA and Efficiency Optimization."

3.  There is also a z/OS Storage Performance Management group on LinkedIn that now has over 300 members.   A good place to ask questions, etc.

Best regards,

Brent Phillips

on 02-03-2012 12:03 PM

Thank you Alan!

on 02-06-2012 01:33 AM

It is a quite frequent issue that a mainframe is slow and not performing properly, but your answers to all these questions are much appreciated. Thank you for your wonderful suggestions.

on 01-26-2013 07:32 AM

Very useful information. I was very pleased. Thanks.