SAN Health Utility

SAN Health - Understanding the Excel Report

by Moderator 04-23-2012 01:20 PM - edited 08-10-2016 11:49 AM

For a list of SAN Health topics, please see SAN Health Help


 

The SAN Health spreadsheet contains all the diagnostic data from the audit.  It is organized into a summary of devices, a SAN summary, a fabric summary, and switch summaries, followed by performance graphs and data.  (The performance graphs are available for Brocade FOS products only.)  Most of the terms and statistics in the Excel report will be familiar to experienced SAN administrators and do not require SAN Health-specific explanations.  For the concepts that may need more explanation, the most common questions are answered below.

 

What are Hanging Zones?

A zone entry is considered hanging when it is defined in the zone configuration set but one or more of its members are not physically connected to the fabric formed by the switches that SAN Health audited. Hanging zones are flagged to assist in the cleanup of legacy zones whose devices no longer exist on the fabric. They are also flagged to bring to your attention devices that may have dropped off the fabric unexpectedly.

Another way to explain hanging zones (or hanging configs, or hanging aliases) is to consider them the opposite of an unzoned device.  An unzoned device is one that exists on the fabric but is not defined in any zone.  A hanging zone is the reverse: a zone that defines a device that is not actually present in the fabric.
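The relationship between hanging zones and unzoned devices can be sketched as two set differences. This is only an illustration of the idea, not SAN Health's actual logic; the zone names, WWNs, and function names below are hypothetical.

```python
def find_hanging_zones(zones, logged_in):
    """Return names of zones with at least one member absent from the fabric."""
    return sorted(
        name for name, members in zones.items()
        if set(members) - set(logged_in)
    )


def find_unzoned_devices(zones, logged_in):
    """Return devices present on the fabric that no zone references."""
    zoned = set()
    for members in zones.values():
        zoned.update(members)
    return sorted(set(logged_in) - zoned)


# Hypothetical data: zone name -> member WWNs, plus the WWNs seen on the fabric.
zones = {
    "zone_db01": ["10:00:00:00:c9:aa:aa:01", "50:06:01:60:bb:bb:00:01"],
    "zone_old_host": ["10:00:00:00:c9:aa:aa:99", "50:06:01:60:bb:bb:00:01"],
}
logged_in = {
    "10:00:00:00:c9:aa:aa:01",
    "50:06:01:60:bb:bb:00:01",
    "10:00:00:00:c9:aa:aa:02",
}

hanging = find_hanging_zones(zones, logged_in)    # zones with a missing member
unzoned = find_unzoned_devices(zones, logged_in)  # devices in no zone
```

In this made-up data, zone_old_host is hanging because one of its members is not on the fabric, while the device ending in :02 is unzoned because no zone mentions it.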

 

How are the names of the devices determined in SAN Health?

SAN Health attempts to map the Brocade Zone Alias name or the M-Series Port name to the actual device, and then uses this name in the report and Visio diagram. If no Zone Aliases are present, you have the option of creating a comma-separated-value data file containing this information and loading this CSV file into SAN Health.

 

Why are some of the values in my report labeled "N/A" or "Not Reported"?

The Not Applicable (N/A) or Not Reported value is placed in the report for values that are genuinely not applicable (such as slot information for a non-chassis-based system) and for values that cannot be retrieved at the firmware level running on the switch.  For devices such as HBAs, you may see the firmware or driver levels reported for some of your devices but not for others.  This is because many older, and some newer, HBAs don't report this data to the name server when registering with the fabric.  SAN Health only audits switches; it doesn't communicate directly with end devices, so it can only report on information that the switches have.

 

How do I create custom graphs using the captured performance data?

(Capturing performance data is only possible on FOS-based Brocade switches.  M-Series and Cisco switches do not present reliable data over telnet and SSH.)  First, some background information: all performance data collected by SAN Health is stored in the report on hidden Excel worksheets. By default, a graph is created for each switch port in the Switch Details section, all based on this collected data. It is important to realize that a historical performance graph will be generated for EACH switch in the report. Thus, if you have 20 switches with captured performance data, there will be 20 total graphs. At times, it might make sense to create your own custom graph. The following procedure provides an example of how to do this:

 

From the Excel menu, select Format -> Sheet -> Unhide… The Unhide window appears and displays all of the hidden sheets as shown.

 

[Figure: the Unhide window listing the hidden sheets]

 

Select the sheet to unhide. The names of the hidden performance data sheets begin with D_ and have the switch name included as part of the sheet name as shown above.

 

After you have clicked on "OK", the hidden sheet displays as shown in the partial output below. The data presented is the captured output from portPerfShow and is in megabytes.

[Figure: partial view of an unhidden performance data sheet]

 

On this sheet, select the columns you want to graph.

 

Next, run the Chart Wizard from the Standard Toolbar under the Excel main menu. As the wizard progresses, you will select a chart type, label your X and Y axes, define custom properties, and so on. Many options exist within Excel, providing highly flexible methods for viewing the data.

 

Tip: Experiment with various page sizes and object locations until you get the detail you need. If you have questions, ask an Excel expert, use the built-in Help, or consult Microsoft’s web site.
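If you would rather pull the hidden performance data out of the report with a script than with the Chart Wizard, something like the following might work. It is a rough sketch only: it assumes the report has been saved in .xlsx format and that the third-party openpyxl package is installed, and the function names and sample sheet names are hypothetical. The only detail taken from the report itself is the D_ prefix on the hidden performance sheets.

```python
def performance_sheet_names(sheet_names):
    """Pick out the sheets whose names start with the D_ prefix."""
    return [name for name in sheet_names if name.startswith("D_")]


def load_performance_rows(report_path):
    """Read every D_ sheet of a report into lists of row tuples.

    Assumes openpyxl is installed and the report is an .xlsx file;
    both are assumptions of this sketch. Hidden sheets can be read
    without unhiding them first.
    """
    from openpyxl import load_workbook  # third-party, imported lazily

    workbook = load_workbook(report_path, read_only=True, data_only=True)
    data = {}
    for name in performance_sheet_names(workbook.sheetnames):
        sheet = workbook[name]
        data[name] = list(sheet.iter_rows(values_only=True))
    workbook.close()
    return data


# Hypothetical sheet names, just to show the filtering step.
names = ["Table of Contents", "D_switch01", "SAN Summary", "D_switch02"]
selected = performance_sheet_names(names)
```

The rows returned by load_performance_rows can then be handed to whatever plotting tool you prefer. The column layout of the D_ sheets is not described here, so inspect one sheet in Excel before relying on specific columns.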

 

Excel prints the report one sheet at a time. What am I doing wrong?

Excel defaults to printing at the individual worksheet level. The SAN Health report is compiled as a complete Excel workbook with the Table of Contents worksheet using page numbers that are only correct when the entire workbook is printed. When printing a hard copy, ensure that you select the "Entire Workbook" option.

 

[Figure: Excel print options with "Entire Workbook" selected]

 

 

I see several items highlighted throughout the report.  What are the thresholds used when determining highlights?

Here are the highlighting ratios:

     Host : Disk

          Red 60:1

          Orange 40:1

          Blue 20:1

     Non-ISL : ISL

          Red 100:1

          Orange 50:1

          Blue 30:1

     Device : ISL

          Red 100:1

          Orange 50:1

          Blue 30:1

 

SAN Health considers anything that is not a switch, ISL, or empty port to be a "Device". For the "Non-ISL : ISL" ratio, a Non-ISL is any port that is not an ISL, including device ports and empty ports.

 

These highlighted values don't necessarily mean that there is a problem associated with the ratio.  They are simply there to draw your attention to an area that you might want to investigate to understand whether it is by design.  Please check with your SAN designer or current SAN support provider for clarification.
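The ratio thresholds listed above can be expressed as a small lookup. This is a sketch for reasoning about your own numbers, not the code SAN Health runs. It assumes that a ratio exactly equal to a threshold already earns that highlight, which the list does not state, and the function and key names are hypothetical.

```python
# Thresholds from the list above, as (minimum ratio, highlight), highest first.
RATIO_THRESHOLDS = {
    "host_to_disk": [(60, "red"), (40, "orange"), (20, "blue")],
    "non_isl_to_isl": [(100, "red"), (50, "orange"), (30, "blue")],
    "device_to_isl": [(100, "red"), (50, "orange"), (30, "blue")],
}


def ratio_highlight(kind, numerator, denominator):
    """Return the highlight color for a ratio, or None if below all thresholds.

    Assumes a ratio equal to a threshold already earns that highlight.
    """
    if denominator == 0:
        return None  # ratio is undefined; this sketch leaves it unhighlighted
    ratio = numerator / denominator
    for minimum, color in RATIO_THRESHOLDS[kind]:
        if ratio >= minimum:
            return color
    return None
```

For example, 90 hosts sharing 2 disk ports gives a 45:1 ratio, which this sketch maps to orange.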

 

Other highlighting thresholds:

     For BW % usage on ISLs:

          > 90%: high

          75 to 90%: medium

          60 to 75%: low

The same thresholds apply for the port map table on the fabric sheet.
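As a quick illustration, the ISL bandwidth bands can be written as a single function. The bands share their edges at 75% and 90%, so this sketch assumes a value sitting exactly on an edge falls into the lower band. That is a guess, and the function name is hypothetical.

```python
def isl_bandwidth_alert(percent_used):
    """Map ISL bandwidth usage (0-100) to an alert level, or None.

    The listed bands share their edges (75 and 90), so this sketch
    assumes a value on an edge falls into the lower band.
    """
    if percent_used > 90:
        return "high"
    if percent_used > 75:
        return "medium"
    if percent_used >= 60:
        return "low"
    return None
```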

 

For the ISL BW utilization on the summary sheet, any ISL with peak usage over 75% is flagged with a low alert.

 

In the Frame Error Counts section on the switch sheet, whenever a specific frame error is 10% or more of the total Tx and Rx count, we highlight it to bring it to your attention. It may or may not be an issue and if you are concerned with the number, please seek advice from your service provider. For a brief description of what each error is, please click on the header and look in the Excel formula bar.
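The frame error rule amounts to a one-line comparison. The sketch below reads "total Tx and Rx count" as the sum of transmitted and received frames, which is an interpretation rather than something stated outright, and the function name is hypothetical.

```python
def frame_error_highlighted(error_count, tx_frames, rx_frames):
    """True when one error counter is 10% or more of total Tx plus Rx frames."""
    total = tx_frames + rx_frames
    if total == 0:
        return False  # no traffic counted; treated as nothing to flag (assumption)
    # Integer form of error_count / total >= 0.10, avoiding float rounding.
    return error_count * 10 >= total
```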

 

On the zone tabs, there are a few highlights:

-Green highlights are for LSAN zones, just to make them stand out from other zones.

-Hanging aliases, zones, and configs get a low (blue) highlight.

-If a zone has more than 20 members, it is flagged as low.

-For the % of zone database used, 95%, 85%, and 75% are highlighted high, medium, and low, respectively.
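The numeric zone highlights can be sketched the same way. The code below assumes that the zone database percentages are inclusive lower bounds (so exactly 75% is already low), which is not spelled out above, and the function names are hypothetical.

```python
def zone_member_alert(member_count):
    """Zones with more than 20 members are flagged low; otherwise None."""
    return "low" if member_count > 20 else None


def zone_db_alert(percent_used):
    """Map zone database usage (0-100) to an alert level, or None.

    Assumes each listed percentage is an inclusive lower bound.
    """
    if percent_used >= 95:
        return "high"
    if percent_used >= 85:
        return "medium"
    if percent_used >= 75:
        return "low"
    return None
```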

 

I see Fabric Watch listed as "Disabled" on my switch. How is this determined by the SAN Health report generators?

This feature is determined by the output from fwAlarmsFilterShow. If no filters are set, Fabric Watch is determined to be disabled.

 

I see FICON listed as "In Use" on my switch. How is this determined by the SAN Health report generators?

This feature is determined by the output from "ficonshow rnid". If there are current/valid entries in the output from ficonshow rnid, FICON is determined to be In Use.

 

I want to be able to determine which of my hosts are single attached and which are dual attached. Can I do this with SAN Health?

Another way to ask this question would be “How can we identify single-path hosts using the switch CLI?” since SAN Health only “knows what the switch knows”. Usually, there is no way to determine which hosts have which HBAs installed via the SAN switches. Some HBAs will include their Host’s hostname and/or IP address in the SCSI string that is registered with the switch during the FLOGI, but most seem to not have this feature, or they aren’t configured to use this feature. If all the HBAs in a given SAN Health report have this feature turned on, it would be relatively easy to search the data and find all the matches, and then find all the hosts that only show up once. Look on the SAN Ports tab in the Excel report under the heading Additional Information From HBA to see if your HBAs are reporting their hosts' info. If not, please check with your support provider to see if this can be enabled.

 

Another way to get this data into a SAN Health report is to supply a CSV file that contains your hostnames, IP addresses, or other unique identifiers along with their HBAs' WWNs. (See Options > Device Names in SAN Health for instructions on saving and loading this file.) If you already have that information, however, you can probably just search that file directly and come up with a list of single-path hosts.
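If you do have such a host-to-WWN list, finding the single-path hosts is a matter of counting distinct WWNs per host. The sketch below assumes a CSV with columns named host and wwn. Those column names and the sample rows are hypothetical and are not the SAN Health Device Names file format.

```python
import csv
import io
from collections import defaultdict


def single_path_hosts(csv_text):
    """Return hosts that appear with exactly one distinct WWN."""
    wwns_by_host = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        host = row["host"].strip().lower()
        wwn = row["wwn"].strip().lower()
        if host and wwn:
            wwns_by_host[host].add(wwn)
    return sorted(host for host, wwns in wwns_by_host.items() if len(wwns) == 1)


# Hypothetical inventory: db01 has two HBA WWNs, web01 has only one.
sample = """host,wwn
db01,10:00:00:00:c9:aa:aa:01
db01,10:00:00:00:c9:aa:aa:02
web01,10:00:00:00:c9:aa:aa:03
"""
singles = single_path_hosts(sample)
```

Keep in mind that a host with a single dual-port HBA would show two WWNs and so would not be listed, even though it has only one adapter.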

 

 

For questions not answered here, please write to SAN Health Admin
