08-03-2011 03:03 AM
I am a newbie to the storage environment and need some information regarding error details on Brocade DCX switches.
I was asked the question: "Demonstrate what SAN switch port error counters need to be checked."
The place to look seems to be the FC Ports Explorer, on Port Statistics, under the Error Details tab.
What are the recommended values, as an industry standard, that should be monitored or be a cause for concern for the following counters?
Loss of Sync
Loss of Signal
Invalid Transmitted Word
Inbound Link Reset
Outbound Link Reset
Inbound Offline Sequence
Outbound Offline Sequence
Any help is much appreciated.
08-03-2011 12:28 PM
First of all welcome to the storage SAN world.
My recommendation is that you take some training on FC SAN.
A simple statement: all error counters should be zero. The only exception is a reboot of a server or storage array, and then only for a very short time.
During reboots or link resets you will see errors. This is normal. You need a tool that correlates the error counters with reboots or HBA resets so that the "normal" errors can be filtered out.
At this point it may become difficult to arrive at a simple rule of thumb. You need good monitoring and some history data, which you have to collect yourself.
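To make the correlation idea concrete, here is a minimal sketch in Python. It is not a Brocade tool or API. The record layout (dicts with "port", "time" and "counter" keys), the function name and the 5 minute window are all assumptions made for illustration. You would have to feed it from whatever you use to collect counter increases and your list of known reboots or HBA resets.

```python
from datetime import datetime, timedelta


def filter_expected_errors(error_events, maintenance_events,
                           window=timedelta(minutes=5)):
    """Return the error events that are NOT explained by a known reboot/reset.

    error_events:       list of dicts with "port", "time", "counter" (hypothetical layout)
    maintenance_events: list of dicts with "port", "time" for known reboots or HBA resets
    window:             how close in time an error must be to a reboot to count as "normal"
    """
    unexpected = []
    for err in error_events:
        # An error is "explained" if the same port had a reboot/reset close in time.
        explained = any(
            m["port"] == err["port"] and abs(err["time"] - m["time"]) <= window
            for m in maintenance_events
        )
        if not explained:
            unexpected.append(err)
    return unexpected


# Hypothetical sample data: port 3 was rebooted at 03:00.
errors = [
    {"port": 3, "time": datetime(2011, 8, 3, 3, 2), "counter": "loss_of_sync"},
    {"port": 3, "time": datetime(2011, 8, 3, 9, 0), "counter": "invalid_tx_word"},
]
reboots = [{"port": 3, "time": datetime(2011, 8, 3, 3, 0)}]

for event in filter_expected_errors(errors, reboots):
    print(event["time"], "port", event["port"], event["counter"])
```

With this sample data only the 09:00 event is left over, because the 03:02 event falls inside the window around the reboot. The window size is something you have to tune from your own history data.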
I hope this helps,
08-04-2011 08:20 AM
The SAN Health Monitor Tool by Brocade is a good way to get stats on your environment. Just thought I would toss that out there....check it out at least if you are not currently using it. I have used the tool for the last 2 years and it has provided a really good baseline of my Brocade switch environment.
08-04-2011 08:55 AM
You are right that Brocade's SAN Health is good for a baseline.
But I assume the question is aimed at daily monitoring, combined with Fabric Watch to set up alerting.
For that purpose SAN Health is not useful, because it is just a snapshot and does not tell you when the errors happened, which is important for fixing issues.
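To show the difference between a single snapshot and a history, here is a rough Python sketch that turns periodic counter samples into timestamped increases. How you collect the samples is up to you. The counter names, the sample layout and the function name are assumptions made for illustration, not something a Brocade tool produces.

```python
from datetime import datetime


def counter_deltas(samples):
    """Turn periodic counter samples into timestamped increases.

    samples: list of (timestamp, {counter_name: value}) tuples, sorted by time
             (hypothetical layout).
    Returns a list of (timestamp, counter_name, increase) for every counter
    that went up between two consecutive samples.
    """
    events = []
    for (_, previous), (timestamp, current) in zip(samples, samples[1:]):
        for name, value in current.items():
            # A counter missing from the previous sample gives no baseline, so skip it.
            increase = value - previous.get(name, value)
            # Negative deltas (counters cleared in between) are ignored here.
            if increase > 0:
                events.append((timestamp, name, increase))
    return events


# Hypothetical samples for one port, taken every 10 minutes.
samples = [
    (datetime(2011, 8, 4, 8, 0), {"loss_of_sync": 0, "invalid_tx_word": 5}),
    (datetime(2011, 8, 4, 8, 10), {"loss_of_sync": 2, "invalid_tx_word": 5}),
    (datetime(2011, 8, 4, 8, 20), {"loss_of_sync": 2, "invalid_tx_word": 9}),
]

for timestamp, name, increase in counter_deltas(samples):
    print(timestamp, name, "+%d" % increase)
```

The absolute values alone would only tell you that errors exist. The deltas tell you in which polling interval they occurred, which is what you need when you want to match them against reboots or other events.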