10-15-2010 11:46 AM
A Linux admin is having issues with one HBA. He reports very large and rapidly increasing error counts in the Invalid ordered sets and Encoding err nonframe_8b10b counters for one port of this HBA.
The other end of the connection is a Brocade 5300 switch running FOS 6.3.0b. The switch is not seeing significant errors on the connection. One thing I did notice was that the RX signal level at the switch was down over 8 dB, so I offered to replace the SFP. The replacement was done and the signal level is significantly better (-4.xx dB), but he reports that the error counters are still rolling.
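To put the signal levels quoted above in perspective, here is a rough sketch of the dB arithmetic. It assumes the readings are optical RX power in dBm, and it uses -8.0 and -4.0 purely as stand-ins, since the exact values are not given here; the function names are made up for illustration.

```python
def dbm_to_mw(dbm: float) -> float:
    """Convert an optical power reading from dBm to milliwatts."""
    return 10 ** (dbm / 10)


def power_ratio(db_change: float) -> float:
    """Convert a change in dB to a linear power ratio."""
    return 10 ** (db_change / 10)


# Stand-in readings (assumed, not the real values from the switch).
before_dbm = -8.0
after_dbm = -4.0

gain = power_ratio(after_dbm - before_dbm)
print(f"before: {dbm_to_mw(before_dbm):.3f} mW")
print(f"after:  {dbm_to_mw(after_dbm):.3f} mW")
print(f"received power increased by a factor of about {gain:.2f}")
```

Under those assumptions, a 4 dB improvement corresponds to roughly 2.5 times more received optical power.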
Based on what little I found on this sort of error I pegged the port at 4Gb. Initially that seemed to "fix" the problem, but shortly thereafter he reported that the same counters are still incrementing, only a bit more slowly now.
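To make "a bit more slowly" measurable, one option is to sample a counter twice and compare rates before and after a change. This is only a sketch: the counter values and the 60-second interval below are hypothetical, and it assumes the counter neither wrapped nor was reset between the two samples.

```python
def errors_per_second(count_start: int, count_end: int, seconds: float) -> float:
    """Average rate of increase of an error counter between two samples."""
    if seconds <= 0:
        raise ValueError("sampling interval must be positive")
    if count_end < count_start:
        raise ValueError("counter went backwards (wrap or reset?)")
    return (count_end - count_start) / seconds


# Hypothetical samples taken 60 seconds apart, before and after
# fixing the port speed at 4Gb.
rate_before = errors_per_second(1_000_000, 1_600_000, 60.0)
rate_after = errors_per_second(2_000_000, 2_030_000, 60.0)

print(f"before: {rate_before:.1f} errors/s")
print(f"after:  {rate_after:.1f} errors/s")
print(f"reduction: {100 * (1 - rate_after / rate_before):.1f}%")
```

Comparing rates this way shows whether a change actually reduced the error rate or the counters simply had less time to climb.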
Any hints would be welcome. Additional information available if pertinent.
10-15-2010 01:19 PM
He is only seeing problems on the host side, but I do not know if he is displaying the counters with HCM.
The FC architecture is two parallel fabrics, so the other port goes to another 5300, and he has active and standby paths to the same LUNs available on each fabric.
...anticipating the next question: the target storage unit is an IBM DS5300, and it is also connected to the same pair of Brocade 5300 switches...and yes, having the 5300 connected to the 5300 does make things sound more confusing than they really are.
Did IBM and Brocade really need to make completely different gear with the same model names that are frequently used together? Really?
10-17-2010 02:30 AM
How is the switch port configured?
Check the portcfgfillword setting on the switch port. Try setting it with: portcfgfillword <portnumber> 1
The switch will then use ARB(FF) as the fill word.
I hope this helps.
10-22-2010 12:03 AM
If the RX signal on the switch is weak, it means either the TX side of the HBA is faulty or there is a damaged cable in between.
You mentioned a lot of encoding errors outside of frames, which explains the invalid ordered sets. This is a physical issue. Since you already replaced the SFP on the switch with only marginal improvement, I would suggest replacing the cable (and checking any intermediate connections such as patch panels).
Cleaning the connection points also sometimes helps.