Occasional Contributor
Posts: 15
Registered: ‎01-22-2010

Strange errors on SAN with 4100,200E - enc out, congestion ?

Hi all,

We have a SAN that is showing some strange errors on hosts connected to a Sun 6920 array, across a core/edge design of 4100 and 200E switches. It is a fairly simple configuration, with Fabric A and Fabric B. Firmware on the switches is 5.1.0a (yes, I know it's a long way behind!).

We see

  • High levels of the following counters (we assume they are high, as the SANHealth report marks them blue):
    • Link Fail
    • IosSync
    • IosSig
    • Enc Out
  • A particular problem with a Solaris host: when Fabric B is connected to it, the host starts losing access to disks and paths, which sometimes come back and then drop again
  • SCSI parity errors on hosts

Because of these various problems, we are investigating whether we are suffering from general congestion on the SAN, which, from what we have found while hunting around, could be causing these errors.

We've run portstatsshow on the ISL ports and see some increases in tim_txcrd_z, but no other problems (no CRC errors). The following shows the output for one ISL port, taken 1 minute apart, but we don't know whether this is high or even a concern.

BTL_FCB_ED2:admin> portstatsshow 16
stat_wtx                341640232   4-byte words transmitted
stat_wrx                32432004    4-byte words received
stat_ftx                105051623   Frames transmitted
stat_frx                201932472   Frames received
stat_c2_frx             0           Class 2 frames received
stat_c3_frx             201930965   Class 3 frames received
stat_lc_rx              846         Link control frames received
stat_mc_rx              0           Multicast frames received
stat_mc_to              0           Multicast timeouts
stat_mc_tx              0           Multicast frames transmitted
tim_rdy_pri             0           Time R_RDY high priority
tim_txcrd_z             106810247   Time BB credit zero
er_enc_in               0           Encoding errors inside of frames
er_crc                  0           Frames with CRC errors
er_trunc                0           Frames shorter than minimum
er_toolong              0           Frames longer than maximum
er_bad_eof              0           Frames with bad end-of-frame
er_enc_out              0           Encoding error outside of frames
er_bad_os               0           Invalid ordered set
er_c3_timeout           0           Class 3 frames discarded due to timeout
er_c3_dest_unreach      0           Class 3 frames discarded due to destination unreachable
er_other_discard        0           Other discards
er_zone_discard         0           Class 3 frames discarded due to zone mismatch
er_crc_good_eof         0           Crc error with good eof
er_inv_arb              0           Invalid ARB
open                    0           loop_open
transfer                0           loop_transfer
opened                  0           FL_Port opened
starve_stop             0           tenancies stopped due to starvation
fl_tenancy              0           number of times FL has the tenancy
nl_tenancy              0           number of times NL has the tenancy
zero_tenancy            0           zero tenancy

BTL_FCB_ED2:admin> portstatsshow 16
stat_wtx                443285888   4-byte words transmitted
stat_wrx                337748080   4-byte words received
stat_ftx                105309380   Frames transmitted
stat_frx                202588123   Frames received
stat_c2_frx             0           Class 2 frames received
stat_c3_frx             202586609   Class 3 frames received
stat_lc_rx              850         Link control frames received
stat_mc_rx              0           Multicast frames received
stat_mc_to              0           Multicast timeouts
stat_mc_tx              0           Multicast frames transmitted
tim_rdy_pri             0           Time R_RDY high priority
tim_txcrd_z             106940120   Time BB credit zero
er_enc_in               0           Encoding errors inside of frames
er_crc                  0           Frames with CRC errors
er_trunc                0           Frames shorter than minimum
er_toolong              0           Frames longer than maximum
er_bad_eof              0           Frames with bad end-of-frame
er_enc_out              0           Encoding error outside of frames
er_bad_os               0           Invalid ordered set
er_c3_timeout           0           Class 3 frames discarded due to timeout
er_c3_dest_unreach      0           Class 3 frames discarded due to destination unreachable
er_other_discard        0           Other discards
er_zone_discard         0           Class 3 frames discarded due to zone mismatch
er_crc_good_eof         0           Crc error with good eof
er_inv_arb              0           Invalid ARB
open                    0           loop_open
transfer                0           loop_transfer
opened                  0           FL_Port opened
starve_stop             0           tenancies stopped due to starvation
fl_tenancy              0           number of times FL has the tenancy
nl_tenancy              0           number of times NL has the tenancy
zero_tenancy            0           zero tenancy
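
To put the two samples side by side, the counter deltas can be worked out with a short script. This is only a rough sketch: it assumes that one tim_txcrd_z tick is 2.5 microseconds (the unit later Fabric OS releases print next to this counter), that the samples are exactly 60 seconds apart, and that no counter wrapped in between. The helper names are made up for illustration.

```python
import re

TICK_SECONDS = 2.5e-6    # assumption: one tim_txcrd_z tick = 2.5 microseconds
INTERVAL_SECONDS = 60.0  # assumption: samples taken exactly one minute apart


def parse_portstats(text):
    """Return {counter_name: value} from portstatsshow-style lines."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+)\s+(\d+)\b", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def delta(before, after, name):
    """Difference of one counter between two samples (ignores counter wrap)."""
    return after[name] - before[name]


# Three counters copied from the two port 16 samples above.
first = parse_portstats("""
stat_ftx                105051623   Frames transmitted
stat_frx                201932472   Frames received
tim_txcrd_z             106810247   Time BB credit zero
""")
second = parse_portstats("""
stat_ftx                105309380   Frames transmitted
stat_frx                202588123   Frames received
tim_txcrd_z             106940120   Time BB credit zero
""")

ticks = delta(first, second, "tim_txcrd_z")
zero_credit_seconds = ticks * TICK_SECONDS
share = zero_credit_seconds / INTERVAL_SECONDS

print("frames transmitted:", delta(first, second, "stat_ftx"))
print("frames received:   ", delta(first, second, "stat_frx"))
print("tim_txcrd_z ticks: ", ticks)
print("time at zero credit: %.3f s (%.2f%% of the interval)"
      % (zero_credit_seconds, share * 100))
```

Under those assumptions the tim_txcrd_z delta is 129873 ticks, which is about 0.32 seconds of the minute (roughly 0.5%) spent with zero transmit credit on this port. Running the same comparison on the other ISL ports would show whether this port stands out.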


We're a bit stuck on where to start and any pointers are much appreciated.

I've attached the supportshow from the switches for information.

External Moderator
Posts: 4,973
Registered: ‎02-23-2004

Re: Strange errors on SAN with 4100,200E - enc out, congestion ?

Hi,

Try another cable and other SFPs.

Errors like these suggest that some component has become defective.

Let me know.

TechHelp24
Super Contributor
Posts: 260
Registered: ‎04-09-2008

Re: Strange errors on SAN with 4100,200E - enc out, congestion ?

>>Firmware on switches is 5.1.0a (yes I know it's a long way behind!)

Are you sure you're hunting for problems in the right SAN?

On the 4900 the firmware is 6.4.0b, and on the 200E it's 5.2.1b.

Super Contributor
Posts: 635
Registered: ‎04-12-2010

Re: Strange errors on SAN with 4100,200E - enc out, congestion ?

Hi,

Did you notice that the switch 10.160.230.99 has a problem?

2010/10/05-15:23:29, , 844, FFDC | CHASSIS, WARNING, SilkWorm4900, Detected termination of process cald0:17645
2010/10/05-15:23:29, , 845, CHASSIS, INFO, SilkWorm4900, First failure data capture (FFDC) event occurred.
2010/10/05-15:23:35, , 846, CHASSIS, WARNING, SilkWorm4900, Trace dump available ! (reason: FFDC)

Please run a supportsave and remove all pending dump files on the switch.

Call your switch support company to find out why cald crashed.

I hope this helps,

Andreas
