10-24-2017 12:31 PM
I have a strange problem and am posting it here to see whether anyone has had a similar experience.
We have two fabrics, set up as follows:
FabA: a core switch (Brocade 4200) where the host is connected, and another switch (Brocade 6520) where the storage FA port is connected.
Two ISL ports connect these two switches.
FabB: same setup as above.
Issue: one of the ISL ports went faulty and logged a large number of errors in all counters, including CRC. We left it as it was, pending SFP/cable replacement. Exactly two days after the fault started, one of the hosts lost connectivity to its SAN disks.
Can a fault on one of two redundant ISL ports cause the loss of SAN disk connectivity? We immediately disabled this one ISL port just to rule it out.
But I really want to know what impact it can have.
10-24-2017 12:45 PM
There is no Brocade 4200.
Do you mean Brocade 4100?
Can you please post the output of "switchshow" from the command line,
in particular the "switchType" line?
What FOS is installed on both the Brocade 6520 and this Brocade 4x00?
10-24-2017 12:52 PM
Oh yes, it's a Brocade 4020. Port 0 is the one we disabled after the issue occurred, but there is a redundant ISL connection on port 15 of this switch.
zoning: ON (Switch5Config)
Area Port Media Speed State Proto
0 0 id N4 No_Light Disabled
15 15 id N4 Online E-Port 10:00:00:27:f8:c6:ae:d3 "Fab1-Switch2" (downstream)
10-24-2017 01:05 PM
From your first post:
--->>>We have two fabric as below:-
--->>>FabA:- Core Switch (brocade 4200)
What is the FOS release? Please post both, for the 4020 and the 6520.
The 4020 must run at least FOS 6.2.2 in order to ISL with a 6520 running FOS 7.x.
FOS 8.x is no longer compatible with FOS 6.
QoS must be disabled on 16G platforms when they ISL with FOS 6.2.2e or earlier, otherwise the ISL segments.
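As a rough illustration, the three rules above could be encoded as a small check. This is only a sketch of what is stated in this post, not an official compatibility matrix; the function names `parse_fos` and `isl_check` are made up for the example, and the version strings in the usage lines are placeholders.

```python
import re

def parse_fos(version):
    """Split a FOS version such as '6.2.2e' or 'v7.4.1' into
    (major, minor, patch, letter). The letter is '' when absent."""
    m = re.match(r"v?(\d+)\.(\d+)\.(\d+)([a-z]?)", version.strip().lower())
    if m is None:
        raise ValueError(f"unrecognised FOS version: {version!r}")
    major, minor, patch, letter = m.groups()
    return (int(major), int(minor), int(patch), letter)

def isl_check(old_fos, new_fos):
    """Return a list of warnings for an ISL between an older switch
    (e.g. the 4020) and a 16G switch (e.g. the 6520).
    Only the rules quoted in this thread are checked (assumption)."""
    old, new = parse_fos(old_fos), parse_fos(new_fos)
    notes = []
    # Rule: FOS 8.x is no longer compatible with FOS 6.
    if old[0] == 6 and new[0] >= 8:
        notes.append("FOS 8.x is not compatible with FOS 6")
    # Rule: minimum FOS 6.2.2 on the older switch to ISL with FOS 7.x.
    if new[0] == 7 and old[:3] < (6, 2, 2):
        notes.append("older switch needs at least FOS 6.2.2")
    # Rule: QoS must be disabled on the 16G side for 6.2.2e or earlier.
    if old <= (6, 2, 2, "e"):
        notes.append("disable QoS on the 16G platform ISL ports")
    return notes

if __name__ == "__main__":
    print(isl_check("6.2.0", "7.4.1"))
    print(isl_check("6.4.3", "7.4.1"))
```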
10-24-2017 01:13 PM
10-24-2017 01:31 PM
With that FOS release you are in range.
When you say one of the ISL ports is down, are both ports configured as a trunk, or as simple ISL ports?
With simple ISLs it is possible that one port froze and the traffic was active only on the second port, which then became faulty, and for this reason the host lost connectivity to the storage/LUN.
10-24-2017 01:39 PM
The host lost connectivity two days after the ISL port became faulty, not immediately.
10-25-2017 12:15 AM
The CRC errors on the ISL will, over time, corrupt too many frames to/from the host/storage, so the MPIO software on the host will (probably) assume that the path(s) to the disk have failed. The host logs / MPIO logs will be needed to troubleshoot this in detail. To further confuse the MPIO software, the name server will still contain the storage port (the port is online and reachable, but only partly).
What you see is one reason for using port fencing, or for disabling links with physical issues as fast as possible. Also, historically, once the physical degradation of a link has started, it keeps getting worse. So you might have just one or two CRC errors per hour at first, but in 1, 2 or 4 days you will get a higher count. That is another reason to disable the port, assuming you have enough redundancy in your network, of course.
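To make the "growing CRC count" point concrete, here is a minimal sketch of a rate-based check of the kind a fencing policy applies. It assumes you collect the cumulative CRC counter for the port yourself (for example by reading `porterrshow` at intervals); the function names, the sample values and the threshold of 5 per hour are all invented for illustration and are not taken from this fabric.

```python
def crc_rates(samples):
    """samples: list of (hours_since_start, cumulative_crc_count),
    sorted by time. Returns CRC errors per hour for each interval."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((c1 - c0) / (t1 - t0))
    return rates

def should_fence(samples, max_per_hour):
    """True if any interval exceeds the allowed CRC rate, i.e. the
    link should be disabled (given enough redundancy elsewhere)."""
    return any(rate > max_per_hour for rate in crc_rates(samples))

if __name__ == "__main__":
    # Hypothetical readings: slow at first, much faster after two days.
    readings = [(0, 0), (1, 2), (24, 50), (48, 400)]
    print(crc_rates(readings))
    print(should_fence(readings, max_per_hour=5))
```

With a check like this the degrading ISL would be taken out of service when the rate climbs, rather than two days later when the host has already dropped its paths.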