Fibre Channel (SAN)

Occasional Contributor
Posts: 6
Registered: ‎11-28-2016
Accepted Solution

need help for some switchlog explanation

Hi,

We need help with an FC link failure issue. Thanks a lot in advance.

There are QLogic HBAs in both the target nodes and the hosts, and all of them are connected to Brocade switches.
We created LUNs and mapped them to all hosts; the hosts manage these LUNs with the multipath module.
On the target side we have 2 SPs with 4 FC ports each, connected to 2 switches.
On the initiator side each host has 2 FC ports, also connected to the 2 switches, so we see 8 paths for each LUN.

The symptom is that one path is lost periodically, accompanied by the following log messages across all hosts:
Nov 28 15:52:32 BC6-CLC02-CC02 kernel: qla2xxx [0000:06:00.0]-801c:8: Abort command issued nexus=8:3:17 -- 1 2002.
Nov 28 15:52:32 BC6-CLC02-CC02 kernel: qla2xxx [0000:06:00.0]-801c:8: Abort command issued nexus=8:3:16 -- 1 2002.
Nov 28 15:52:32 BC6-CLC02-CC02 kernel: qla2xxx [0000:06:00.0]-801c:8: Abort command issued nexus=8:3:18 -- 1 2002.
... ...
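
To see which path the aborts concentrate on, the host messages can be grouped by HBA and target. This is only a minimal sketch: it assumes the `nexus=` field reads as SCSI host:target:LUN, and the names `ABORT`, `HOST_LOG` and `aborts_by_path` are made up for illustration.

```python
import re
from collections import Counter

# The three kernel messages quoted above.
HOST_LOG = """\
Nov 28 15:52:32 BC6-CLC02-CC02 kernel: qla2xxx [0000:06:00.0]-801c:8: Abort command issued nexus=8:3:17 -- 1 2002.
Nov 28 15:52:32 BC6-CLC02-CC02 kernel: qla2xxx [0000:06:00.0]-801c:8: Abort command issued nexus=8:3:16 -- 1 2002.
Nov 28 15:52:32 BC6-CLC02-CC02 kernel: qla2xxx [0000:06:00.0]-801c:8: Abort command issued nexus=8:3:18 -- 1 2002.
"""

# Assumption: nexus=<scsi host>:<target id>:<lun>.
ABORT = re.compile(
    r"qla2xxx \[([0-9a-f:.]+)\]-801c:\d+: "
    r"Abort command issued nexus=(\d+):(\d+):(\d+)"
)

def aborts_by_path(text):
    """Count abort messages per (PCI address, SCSI host, target id)."""
    counts = Counter()
    for line in text.splitlines():
        m = ABORT.search(line)
        if m:
            pci, host, target, _lun = m.groups()
            counts[(pci, int(host), int(target))] += 1
    return counts

if __name__ == "__main__":
    for path, n in aborts_by_path(HOST_LOG).items():
        print(path, n)
```

If all aborts on every host land on the same target, that points at a single target port rather than at the individual hosts.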

After tracing the logs, we located the suspect FC port (port ID 0x10000) on SPA.
Unfortunately, we are not able to determine whether the root cause is hardware or software.
The faulty port is connected to switch port 0. Looking at the switch log, we find that whenever the following command sequence appears, the problem occurs.
So we need your help: how do we decipher this log, and can it tell us the root cause?
SW6510:FID128:admin> portlogshow 0
time task event port cmd args
-------------------------------------------------
01:03:30.151 PORT scn 0 22 00000002,43020000,00000001
01:03:30.151 PORT scn 0 2 83affb78,00000000,00000002
01:03:30.151 PORT scn 0 2 83affb78,00000000,00000080
01:03:30.151 PORT scn 0 5 00000000,00000000,00000002
01:03:30.151 PORT scn 0 1 00000002,43020000,00000002
01:03:30.151 PORT scn 0 22 00000002,43020000,00000001
01:03:30.163 PORT ioctl 08010004 7,0 * 4
01:03:31.122 nsd0 rscn 1 10100 00fffffd,61040008,00010000,00000000
01:03:31.122 FCPH write 1 8 00010100,00fffffd,00000000,00000000,00000000
01:03:31.122 FCPH seq 1 8 00210000,00000000,00000692,00010180,00000000
01:03:31.123 PORT Tx3 1 8 22010100,00fffffd,0313ffff,61040008
01:03:31.123 PORT Rx3 1 4 23fffffd,00010100,031300f4,02000000
01:03:31.123 nsd0 rscn 2 10200 00fffffd,61040008,00010000,00000000
01:03:31.123 FCPH write 2 8 00010200,00fffffd,00000000,00000000,00000000
01:03:31.123 FCPH seq 2 8 00210000,00000000,00000692,00010180,00000000
01:03:31.123 PORT Tx3 2 8 22010200,00fffffd,0316ffff,61040008
01:03:31.124 PORT Rx3 2 4 23fffffd,00010200,031606be,02000000
01:03:31.124 nsd0 rscn 43 12b00 00fffffd,61040008,00010000,00000000
01:03:31.124 FCPH write 43 8 00012b00,00fffffd,00000000,00000000,00000000
01:03:31.124 FCPH seq 43 8 00210000,00000000,00000692,00010180,00000000
01:03:31.124 PORT Tx3 43 8 22012b00,00fffffd,0315ffff,61040008
01:03:31.125 PORT Rx3 43 4 23fffffd,00012b00,031500ea,02000000
01:03:31.125 nsd0 rscn 47 12f00 00fffffd,61040008,00010000,00000000
01:03:31.125 FCPH write 47 8 00012f00,00fffffd,00000000,00000000,00000000
01:03:31.125 FCPH seq 47 8 00210000,00000000,00000692,00010180,00000000
01:03:31.125 PORT Tx3 47 8 22012f00,00fffffd,030effff,61040008
01:03:31.125 PORT Rx3 47 4 23fffffd,00012f00,030e007f,02000000
01:03:31.125 nsd0 rscn 3 10300 00fffffd,61040008,00010000,00000000
01:03:31.126 FCPH write 3 8 00010300,00fffffd,00000000,00000000,00000000
01:03:31.126 FCPH seq 3 8 00210000,00000000,00000692,00010180,00000000
01:03:31.126 PORT Tx3 3 8 22010300,00fffffd,0312ffff,61040008
01:03:31.126 PORT Rx3 3 4 23fffffd,00010300,0312065b,02000000
01:03:31.126 PORT Rx3 2 20 02fffffc,00010200,06bfffff,01000000
01:03:31.127 nsd0 rscn 4 10400 00fffffd,61040008,00010000,00000000
01:03:31.127 FCPH write 4 8 00010400,00fffffd,00000000,00000000,00000000
01:03:31.127 FCPH seq 4 8 00210000,00000000,00000692,00010180,00000000
01:03:31.127 PORT Tx3 4 8 22010400,00fffffd,0319ffff,61040008
01:03:31.127 PORT Rx3 4 4 23fffffd,00010400,03190560,02000000
01:03:31.127 nsd0 rscn 5 10500 00fffffd,61040008,00010000,00000000
01:03:31.127 FCPH write 5 8 00010500,00fffffd,00000000,00000000,00000000
01:03:31.127 FCPH seq 5 8 00210000,00000000,00000692,00010180,00000000
01:03:31.128 PORT Tx3 5 8 22010500,00fffffd,0310ffff,61040008
01:03:31.128 PORT Rx3 5 4 23fffffd,00010500,0310044b,02000000
01:03:31.128 nsd0 rscn 6 10600 00fffffd,61040008,00010000,00000000
01:03:31.128 FCPH write 6 8 00010600,00fffffd,00000000,00000000,00000000
01:03:31.128 FCPH seq 6 8 00210000,00000000,00000692,00010180,00000000
01:03:31.128 PORT Tx3 6 8 22010600,00fffffd,0317ffff,61040008
01:03:31.129 PORT Rx3 6 4 23fffffd,00010600,031708bf,02000000
01:03:31.129 PORT Rx3 3 20 02fffffc,00010300,065cffff,01000000
01:03:31.129 nsd0 rscn 7 10700 00fffffd,61040008,00010000,00000000
01:03:31.129 FCPH write 7 8 00010700,00fffffd,00000000,00000000,00000000
01:03:31.129 FCPH seq 7 8 00210000,00000000,00000692,00010180,00000000
01:03:31.129 PORT Tx3 7 8 22010700,00fffffd,0314ffff,61040008
01:03:31.130 PORT Rx3 7 4 23fffffd,00010700,031408dd,02000000
01:03:31.130 PORT Rx3 4 20 02fffffc,00010400,0561ffff,01000000

 

Contributor
Posts: 66
Registered: ‎12-24-2015

Re: need help for some switchlog explanation

Hi!

Please, execute "fabriclog -s"

Occasional Contributor
Posts: 6
Registered: ‎11-28-2016

Re: need help for some switchlog explanation

Hi,

Please find the log below. Would you please help me understand it? This port is connected to the SAN server.

 

Switch 0; Tue Nov 29 01:03:30 2016 GMT (GMT+0:00)
01:03:30.151918 SCN Port Offline;g=0xca2                    D2,P0  D2,P0  0     NA   
01:03:30.151937 *Removing all nodes from port               D2,P0  D2,P0  0     NA   
01:03:35.005810 SCN LR_PORT(0);g=0xca2                      D2,P0  D2,P0  0     NA   
01:03:35.005851 SCN Port Online; g=0xca2,isolated=0         D2,P0  D2,P1  0     NA   
01:03:35.006011 Port Elp engaged                            D2,P1  D2,P0  0     NA   
01:03:35.006073 *Removing all nodes from port               D2,P0  D2,P0  0     NA   
01:03:35.006216 SCN Port F_PORT                             D2,P1  D2,P0  0     NA   
14:21:49.566324 SCN Port Offline;g=0xca4                    D2,P0  D2,P0  0     NA   
14:21:49.566343 *Removing all nodes from port               D2,P0  D2,P0  0     NA   
14:21:54.410777 SCN LR_PORT(0);g=0xca4                      D2,P0  D2,P0  0     NA   
14:21:54.410817 SCN Port Online; g=0xca4,isolated=0         D2,P0  D2,P1  0     NA   
14:21:54.410977 Port Elp engaged                            D2,P1  D2,P0  0     NA   
14:21:54.411037 *Removing all nodes from port               D2,P0  D2,P0  0     NA   
14:21:54.411180 SCN Port F_PORT                             D2,P1  D2,P0  0     NA   
Switch 0; Wed Nov 30 08:25:29 2016 GMT (GMT+0:00)
08:25:29.506731 SCN Port Offline;g=0xca6                    D2,P0  D2,P0  0     NA   
08:25:29.506750 *Removing all nodes from port               D2,P0  D2,P0  0     NA   
08:25:34.348903 SCN LR_PORT(0);g=0xca6                      D2,P0  D2,P0  0     NA   
08:25:34.348944 SCN Port Online; g=0xca6,isolated=0         D2,P0  D2,P1  0     NA   
08:25:34.349105 Port Elp engaged                            D2,P1  D2,P0  0     NA   
08:25:34.349167 *Removing all nodes from port               D2,P0  D2,P0  0     NA   
08:25:34.349310 SCN Port F_PORT                             D2,P1  D2,P0  0     NA   

Contributor
Posts: 66
Registered: ‎12-24-2015

Re: need help for some switchlog explanation

bb143,

It looks good from the switch side.

I think the problem is on the server side. As a first step, run portstatsclear for the suspect port, then run portstatsshow for it an hour after the statistics reset.

Brocade Moderator
Posts: 36
Registered: ‎04-27-2009

Re: need help for some switchlog explanation

Hi,

 

The portlog for this event simply shows port 0 going offline and RSCNs being sent out to the other ports.

The fabriclog -s output shows port 0 going offline 3 times, for about 5 seconds each time. From the switch's perspective, the device just drops light for 5 seconds and then comes back online.
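
The 5-second figure can be checked by pairing each "Port Offline" entry with the next "Port Online" entry. The sketch below uses only lines copied from the fabriclog posted earlier in this thread; the function name `offline_durations` and the regular expression are illustrative assumptions, not part of any Brocade tool.

```python
import re
from datetime import datetime, timedelta

# Offline/Online lines copied from the fabriclog -s output in this thread.
FABRICLOG = """\
01:03:30.151918 SCN Port Offline;g=0xca2                    D2,P0  D2,P0  0     NA
01:03:35.005851 SCN Port Online; g=0xca2,isolated=0         D2,P0  D2,P1  0     NA
14:21:49.566324 SCN Port Offline;g=0xca4                    D2,P0  D2,P0  0     NA
14:21:54.410817 SCN Port Online; g=0xca4,isolated=0         D2,P0  D2,P1  0     NA
08:25:29.506731 SCN Port Offline;g=0xca6                    D2,P0  D2,P0  0     NA
08:25:34.348944 SCN Port Online; g=0xca6,isolated=0         D2,P0  D2,P1  0     NA
"""

EVENT = re.compile(r"^(\d\d:\d\d:\d\d\.\d{6}) SCN Port (Offline|Online)")

def offline_durations(text):
    """Return the seconds between each Offline entry and the next Online entry."""
    durations = []
    went_offline = None
    for line in text.splitlines():
        m = EVENT.match(line)
        if not m:
            continue
        stamp = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        if m.group(2) == "Offline":
            went_offline = stamp
        elif went_offline is not None:
            delta = stamp - went_offline
            if delta < timedelta(0):
                # The log carries no date per line; assume the outage crossed midnight.
                delta += timedelta(days=1)
            durations.append(delta.total_seconds())
            went_offline = None
    return durations

if __name__ == "__main__":
    for seconds in offline_durations(FABRICLOG):
        print(f"{seconds:.3f} s offline")
```

All three outages come out at roughly 4.85 seconds, which is consistent with the "5 seconds each time" reading.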

 

I agree that this should be checked on the server side first. Try to find out why it goes offline for 5 seconds: is anything being done at that time, or does it just happen sporadically? Also check, for example, portstatsshow for any increasing error counters.
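
One way to spot increasing error counters is to save the portstatsshow output twice and compare the two snapshots. The sketch below is hedged: it assumes each counter line starts with a counter name followed by an integer value, and the two sample snapshots are invented for illustration, not taken from the switch in this thread.

```python
import re

# Assumption: a counter line looks like "<name> <integer> <description>".
COUNTER = re.compile(r"^\s*(\w+)\s+(\d+)\b")

def parse_counters(text):
    """Map counter name to integer value for every line that matches."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER.match(line)
        if m:
            counters[m.group(1)] = int(m.group(2))
    return counters

def increased(before, after):
    """Return the counters whose value grew, with the size of the increase."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] > before.get(name, 0)
    }

# Hypothetical snapshots, invented for illustration only.
BEFORE = """\
er_crc      0    Frames with CRC errors
er_enc_out  12   Encoding error outside of frames
"""
AFTER = """\
er_crc      0    Frames with CRC errors
er_enc_out  17   Encoding error outside of frames
"""

if __name__ == "__main__":
    print(increased(parse_counters(BEFORE), parse_counters(AFTER)))
```

Counters that grow only around the times of the 5-second drops are the ones worth following up.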

Any and all information provided by me is not reviewed, approved or endorsed by Brocade and is provided solely as a convenience for Brocade customers. All systems and all networks are different and unique. If you have a service affecting network problem, please open a TAC service request for service through Brocade, or through your OEM equipment provider. If this provided you with a solution to this issue, please mark it with the button at the bottom "Accept as solution".
Occasional Contributor
Posts: 6
Registered: ‎11-28-2016

Re: need help for some switchlog explanation


Hi,

Thanks for the timely response.

Assuming it is a server issue, can we narrow it down, based on the switch-side logs above, to an HBA problem (faulty hardware, driver, or firmware)?
Besides, is there any chance that the FOS firmware would cause such log entries?
Brocade Moderator
Posts: 36
Registered: ‎04-27-2009

Re: need help for some switchlog explanation

Hi again,

 

From a fabriclog perspective we just see the 5-second light drop in your log, which does not allow any further conclusions as to why this happens.

I'd say never say never, but from my experience I really doubt that it is switch related or switch initiated. If you have, e.g., FW or MAPS monitoring, have a look at whether anything unusual is logged in the errdump at these times, such as physical layer issues, CRC errors, link resets, etc.

 

TL

 

 

 

