
Contributor
Posts: 20
Registered: ‎08-17-2011

ISL port problem and fabric connection problem

Hi

This is Max Gregis from Milan, Italy.

In my fabric, which includes several 48000 Directors, I've found a strange problem.

When an ISL port on my director was faulty, some Solaris hosts attached over FC logged thousands of messages like these in /var/adm/messages:

Jan 12 23:51:47 serverhost scsi: WARNING: /scsi_vhci (scsi_vhci0):
Jan 12 23:51:47 serverhost        /scsi_vhci/ssd@g60060e800427d300000027d300000a61 (ssd3): Command Timeout on path /pci@7c0/pci@0/pci@1/pci@0,2/SUNW,emlxs@1/fp@0,0 (fp2)

When I persistently disabled that ISL port, those messages stopped. When I re-enabled the faulty ISL port, the messages on the Solaris hosts came back.
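To line the timeouts up against the times the ISL port was disabled and re-enabled, the warnings can be counted per minute and per path. The following is only a minimal sketch: the regular expression is written against the two sample lines quoted above and may need adjusting for other message variants, and the function name `count_timeouts` is hypothetical.

```python
import re
from collections import Counter

# Pattern for the second line of the scsi_vhci warning quoted above.
# Assumption: other hosts log the timeout in the same layout.
TIMEOUT_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+"
    r"(?P<device>/scsi_vhci/\S+)\s+\(\w+\):\s+Command Timeout on path\s+"
    r"(?P<path>\S+)\s+\((?P<fp>\w+)\)"
)

def count_timeouts(lines):
    """Return (timeouts per minute, timeouts per fp instance)."""
    per_minute = Counter()
    per_fp = Counter()
    for line in lines:
        m = TIMEOUT_RE.match(line)
        if m:
            per_minute[m.group("minute")] += 1
            per_fp[m.group("fp")] += 1
    return per_minute, per_fp

# The two lines from the post; in practice, read /var/adm/messages instead.
sample = [
    "Jan 12 23:51:47 serverhost scsi: WARNING: /scsi_vhci (scsi_vhci0):",
    "Jan 12 23:51:47 serverhost        /scsi_vhci/ssd@g60060e800427d300000027d300000a61 (ssd3): "
    "Command Timeout on path /pci@7c0/pci@0/pci@1/pci@0,2/SUNW,emlxs@1/fp@0,0 (fp2)",
]

per_minute, per_fp = count_timeouts(sample)
for minute, count in sorted(per_minute.items()):
    print(minute, count)
for fp, count in sorted(per_fp.items()):
    print(fp, count)
```

If the per-minute counts drop to zero right after the port is disabled and rise again after it is re-enabled, that supports the correlation described above; if only some fp instances are affected, that hints at which paths cross the ISL.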

Does anyone know of a relationship between an ISL port problem and these generic problems on host connections, and on the fabric as well?

Thanks in advance

Max

External Moderator
Posts: 4,974
Registered: ‎02-23-2004

Re: ISL port problem and fabric connection problem

This looks to me like a FLOGI and/or PLOGI issue.

Check that the device is registered in the Name Server, and that Device Probing is enabled if the port is not set as an N_Port.

TechHelp24
Contributor
Posts: 20
Registered: ‎08-17-2011

Re: ISL port problem and fabric connection problem

Hi ,

Thanks for your answer.

First of all, our internal SAN capability group has already replaced the faulty SFP, and the port is now up and running.

:admin> switchshow | grep " 7 " | grep " 0 "
64    7    0   574000   id    N4   Online           E-Port  50:00:51:ee:7e:6b:0e:57 "fcr_fd_160" (Trunk master)

With nsshow I don't find port index 64:

:admin> nsshow

{
Type Pid    COS     PortName                NodeName                 TTL(sec)
.
.
.
.
N    573f00;      3;50:06:0e:80:04:f2:e9:01;50:06:0e:80:04:f2:e9:01; na
    FC4s: FCP
    Fabric Port Name: 20:3f:00:05:1e:36:2e:14
    Permanent Port Name: 50:06:0e:80:04:f2:e9:01
    Port Index: 63
    Share Area: No
    Device Shared in Other AD: No
    Redirect: No
N    574100;      3;50:06:0e:80:04:5c:3e:02;50:06:0e:80:04:5c:3e:02; na
    FC4s: FCP
    Fabric Port Name: 20:41:00:05:1e:36:2e:14
    Permanent Port Name: 50:06:0e:80:04:5c:3e:02
    Port Index: 65
    Share Area: No
    Device Shared in Other AD: No
    Redirect: No

.
.
.
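As a side note on reading the output above: a Fibre Channel address (the Pid column) is 24 bits, conventionally split into domain, area, and port bytes. A small sketch, with the hypothetical helper name `decode_pid`, decodes the three PIDs that appear in this post. In this particular output the area byte happens to coincide with the Port Index; that is an observation about these lines, not a general rule. Port index 64 is the E_Port from switchshow, and since the name server lists devices that have logged in to the fabric, an ISL port would not be expected to show up there.

```python
def decode_pid(pid_hex):
    """Split a 24-bit Fibre Channel address into (domain, area, port)."""
    value = int(pid_hex, 16)
    domain = (value >> 16) & 0xFF
    area = (value >> 8) & 0xFF
    port = value & 0xFF
    return domain, area, port

# PIDs taken from the switchshow and nsshow output in this post.
for pid in ("573f00", "574000", "574100"):
    domain, area, port = decode_pid(pid)
    print(f"{pid}: domain={domain} area={area} port={port}")
```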

As for Device Probing, here is the information:

:admin> fcplogshow | more      
Time Stamp   Event          Port  file&lineno arg0     arg1     arg2     arg3     arg4   
======================================================================================
20:35:08.301 FlshOrProbe    64    1  642      2       : 0       : 0        : 0         : 0
20:35:08.301 ProbeFlsh      64    1  3225     0        : 0       : 0        : 0         : 0

:admin> fcpprobeshow 7/0

        port 64 is not an FL_Port or an F_Port

admin> portshow 7/0
portName:
portHealth: HEALTHY

Authentication: None
portDisableReason: None
portCFlags: 0x1
portFlags: 0x10000903    PRESENT ACTIVE E_PORT T_PORT T_MASTER G_PORT U_PORT LOGICAL_ONLINE LOGIN
portType:  10.0
portState: 1    Online  
portPhys:  6    In_Sync 
portScn:   16   E_Port    Trunk master port

.
.
.
.

Please let me know if you need any other info.

Thanks again,

Max

External Moderator
Posts: 4,974
Registered: ‎02-23-2004

Re: ISL port problem and fabric connection problem

Max,

Sorry, but I don't see how your last post relates to the original question.

So, what is the actual problem?

TechHelp24
Contributor
Posts: 20
Registered: ‎08-17-2011

Re: ISL port problem and fabric connection problem

Hi,

Thanks for your answer.

Now the ISL port is new, but the SCSI timeouts on the Solaris server keep going on.

An internal HP note says that when an ISL port fails, the port starts flipping and creates problems on the SCSI channels for Solaris.

Honestly, I had never heard of this, and I don't really know whether the broken ISL is THE REAL problem.

I posted on the forum to find out whether that HP note is true or not, and whether there is a relationship between SCSI timeouts on Solaris servers and a broken ISL port on a switch, or a broken port in general.

My switch is now in a marginal state because another port is marginal; that port is NOT an ISL but a "simple" F_Port for another IBM host.

Please let me know if you need any other info.

Valued Contributor
Posts: 931
Registered: ‎12-30-2009

Re: ISL port problem and fabric connection problem

Just keeping it simple: if you disable the port the messages disappear, and once it is re-enabled the messages reappear. It looks like the fault lies in the ISL, the ports, or the SFPs used.

Have you tried moving the ISL to a different port to exclude the physical layer?

External Moderator
Posts: 4,974
Registered: ‎02-23-2004

Re: ISL port problem and fabric connection problem

I'm sorry, but this is all a bit confusing to me.

--->>> now ISL port is new, but scsi timeout on solaris server keep going on...

What does an ISL (Inter-Switch Link) have to do with Solaris / server SCSI errors?

--->>> i've tried to write on forum for knowing if that HP note is true or not, and if there is a relationship between SCSI timeout on solaris servers and a ISL broken port on switch, or a broken port in general

I don't see any relationship between the two

......but this is my opinion

TechHelp24
Contributor
Posts: 20
Registered: ‎08-17-2011

Re: ISL port problem and fabric connection problem

Hi Dion,

Thanks for your answer.

Yes, that test has already been done. When we changed the SFP and reconnected the ISL link, the SCSI timeouts almost completely went away.

But I don't know the relationship between a broken ISL SFP and SCSI timeouts on a Solaris host.

HP Italian support says there is an internal (HP and Brocade) note that reports this problem: "when an ISL SFP goes into fault, that SFP creates a flipping on the entire switch and creates SCSI problems on Solaris hosts".

In my opinion that "internal" note is very strange... I have never heard of such a thing... but anyway...

Contributor
Posts: 20
Registered: ‎08-17-2011

Re: ISL port problem and fabric connection problem

I agree with you.

In fact, I have only reported this note from HP support.

I'm forwarding you my answer to Dion:

HP Italian support says there is an internal (HP and Brocade) note that reports this problem: "when an ISL SFP goes into fault, that SFP creates a flipping on the entire switch and creates SCSI problems on Solaris hosts".

In my opinion that "internal" note is very strange... I have never heard of such a thing... but anyway...

Valued Contributor
Posts: 931
Registered: ‎12-30-2009

Re: ISL port problem and fabric connection problem

Well, it could be explained as follows: an unstable link that carries some of the I/O would force the host to retransmit the lost frames. When excessive retransmission is needed, the host eventually reports a SCSI timeout, because the SCSI layer isn't getting a response back within a reasonable/configured time period.
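A back-of-the-envelope sketch of that explanation, under simplifying assumptions that are not from this thread: frames crossing the faulty link are lost independently with a fixed probability, and an I/O that loses any of its frames stalls until the SCSI timer expires. The function names and the numbers in the example are hypothetical and chosen only for illustration.

```python
def io_failure_probability(frame_loss_rate, frames_per_io):
    """Probability that at least one frame of an I/O is lost on the link."""
    return 1.0 - (1.0 - frame_loss_rate) ** frames_per_io

def expected_timeouts(frame_loss_rate, frames_per_io, io_count):
    """Expected number of stalled I/Os out of io_count crossing the link."""
    return io_count * io_failure_probability(frame_loss_rate, frames_per_io)

# Hypothetical figures: 0.1% frame loss, 32 frames per I/O, 100,000 I/Os.
loss_rate = 0.001
frames = 32
ios = 100_000

print(f"per-I/O failure probability: {io_failure_probability(loss_rate, frames):.4f}")
print(f"expected stalled I/Os: {expected_timeouts(loss_rate, frames, ios):.0f}")
```

The point of the model is only that a small per-frame loss rate on a marginal SFP is multiplied by the number of frames per I/O and by the I/O volume, which would be consistent with thousands of timeout messages on hosts whose traffic crosses that ISL.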
