Fibre Channel (SAN)

Occasional Contributor
Posts: 11
Registered: ‎10-08-2014
Accepted Solution

HAM-1004 Processor reboot - trying to determine root cause

Hi,

I've got a customer with a fairly old SAN setup and a couple of HP-branded Silkworm 300 switches. They are running FOS 6.4.2b and had been plodding along with no issues until recently. In the last couple of months one of them has begun to reboot spontaneously. I've been seeing the following in the errdump output:

 

2017/04/13-22:36:56, [HAM-1004], 1653, CHASSIS, INFO, Brocade300, Processor rebooted - Reset

 

This has occurred about 4 times in the last 8 weeks.
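
For anyone wanting to line the reboots up against other events (power, syslog, console captures), here is a minimal sketch in Python that pulls the timestamps of a given message ID out of saved errdump text. It assumes every entry follows the seven comma-separated fields seen in the line above; the function names are made up for illustration and this is not a Brocade tool.

```python
from datetime import datetime

def parse_raslog_line(line):
    """Split one errdump-style line into fields; return None if it does not fit."""
    parts = [p.strip() for p in line.split(",", 6)]
    if len(parts) != 7:
        return None
    timestamp, msg_id, seq, scope, severity, switch, text = parts
    try:
        return {
            "time": datetime.strptime(timestamp, "%Y/%m/%d-%H:%M:%S"),
            "id": msg_id.strip("[]"),
            "seq": int(seq),
            "scope": scope,
            "severity": severity,
            "switch": switch,
            "text": text,
        }
    except ValueError:
        # Not a log entry in the assumed layout (header, wrapped line, etc.)
        return None

def reboot_times(lines, msg_id="HAM-1004"):
    """Return the timestamps of every entry carrying the given message ID."""
    events = (parse_raslog_line(line) for line in lines)
    return [e["time"] for e in events if e and e["id"] == msg_id]

sample = [
    "2017/04/13-22:36:56, [HAM-1004], 1653, CHASSIS, INFO, Brocade300, Processor rebooted - Reset",
]
for t in reboot_times(sample):
    print(t.isoformat())
```

Feeding it the full errdump text, one line per list element, gives a list of reboot times that can be compared against UPS or PDU logs.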

 

I've looked through a supportsave to see if there are any alerts, as I was wondering whether there were thermal issues, but all the switch components are reporting temps as nominal. I'm just looking for any advice on how to work back and perhaps determine the root cause of the reboot if it's not a physical issue.


I guess it could be a firmware issue... I am aware this is old firmware. Anyway, if anyone has any pointers about where to look I'd be grateful. These switches are obviously out of warranty now, but we do have hardware support from a 3rd party supplier, so I could get the switch swapped out and the licence transferred. Or I could update the firmware.

 

Advice welcome

 

thanks

Adam

Brocade Moderator
Posts: 251
Registered: ‎08-31-2009

Re: HAM-1004 Processor reboot - trying to determine root cause

Hello,

 

It is very difficult to determine the cause from this message alone. The general recommendation in this kind of case is to upgrade the switch to the latest available firmware level, to rule out firmware as a potential cause, since the current firmware is old.

 

Any and all information provided by me is not reviewed, approved or endorsed by Brocade and is provided solely as a convenience for Brocade customers. All systems and all networks are different and unique. If you have a service affecting network problem, please open a TAC service request for service through Brocade, or through your OEM equipment provider. If this provided you with a solution to this issue, please mark it with the button at the bottom "Accept as solution"
Brocade Moderator
Posts: 185
Registered: ‎03-29-2011

Re: HAM-1004 Processor reboot - trying to determine root cause

Make sure you have a serial console connection and that you are logging the console output, and that a syslog server is configured and set up. Also check for power loss. Otherwise I concur with Thierry: update to the last supported firmware, following the target path.


Occasional Contributor
Posts: 11
Registered: ‎10-08-2014

Re: HAM-1004 Processor reboot - trying to determine root cause

[ Edited ]

thanks guys, I'll see if I can work out the upgrade path to a more recent FOS 7.x release.

 

Hopefully this will resolve the issues.

 

thank you.

 

Addendum: according to this (http://community.brocade.com/t5/Fibre-Channel-SAN/Brocade-Fabric-OS-Target-Path-Technical-Brief/ta-p/63946) and the latest target path selection guide, I should be able to go from 6.4.2b via the following route:

 

FOS v6.4.2a → FOS v7.0.2c/d/e → FOS v7.1.1a/b/c* → FOS 7.2.1g → FOS 7.3.1d/e → FOS 7.4.1e

 

I assume this would be a non disruptive update?

 

Are there any steps I can skip if I am able to tolerate a reboot?

 

thanks

 

 

 

Brocade Moderator
Posts: 185
Registered: ‎03-29-2011

Re: HAM-1004 Processor reboot - trying to determine root cause

Correct. The path below would be non-disruptive:

 

FOS v6.4.2a → FOS v7.0.2c/d/e → FOS v7.1.1a/b/c* → FOS 7.2.1g → FOS 7.3.1d/e → FOS 7.4.1e

 

The latest Target Path, which was released at the end of last week, is at

 

https://www.brocade.com/content/dam/common/documents/content-types/target-path-selection-guide/brocade-fos-target-path.pdf

 

Note this section of the FOS 7.4.1e release notes, "Migrating from FOS v7.2":

Any 8G or 16G platform operating at FOS v7.2.x must be upgraded to FOS v7.3.x before upgrading to FOS v7.4.1e.

• Disruptive upgrade to FOS v7.4.1e from FOS v7.2 is not supported

 

If you can tolerate disruptive upgrades, you can jump two release levels in a single step, for example skipping the 7.2 upgrade:

 

FOS v6.4.2a → FOS v7.0.2c/d/e → FOS v7.1.1a/b/c*  → FOS 7.3.1d/e → FOS 7.4.1e

 

The 7.3.1e release notes say:

 

Disruptive upgrade to FOS v7.3.1e from FOS v7.1 is supported.
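
As a rough way to sanity-check a planned path before starting, the sketch below labels each hop by how many release families it crosses. It encodes only the simplified rule of thumb from this thread (one family per hop was confirmed as non-disruptive; a two-family jump is disruptive at best and is only allowed where the target's release notes say so, e.g. 7.1 to 7.3.1e is, 7.2 to 7.4.1e is not). The `FAMILIES` list and function names are illustrative assumptions, not anything taken from FOS itself.

```python
# Release families in the order they appear in the paths discussed above.
FAMILIES = ["6.4", "7.0", "7.1", "7.2", "7.3", "7.4"]

def family(version):
    """Reduce a version string such as 'v7.0.2c' to its family, '7.0'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return f"{major}.{minor}"

def classify_hops(path):
    """Label each consecutive hop in an upgrade path by the families it crosses."""
    hops = []
    for src, dst in zip(path, path[1:]):
        gap = FAMILIES.index(family(dst)) - FAMILIES.index(family(src))
        if gap == 1:
            kind = "non-disruptive"
        elif gap == 2:
            # Only valid if the target release notes explicitly allow it.
            kind = "needs release-note check"
        else:
            kind = "not covered by this sketch"
        hops.append((src, dst, kind))
    return hops

path = ["v6.4.2b", "v7.0.2e", "v7.1.2b", "v7.3.1e", "v7.4.1e"]
for src, dst, kind in classify_hops(path):
    print(f"{src} -> {dst}: {kind}")
```

The release notes for each target version remain the authority; this only flags which hops need a closer look.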

 


Occasional Contributor
Posts: 11
Registered: ‎10-08-2014

Re: HAM-1004 Processor reboot - trying to determine root cause

Thanks for all the info. I've got one more question: I'm struggling to locate FOS 7.1.1x anywhere on the hpe.com web sites...

 

Would it be ok to go the following route?

 

FOS v6.4.2b → FOS v7.0.2c/d/e → FOS v7.1.2b* → FOS 7.2.1g → FOS 7.3.1d/e → FOS 7.4.1e

 

 

thanks

 

 

Brocade Moderator
Posts: 251
Registered: ‎08-31-2009

Re: HAM-1004 Processor reboot - trying to determine root cause

Hello,

 

Yes, that will be fine.

Occasional Contributor
Posts: 11
Registered: ‎10-08-2014

Re: HAM-1004 Processor reboot - trying to determine root cause

[ Edited ]

Thanks for all the advice, guys. I was able to successfully uplift both switches to 7.4.1e. Sadly, though, this does not seem to have fixed my potential hardware issues on one of the switches.

 

I now suddenly have multiple faults as of this morning.

 

Index Port Address Media Speed State     Proto
==================================================
    0    0 0a1700  id    8G    Online    FC  F-Port  50:06:0e:80:10:4d:84:e0
    1    1 0a1500  id    8G    No_Sync   FC
    2    2 0a1300  id    N8    Hard_Flt  FC
    3    3 0a1100  id    N8    Online    FC  F-Port  10:00:00:05:1e:fb:42:19
    4    4 0a1600  id    N8    Hard_Flt  FC
    5    5 0a1400  id    N8    Online    FC  F-Port  10:00:00:05:1e:fb:3f:8c
    6    6 0a1200  id    N8    Hard_Flt  FC
    7    7 0a1000  id    N8    Online    FC  F-Port  50:01:43:80:03:30:34:c2
    8    8 0a0f00  id    N8    Online    FC  F-Port  10:00:00:05:1e:fb:35:a4
    9    9 0a0d00  id    4G    Online    FC  L-Port  1 public
   10   10 0a0b00  id    N4    In_Sync   FC
   11   11 0a0900  id    N8    In_Sync   FC
   12   12 0a0e00  id    N4    Online    FC  F-Port  21:78:00:c0:ff:d7:21:22
   13   13 0a0c00  id    N8    Online    FC  F-Port  50:01:43:80:02:51:93:7c
   14   14 0a0a00  id    N8    Online    FC  F-Port  50:01:43:80:02:51:94:50
   15   15 0a0800  id    N8    In_Sync   FC
   16   16 0a0700  id    N8    Online    FC  L-Port
   17   17 0a0500  id    N8    In_Sync   FC
   18   18 0a0300  id    N8    Online    FC  F-Port  10:00:8c:7c:ff:21:07:fa
   19   19 0a0100  id    N8    Online    FC  F-Port  10:00:8c:7c:ff:20:eb:6e
   20   20 0a0600  id    N8    Online    FC  F-Port  10:00:8c:7c:ff:21:07:40
   21   21 0a0400  id    N8    No_Light  FC
   22   22 0a0200  --    N8    No_Module FC

 

All ports 0-20 were fine post update but this switch had had a few spontaneous reboots prior to the firmware update. I am guessing these all now point toward hardware issues?
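
To compare a listing like the one above against a capture taken right after the upgrade, it can help to group ports by state. The following is a small sketch that assumes the whitespace-separated column order shown (Index, Port, Address, Media, Speed, State, Proto); the helper name is hypothetical.

```python
from collections import defaultdict

def ports_by_state(switchshow_lines):
    """Group port numbers by the State column of a switchshow-style listing."""
    groups = defaultdict(list)
    for line in switchshow_lines:
        fields = line.split()
        # Port rows start with two integers (index, port); skip headers and rulers.
        if len(fields) < 7 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        port, state = int(fields[1]), fields[5]
        groups[state].append(port)
    return dict(groups)

sample = [
    "Index Port Address Media Speed State Proto",
    "==================================================",
    "0 0 0a1700 id 8G Online FC F-Port 50:06:0e:80:10:4d:84:e0",
    "1 1 0a1500 id 8G No_Sync FC",
    "2 2 0a1300 id N8 Hard_Flt FC",
    "4 4 0a1600 id N8 Hard_Flt FC",
    "10 10 0a0b00 id N4 In_Sync FC",
]
for state, ports in ports_by_state(sample).items():
    print(state, ports)
```

Running it on two captures and diffing the results shows which ports changed state between them.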

 

The fabriclog appears to be showing some port flapping, but these devices were fine previously, so I am wondering if it's actually the switch which is at fault.

 

08:46:28.534534 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:28.709195 SCN LR_PORT(0);g=0x5df0 D0,P0 D0,P0 13 NA
08:46:28.709253 SCN Port Online; g=0x5df0,isolated=0 D0,P0 D0,P1 13 NA
08:46:28.709423 Port Elp engaged D0,P1 D0,P0 13 NA
08:46:28.709500 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:28.709643 SCN Port F_PORT D0,P1 D0,P0 13 NA
08:46:30.284077 SCN Port Offline;g=0x5df2 D0,P0 D0,P0 15 NA
08:46:30.284096 *Removing all nodes from port D0,P0 D0,P0 15 NA
08:46:30.898566 *Removing all nodes from port D0,P0 D0,P0 5 NA
08:46:30.898708 SCN Port F_PORT D0,P0 D0,P0 5 NA
08:46:31.333173 SCN Port Offline;g=0x5df4 D0,P0 D0,P0 11 NA
08:46:31.333193 *Removing all nodes from port D0,P0 D0,P0 11 NA
08:46:32.593991 SCN Port Offline;g=0x5df6 D0,P0 D0,P0 13 NA
08:46:32.594009 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:32.845692 SCN LR_PORT(0);g=0x5df6 D0,P0 D0,P0 13 NA
08:46:32.845741 SCN Port Online; g=0x5df6,isolated=0 D0,P0 D0,P1 13 NA
08:46:32.845913 Port Elp engaged D0,P1 D0,P0 13 NA
08:46:32.845993 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:32.846137 SCN Port F_PORT D0,P1 D0,P0 13 NA
08:46:33.282719 SCN Port Offline;g=0x5df8 D0,P0 D0,P0 15 NA
08:46:33.282737 *Removing all nodes from port D0,P0 D0,P0 15 NA
08:46:34.334477 SCN Port Offline;g=0x5dfa D0,P0 D0,P0 11 NA
08:46:34.334498 *Removing all nodes from port D0,P0 D0,P0 11 NA
08:46:35.857793 SCN Port Offline;g=0x5dfc D0,P0 D0,P0 14 NA
08:46:35.857812 *Removing all nodes from port D0,P0 D0,P0 14 NA
08:46:36.111614 SCN LR_PORT(0);g=0x5dfc D0,P0 D0,P0 14 NA
08:46:36.111661 SCN Port Online; g=0x5dfc,isolated=0 D0,P0 D0,P1 14 NA
08:46:36.111835 Port Elp engaged D0,P1 D0,P0 14 NA
08:46:36.111915 *Removing all nodes from port D0,P0 D0,P0 14 NA
08:46:36.112060 SCN Port F_PORT D0,P1 D0,P0 14 NA
08:46:36.290822 SCN Port Offline;g=0x5dfe D0,P0 D0,P0 15 NA
08:46:36.290842 *Removing all nodes from port D0,P0 D0,P0 15 NA
08:46:36.653851 SCN Port Offline;g=0x5e00 D0,P0 D0,P0 13 NA
08:46:36.653872 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:36.868517 SCN LR_PORT(0);g=0x5e00 D0,P0 D0,P0 13 NA
08:46:36.868572 SCN Port Online; g=0x5e00,isolated=0 D0,P0 D0,P1 13 NA
08:46:36.868744 Port Elp engaged D0,P1 D0,P0 13 NA
08:46:36.868823 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:36.868966 SCN Port F_PORT D0,P1 D0,P0 13 NA
08:46:37.335304 SCN Port Offline;g=0x5e02 D0,P0 D0,P0 11 NA
08:46:37.335323 *Removing all nodes from port D0,P0 D0,P0 11 NA
08:46:39.290901 SCN Port Offline;g=0x5e04 D0,P0 D0,P0 15 NA
08:46:39.290920 *Removing all nodes from port D0,P0 D0,P0 15 NA
08:46:40.342782 SCN Port Offline;g=0x5e06 D0,P0 D0,P0 11 NA
08:46:40.342801 *Removing all nodes from port D0,P0 D0,P0 11 NA
08:46:40.712943 SCN Port Offline;g=0x5e08 D0,P0 D0,P0 13 NA
08:46:40.712962 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:40.929195 SCN LR_PORT(0);g=0x5e08 D0,P0 D0,P0 13 NA
08:46:40.929251 SCN Port Online; g=0x5e08,isolated=0 D0,P0 D0,P1 13 NA
08:46:40.929420 Port Elp engaged D0,P1 D0,P0 13 NA
08:46:40.929499 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:40.929643 SCN Port F_PORT D0,P1 D0,P0 13 NA
08:46:42.293870 SCN Port Offline;g=0x5e0a D0,P0 D0,P0 15 NA
08:46:42.293890 *Removing all nodes from port D0,P0 D0,P0 15 NA
08:46:42.489625 *Removing all nodes from port D0,P0 D0,P0 3 NA
08:46:42.489772 SCN Port F_PORT D0,P0 D0,P0 3 NA
08:46:43.347249 SCN Port Offline;g=0x5e0c D0,P0 D0,P0 11 NA
08:46:43.347268 *Removing all nodes from port D0,P0 D0,P0 11 NA
08:46:43.986892 *Removing all nodes from port D0,P0 D0,P0 19 NA
08:46:43.988040 SCN Port F_PORT D0,P0 D0,P0 19 NA
08:46:44.211151 *Removing all nodes from port D0,P0 D0,P0 18 NA
08:46:44.211298 SCN Port F_PORT D0,P0 D0,P0 18 NA
08:46:44.771924 SCN Port Offline;g=0x5e0e D0,P0 D0,P0 13 NA
08:46:44.771943 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:45.119880 SCN LR_PORT(0);g=0x5e0e D0,P0 D0,P0 13 NA
08:46:45.119929 SCN Port Online; g=0x5e0e,isolated=0 D0,P0 D0,P1 13 NA
08:46:45.120102 Port Elp engaged D0,P1 D0,P0 13 NA
08:46:45.120179 *Removing all nodes from port D0,P0 D0,P0 13 NA
08:46:45.120324 SCN Port F_PORT D0,P1 D0,P0 13 NA
08:46:45.293326 SCN Port Offline;g=0x5e10 D0,P0 D0,P0 15 NA
08:46:45.293346 *Removing all nodes from port D0,P0 D0,P0 15 NA
08:46:46.347961 SCN Port Offline;g=0x5e12 D0,P0 D0,P0 11 NA
08:46:46.347980 *Removing all nodes from port D0,P0 D0,P0 11 NA
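
One way to see which ports are flapping in an excerpt like the one above is to count the "Port Offline" state changes per port. This is a minimal sketch that assumes the port number is the second-to-last field on each line (just before "NA"), as it appears in this excerpt; the function name is made up for illustration.

```python
from collections import Counter

def offline_counts(fabriclog_lines):
    """Count 'Port Offline' state-change notifications per port number."""
    counts = Counter()
    for line in fabriclog_lines:
        if "Port Offline" not in line:
            continue
        fields = line.split()
        # Assumed layout: the port number is the second-to-last field, before "NA".
        if len(fields) >= 2 and fields[-2].isdigit():
            counts[int(fields[-2])] += 1
    return counts

sample = [
    "08:46:30.284077 SCN Port Offline;g=0x5df2 D0,P0 D0,P0 15 NA",
    "08:46:30.284096 *Removing all nodes from port D0,P0 D0,P0 15 NA",
    "08:46:31.333173 SCN Port Offline;g=0x5df4 D0,P0 D0,P0 11 NA",
    "08:46:32.593991 SCN Port Offline;g=0x5df6 D0,P0 D0,P0 13 NA",
    "08:46:33.282719 SCN Port Offline;g=0x5df8 D0,P0 D0,P0 15 NA",
]
for port, n in offline_counts(sample).most_common():
    print(f"port {port}: {n} offline events")
```

If the same few ports dominate the counts across captures, swapping their SFPs or cables to a known-good port is a quick way to separate a device or optic problem from a switch problem.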

 

 

If I want to replace the chassis, I assume I'll need to arrange a licence transfer of the existing licences, which I am guessing will be a vendor-specific process (in my case I believe these are HP-branded switches).

 

So I am guessing you'll say replace the switch but thought it was worth asking.

 

thanks

Adam.

 

 
