06-14-2009 04:14 AM
I have had 2 x Brocade 5300 switches fail because they cannot 'see' their firmware at all. This one has died in a fashion similar to another case with a Brocade 12000 that I saw on this community.
The switch starts to boot and then cannot find /dev/hda2 or /dev/hda1.
Output is as follows:
The system is coming up, please wait...
U-Boot 1.1.3 (Jan 27 2009 - 11:10:05)
CPU: 8548_E, Version: 2.0, (0x80390020)
Core: E500, Version: 2.0, (0x80210020)
CPU:1199 MHz, CCB: 399 MHz,
DDR: 199 MHz, LBC: 49 MHz
L1: D-cache 32 kB enabled
I-cache 32 kB enabled
CPU Board Revision 255.198 (0xffc6)
DRAM: Initializing DDRSDRAM
memsize = 400
DDR: 1024 MB
Now running in RAM - U-Boot at: 3fb8e000
trap_init : 0x0
system inventory subsystem initialized
FLASH: 4 MB
L2 cache 512KB: enabled
ATA interface setup: io_base=0xf8f00000, port=0x1f0, ctl=0x3f6
Skip our host bridge
00 11 1657 0011 0280 1a
00 12 1657 0011 0280 1a
00 13 1657 0011 0280 1a
01 01 1657 0011 0280 1a
01 02 1657 0011 0280 1a
01 03 1657 0011 0280 1a
01 04 1657 0011 0280 32
00 14 3388 0022 0604 00
02 01 1657 0011 0280 1a
02 02 1657 0011 0280 1a
02 03 1131 1561 0c03 1a
02 03 1131 1562 0c03 1a
00 15 3388 0022 0604 00
ENET0: PHY is Broadcom BCM5241 10/100 BaseT PHY (143bc31)
ENET1: PHY is not applicable
ENET2: PHY is not applicable
ENET3: PHY is not applicable
Checking system RAM - press any key to stop test
Checking memory address: 00100000
System RAM test using Default POST RAM Test succeeded.
set_bootstatus: BS_LOAD_OS, platform_idx = 6
Hit ESC to stop autoboot: 4 3 2 1 0
Map file at LBA sector 0x1bba40
## Booting image at 00400000 ...
Image Name: Linux-220.127.116.11
Image Type: PowerPC Linux Multi-File Image (gzip compressed)
Data Size: 2890663 Bytes = 2.8 MB
Load Address: 00000000
Entry Point: 00000000
Image 0: 1814145 Bytes = 1.7 MB
Image 1: 1076503 Bytes = 1 MB
Uncompressing Multi-File Image ... ## Current stack ends at 0x3FB6CBC0 => set upper limit to 0x00800000
## initrd at 0x005BAED0 ... 0x006C1BE6 (len=1076503=0x106D17)
Loading Ramdisk to 1fef9000, end 1ffffd17 ... OK
initrd_start = 1fef9000, initrd_end = 1ffffd17
## Transferring control to Linux (at address 00000000) ...
mpc85xx_setup: Doing Pcie bridge setup
SILKWORM_HWSEM: This BD 64 is not supported
PCI: Cannot allocate resource region 2 of PCI bridge 1
PCI: Cannot allocate resource region 2 of PCI bridge 2
Installing Linux 2.6 Kernel
Attempting to find a root file system on hda2...
mount: Mounting /dev/hda2 on /mnt failed: No such device or address
Failed to mount hda2 as root!
Attempting to find a root file system on hda1...
Failed to mount a root file system!
Both boot devices are inconsistent. You can run sbin/repairfs.sh to boot up the system. After the system is booted up, please run firmwaredownload to load the appropriate version of firmware.
BusyBox v0.60.5 (2006.06.20-01:11+0000) Built-in shell (ash)
Enter 'help' for a list of built-in commands.
sh: can't access tty; job control turned off
. : alias bg break builtin cd chdir continue eval exec exit export
false fc fg hash help jobs kill let local pwd read readonly return
set setvar shift times trap true type ulimit umask unalias unset
!!!The following is me trying the command above to fix the filesystem - and failing!!!
bin etc lib mnt root tmp var
dev fabos linuxrc proc sbin usr
# cd sbin
bootenv insmod rmmod
check_xfs modprobe route
e2fsck pivot_root syschk_add_badroot
halt reboot syschk_find_shutdown
mount: Mounting /dev/hda1 on /mnt failed: No such device or address
./repairfs.sh: /mnt/usr/bin/reboot: No such file or directory
# cd /
bootenv: Could not remove requested variable BadRootDev.
mount: Mounting /dev/hda1 on /mnt failed: No such device or address
sbin/repairfs.sh: /mnt/usr/bin/reboot: No such file or directory
As you can see, I have tried sbin/repairfs.sh and it fails to find /dev/hda1.
This is the 3rd Brocade 5300 that we have seen fail with firmware issues, so there is clearly a problem with them.
Does anyone have a fix for this, or is this a swap-out job?
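For anyone comparing their own console capture against the output above, here is a minimal sketch in Python that scans a saved log for the same failure lines. It only matches the message strings quoted in this post; the function name `summarize_boot_log` and the idea of saving the serial console output as text are my own assumptions, not a Brocade tool.

```python
import re

# Message patterns copied from the console output quoted in this post.
MOUNT_FAIL = re.compile(r"mount: Mounting (/dev/\w+) on (\S+) failed: (.+)")
NO_ROOT = "Failed to mount a root file system!"
INCONSISTENT = "Both boot devices are inconsistent"


def summarize_boot_log(text):
    """Return a dict describing root file system failures in a console capture.

    Hypothetical helper: it only looks for the strings seen in this thread.
    """
    failed = []
    for line in text.splitlines():
        match = MOUNT_FAIL.search(line)
        # Record each device once, in the order it first failed.
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
    return {
        "failed_devices": failed,
        "no_root_fs": NO_ROOT in text,
        "both_inconsistent": INCONSISTENT in text,
    }


if __name__ == "__main__":
    sample = (
        "Attempting to find a root file system on hda2...\n"
        "mount: Mounting /dev/hda2 on /mnt failed: No such device or address\n"
        "Failed to mount hda2 as root!\n"
        "Attempting to find a root file system on hda1...\n"
        "Failed to mount a root file system!\n"
        "Both boot devices are inconsistent.\n"
    )
    print(summarize_boot_log(sample))
```

If both partitions show up as failed and the "inconsistent" message is present, the log matches the symptom described here; it says nothing about the underlying cause.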
06-14-2009 05:28 AM
It is very suspicious that 3 brand-new 5300s crashed at the same time.
I know this error. It is most often caused by an incorrect firmware upgrade, or by a power-off during the upgrade for some reason, e.g. a loss of power or a forced power cycle by the customer before the upgrade completed.
06-15-2009 01:46 AM
I agree, 3 x 5300 crashing at the same time would indicate that I have screwed up somewhere, but luckily for me that did not happen:
In April 2009 we did a site move. We moved 2 x 5300 switches from the Production datacentre to a new location, and on power up one of them didn't work. They had both been firmware upgraded in the same manner, using the same code and the firmwaredownload command, back in February 2009.
June 13th 2009 11:00: we powered down another set of 5300s in the DR site during a maintenance window. On power up one of the switches did not work (see my other 'Brocade Fails to Boot - Bad Magic Number' entry).
June 13th 2009 14:00: we had replaced the broken 5300 and now wanted to firmware update all 8 switches in our SAN (4 x 5300 & 4 x 4100) from 6.2.0b to v6.2.0g using Brocade's DCFM. We did the Blue Fabric first (2 x 5300 & 2 x 4100) and this was a success.
Next we firmware updated the Red Fabric (2 x 5300 & 2 x 4100). All went OK except for one of the 5300 switches, which never got past the firmware download process. It stopped responding and could not be pinged.
When we got to the data centre to investigate, the switch was behaving as described in my posting.
So... how come the Blue fabric upgraded fine, but a switch in the Red Fabric failed - spectacularly?
How come I have had two switches on two separate occasions fail to power up?
How come I have had a switch fail to survive a firmware update?
I've been working with Brocade equipment for nearly 7 years, I have upgraded 100's of switches and I have never seen this before.
I know all about how a firmware update works, I know about the firmware commit process, I have been on the Brocade courses.
I'm thinking that the 5300s have a problem with their flash memory.
06-15-2009 09:33 AM
---I agree, 3 x 5300 crashing at the same time would indicate that I have screwed up somewhere - but luckily for me that did not happen:
I did not say in my last reply that you screwed up the switches.
--- So... how come the Blue fabric upgraded fine, but a switch in the Red Fabric failed
This is a good question, and there can be various reasons for it.
---I've been working with Brocade equipment for nearly 7 years, I have upgraded 100's of switches
I have worked with Brocade equipment for 9-10 years and I am sure I have good experience with this hardware. I have done thousands of upgrades, also in large fabrics, and never had any impact.
Ask Brocade yourself: I have never created an RMA, asked for support or anything similar. I am sure Brocade can confirm this.
I have here 2 brand NEW! Brocade 48000 CPs destroyed by some "Brocade Certified Super Guru Admin" who thought the switch must be powered down after the first reboot. One week later this person was unemployed.
100 or 1000 installations mean nothing; it is not the quantity but the quality that is important.
I have asked several Brocade SEs about your problem, and nobody has heard of a problem with the flash card or hardware on the 5300.
Today I checked all release notes since 6.1 and cannot find any open or closed defect related to the flash.
This is my opinion.
You can only solve the problem with Brocade Support, because the corrupted FOS cannot be reloaded by the customer without certain tools.