06-08-2010 10:58 AM
We noticed a month ago (after upgrading switch firmware to 6.2.0c) that our 200E switches had high flash memory usage. Whenever we saw this, we'd run "supportsave -R", which cleared the problem. This worked fine for a while. Then, recently, we had a switch panic and reboot. We found that one of our admins was running the SAN Health utility when this happened, and no one else was logged on to the switch. So we assumed something in the utility caused flash memory usage to go so high that the switch panicked.
However, we continue to have SAN switch panics that are unrelated to the SAN Health report. The issue occurs mostly on weekends, as if some weekly internal switch process is eating up the flash memory. From the event logs, it appears that the issue starts with a terminated process (the common process is "process.0.weblinker"). Then a message about first failure data capture (FFDC) occurs. After that, the log shows memory going well above 90%, and within moments there is a "flash out of range" message and a trace dump, followed by a switch reboot.
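For anyone who wants to check their own exported logs for the same pattern, here is a rough Python sketch that looks for those four events in order. The keyword patterns are only guesses based on the wording above, not actual FOS message IDs, and `find_panic_sequence` and the 90% threshold are names and values made up for this illustration, so adjust them to whatever your logs really say.

```python
import re

# Hypothetical keyword patterns based on the wording in this post, NOT on
# real FOS message IDs; adjust them to match your actual event log text.
STAGES = [
    ("terminated process", re.compile(r"weblinker", re.I)),
    ("FFDC", re.compile(r"FFDC|failure data capture", re.I)),
    ("high usage", re.compile(r"(?:flash|memory).*?(\d{1,3})\s*%", re.I)),
    ("flash out of range", re.compile(r"flash.*out of range", re.I)),
]


def find_panic_sequence(lines, threshold=90):
    """Return (stage name, line index) pairs if every stage occurs in order.

    Returns an empty list when the full sequence is not found.
    """
    hits = []
    stage = 0
    for index, line in enumerate(lines):
        if stage == len(STAGES):
            break  # all stages already matched
        name, pattern = STAGES[stage]
        match = pattern.search(line)
        if not match:
            continue
        # Only count a usage line once it is above the threshold.
        if name == "high usage" and int(match.group(1)) <= threshold:
            continue
        hits.append((name, index))
        stage += 1
    return hits if stage == len(STAGES) else []
```

Feed it the lines of a saved log file (for example `open(path).read().splitlines()`); a non-empty result means the whole sequence showed up in that order.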
Running the supportsave command works great if you're online when this happens, but it doesn't help on the weekends. There's got to be a way to resolve this. We only have the problem on our 200E switches (the 300 switches get the same terminated process messages but aren't running high flash memory usage). We now have auto trace FTP set up to try to capture the trace dump. In the meantime, I was hoping to get some input from the experts. We can't be the only ones running into this problem. Also note that we recently moved from Fabric Manager to the DCFM Enterprise trial. Thanks!
12-23-2010 04:56 PM
Unfortunately I'm seeing this on some of my DCX and 48Ks. I have found that it is directly related to the SNMP polling done by DCFM.
I disabled the DCFM services, and only start them when I need them.
By disabling the services, I haven't had another panic since then.
I would suggest disabling DCFM for a while to see if the switch panics continue.
12-26-2010 06:55 PM
This is mostly due to memory leaks in certain parts of FOS. Especially in FOS 6.2.x and 6.3.x there were some nasty ones which would trigger FTRACE dumps to flash and some kernel panics. This in turn would fill up one or more flash partitions and the switch wouldn't boot anymore.
Read the release notes of the respective FOS versions and make every effort to get this upgraded.