06-25-2014 10:25 AM
Hi, a hosting provider (not identified here) gave me this explanation for an outage. I am curious about your thoughts on it.
Early yesterday morning, one of the core switches experienced a packet-flood event that impacted traffic flow in one of our data centers. Due largely in part to the network's redundant and multiple-path-option connections (designed specifically to mitigate a number of device failures), the problem was able to cascade beyond the single device that was the flood's origin. Once the offending interface was identified, it was disconnected from the core and traffic normalized. It must be well understood that this event was an anomaly, and not the result of a misconfiguration, poor design, or any type of human intervention or error.
06-26-2014 05:43 AM
Sorry, the message is too generic to say. That could mean anything (from a broadcast storm or loop at L2 to a DDoS that was bigger than the mitigation). The best you can do is ask for a root cause incident report.
06-26-2014 06:29 AM
Thank you for the reply. To me it is a very poor RCA, and we have had many issues with the provider. The only additional info I was able to get was that the switch was a Foundry BigIron 8000. I suspect that is way past support, so that is all I have to fight them on.
06-26-2014 06:39 AM
You are correct, the BigIron 8000 was around in about 2004 I think, and would not have had code updates in a very long time. Not a Brocade issue so much as an ISP running way past the ROI on the device, but this is only a guess :)
06-26-2014 07:37 AM
From what I can find, it came out in 1998. I am not sure how long it was sold or when it went EOL. The code level was V 8, which I think was 2007. I brought up your suggestion regarding independent RCA and they seemed to like it. If this was a DDoS, we need to know. If it was because the vendor is using outdated equipment, we need to deal with that.