04-10-2013 12:32 PM
Hi all -- am fairly new to the world of fiber and Brocade, but nevertheless am running into an odd issue that I *believe* I may have a theory on, but want to throw out to the rest of you to see if there's something else I may be missing.
We have a couple of Dell Blade Centers with Brocade switches embedded. Blades have QLogic HBAs and connect to the Brocades at 8Gbps. The Brocades uplink via two 8Gbps links each (ISL'd) to two IBM SAN24B stackable switches -- for now these are our "core" switches. One core switch is Fabric A and the other is Fabric B.
A single IBM Gen2 XIV (4Gbps ports only) hangs off the IBM SAN24B stackables and we're using WWPN-based zoning throughout the environment.
We've been having poor performance from the initiators (blades) to two of the modules on the XIV (modules 7 & 9). The others work fine. We're seeing errors on the ISL links, however -- "enc out" (which points to something physical), but it's a bit confounding how these errors only show up when talking to the two modules and not to any others. We don't see any errors on the XIV itself, nor on the ports on the SAN24B connecting to the XIV.
In addition to the "enc out" errors on the ISLs, the symptom is that I/O will slow and, in many cases, grind to a complete halt. As soon as we disable the paths to modules 7 & 9, things pick up again.
We've done a lot of trial & error troubleshooting, including replacing SFPs, to no avail. But, reading some best practices today, I ran across a potential culprit.
The cables from our XIV to the IBM SAN24B switches are 62.5/125 micron cables (OM1 I guess?). This is a very short run, so should be OK for 4Gbps. Our ISL links are all 50/125 OM2 (run is likely about 20m). From my reading, my understanding is that we should *not* be mixing 62.5 and 50 together (keep in mind that the modules that are working "fine" also have 62.5/125 cables, so that somewhat throws a kink in things).
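As a rough sanity check on the distances involved, here is a minimal Python sketch that compares each link against commonly quoted maximum multimode reach figures for short-wave Fibre Channel optics. The table values are approximate planning numbers and should be verified against the optic and switch vendor documentation; the 3 m length for the XIV run is an assumption, since the run is only described as "very short". It checks length only and says nothing about mixing 62.5 and 50 micron cable in one path.

```python
# Commonly quoted maximum multimode link lengths, in metres, for short-wave
# Fibre Channel optics. Approximate planning figures only; verify against
# vendor documentation before relying on them.
MAX_REACH_M = {
    ("OM1", 4): 70,
    ("OM1", 8): 21,
    ("OM2", 4): 150,
    ("OM2", 8): 50,
    ("OM3", 4): 380,
    ("OM3", 8): 150,
}


def within_reach(fiber, speed_gbps, run_m):
    """Return True if the run length fits within the tabulated reach."""
    return run_m <= MAX_REACH_M[(fiber, speed_gbps)]


# Links as described in the post. The 3 m XIV run is an assumed value.
links = [
    ("XIV to SAN24B", "OM1", 4, 3),
    ("blade switch ISL", "OM2", 8, 20),
]

for name, fiber, speed, run_m in links:
    status = "ok" if within_reach(fiber, speed, run_m) else "too long"
    print(f"{name}: {fiber} at {speed} Gbps over {run_m} m -> {status}")
```

Under these figures both links are within reach, which is consistent with a marginal or dirty connection rather than a pure distance problem.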
Oddly enough, I moved the cables around this morning to read their cable type, and after doing this, I am no longer able to reproduce the errors we've been experiencing. It almost seems like I either resolved something wrong with the cable by moving it, or else some other coincidental event occurred.
Does any of the above make sense? I'm not confident with this "fix", and am thinking to be safe I should just replace all cables with 50/125 OM3. Sounds like you can mix 50/125 OM2 and 50/125 OM3, but no reason to do so if I can go all OM3 IMO.
Would appreciate any thoughts or suggestions. Firmware is up to date on all of our switches FYI.
04-12-2013 08:40 AM
Greetings Ray. Actually it does make sense, and thank you for all the extra data. "enc out" errors quite often do point to "media" such as cables. If you search the community pages for "enc out", you will find a lot of supporting statements, as well as guidance on how to clear the errors and monitor for any further occurrences.
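One way to monitor for further occurrences is to clear the counters (or note their current values), wait, and then compare. Below is a minimal Python sketch of that comparison. It assumes the per-port counters have already been collected into dictionaries, for example by copying the "enc out" column from the switch's per-port error summary; the sketch does not talk to a switch, and the function name, port numbers, and sample values are all hypothetical.

```python
def ports_with_new_errors(before, after, counter="enc_out"):
    """Return {port: increase} for ports whose counter grew between snapshots."""
    grown = {}
    for port, stats in after.items():
        previous = before.get(port, {}).get(counter, 0)
        delta = stats.get(counter, 0) - previous
        if delta > 0:
            grown[port] = delta
    return grown


# Made-up snapshots keyed by port number, taken some time apart.
before = {0: {"enc_out": 1200}, 1: {"enc_out": 0}, 17: {"enc_out": 45}}
after = {0: {"enc_out": 1200}, 1: {"enc_out": 0}, 17: {"enc_out": 912}}

print(ports_with_new_errors(before, after))  # {17: 867}
```

A port whose count is high but no longer climbing (port 0 in the sample) is showing old errors; a port that keeps climbing after the cable was reseated is the one to chase.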