07-03-2012 05:00 AM
We started having problems with our servers (W2k8 R2) connected to our SAN: reboots and freezes. When I looked into it, I could see that on an affected server one of the fabric paths (disks) to the arrays isn't showing up, even though it is still zoned, enabled in the config, and the port is online. If I switch the cable to another port, the disks show up again (in Emulex OneCommand). If I remove the HBA and re-enable it, it will sometimes show the EVA, but not all the paths, only one of four.
Another problem I can see is that the ISL on the switch with the storage arrays (HP EVA) shows almost as many Time BB_Credit Zero counts as C3 frames. The ISL is configured at 8Gb/s while our servers run at 4Gb/s. I guess this is not normal.
After reading about ISL problems here, I also got a recommendation to configure the ISL at 4Gb/s instead. Could that solve some of the problems I'm having?
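In case it helps anyone reading along: on Fabric OS the credit-starvation counter can be read per port from the switch CLI. A minimal sketch (the port number 15 is just an example; commands are from the FOS 6.4.x CLI):

```shell
portstatsclear 15    # reset counters on the suspected ISL port (15 is an example)
portstatsshow 15     # "tim_txcrd_z" counts 2.5us intervals spent at zero Tx BB credits
porterrshow          # quick fabric-wide view of CRC / enc_out errors per port
```

Comparing tim_txcrd_z growth on both ends of the ISL after a statsclear shows which side is being starved of credits.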
And how do I get the misbehaving ports to show the disks again? I tried disabling them and also connecting a different server to the port, but that didn't help.
Since it is holiday time it's hard to find someone with good knowledge of SAN switches, which I don't have myself, so I thought I would share my problem here.
The switch models and firmware are:
4100 4G 6.4.2b
5100 8G 6.4.2b
And fabrics A & B look like this:
Switch3(5100) ---<problem ISL>---switch4(5100) ----(EVA)
Servers on switch4 seem okay at the moment.
07-10-2012 02:36 PM
Somehow I can't imagine that setting the ISL down to 4Gb/s will solve your problem. We had similar issues in our environment, but with slightly different components. Usually high buffer-credit-zero values are an indication of a so-called slow-drain device. On which side of the ISL do you see those high values (I assume on the switch4 side)? What is your average payload per FC frame (do a statsclear and, after 2-3 hours, a portstats64show on the ISL port)? How long is the distance of that ISL? Do you see increased values on other ports as well (maybe F-ports)?
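To make the payload arithmetic concrete: portstats64show reports 4-byte words and frames transmitted, so the average frame size falls out of a simple division. A sketch with made-up counter values (the real numbers come from your ISL port):

```shell
# Example counter values only; substitute the output of portstats64show.
stat64_wtx=500000000     # 4-byte words transmitted (made-up example)
stat64_ftx=4000000       # frames transmitted (made-up example)

# Average bytes per frame = words * 4 / frames.
avg_frame_bytes=$(( stat64_wtx * 4 / stat64_ftx ))
echo "$avg_frame_bytes"  # prints 500 for these example values
```

A full FC frame carries up to 2112 bytes of payload plus headers, so a low average like this would mean mostly small frames, which makes credit starvation on the ISL worse.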