07-15-2011 10:34 PM
I am seeing what I would call extremely slow performance on my fabric. What I do not know is what I should be expecting to see and I am hoping someone can assist me with answers.
I have a C7000 chassis with 10 blades, all running ESX 4.1 U1. My DAVG is constantly high, but when I use Web Tools on the Brocade switch it never goes above .8; in fact, I would say 1m is all I ever see. I thought it might be a SAN storage problem, but I get about the same performance when writing to my VTL or LTO4 drives as well. I am not sure where to look. I have replaced the GBICs, fiber cables, and HBAs, and I have also sent logs to both Brocade and HP support, and no one seems to see a problem.
What I am wondering is whether this is normal activity on an FC fabric, or whether I should continue to investigate. Also, can anyone tell me how to make a good test plan? I would feel very comfortable if I were able to see throughput of 2, 3, or 4 Gbps. To date, I just cannot seem to make this fabric move.
Please tell me where I am going wrong.
What you see in both attached files is a good day, but I feel I should be seeing much better throughput. Ports 12, 21, and 22 are my Backup Exec server talking to my VTL and LTO drives. Again, this is a good day.
On one occasion in the past 3 years, I did see much better throughput, but it wasn't until I unplugged the chassis from the power cable. Not something I do often.
07-18-2011 05:48 AM
The SAN is the least likely suspect in cases like this. Also, looking at DAVG (a latency metric) and then comparing it to throughput values on the switch makes no sense to me.
As for a test plan, assign a test LUN to one of the VMs and generate artificial I/O (say, using Iometer) while comparing metrics on the storage, the switch, and the server. Pay more attention to the storage and server performance metrics, as SANs are the least likely cause.
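To illustrate why a latency figure such as DAVG cannot be compared directly with a throughput figure on the switch, here is a rough sketch based on Little's law (IOPS = outstanding I/Os / latency). The queue depth, latency, and block sizes are made-up illustration values, not measurements from this fabric.

```python
def throughput_mb_s(outstanding_ios, latency_ms, block_kb):
    """Estimate throughput via Little's law: IOPS = outstanding I/Os / latency."""
    iops = outstanding_ios / (latency_ms / 1000.0)
    return iops * block_kb / 1024.0

# Hypothetical workload: identical 20 ms latency, two different I/O sizes.
small = throughput_mb_s(outstanding_ios=32, latency_ms=20, block_kb=4)
large = throughput_mb_s(outstanding_ios=32, latency_ms=20, block_kb=256)

print(f"4 KB I/Os:   {small:.2f} MB/s")
print(f"256 KB I/Os: {large:.2f} MB/s")
```

With the same latency, the estimated throughput differs by a factor of 64 simply because the I/O size changed, so a low throughput number on a switch port does not by itself say anything about latency.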
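A test plan along these lines could be written down as a small matrix of runs. The sketch below only builds that list; the block sizes, read percentages, and outstanding I/O counts are hypothetical starting points that you would adjust to your own workload and then enter into the load generator by hand.

```python
from itertools import product

# Hypothetical test parameters; adjust to match the real workload.
block_sizes_kb = [4, 64, 256]
read_percent = [0, 100]
outstanding_ios = [1, 16]

# One run per combination of parameters.
test_plan = [
    {"block_kb": b, "read_pct": r, "outstanding": q}
    for b, r, q in product(block_sizes_kb, read_percent, outstanding_ios)
]

for i, run in enumerate(test_plan, 1):
    print(
        f"run {i:2d}: {run['block_kb']:>3} KB, {run['read_pct']:>3}% read, "
        f"{run['outstanding']:>2} outstanding -> record server, switch, storage metrics"
    )
```

For each run, note the server, switch, and storage metrics over the same time window so they can be compared side by side.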
If your fabric consists of only two switches with servers and storage directly connected, then you can almost rule out any possibility of SAN performance issues.
What you need is some performance analysis of the storage to give you response times there, which you can then compare with the DAVG metrics.
If you are running the VMs on an EVA, then you need to collect EVAPerf data.
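As a rough sketch of that comparison, assuming you have exported DAVG samples from the host and response-time samples from the array for the same intervals (the numbers below are invented for illustration):

```python
from statistics import mean

def latency_gap(davg_ms, array_ms):
    """Return mean DAVG, mean array response time, and the mean gap between them."""
    if len(davg_ms) != len(array_ms):
        raise ValueError("sample lists must cover the same intervals")
    gaps = [d - a for d, a in zip(davg_ms, array_ms)]
    return mean(davg_ms), mean(array_ms), mean(gaps)

# Hypothetical samples in milliseconds, one pair per interval.
davg = [22.0, 25.0, 19.0, 24.0]
array = [20.0, 23.0, 18.0, 21.0]

avg_davg, avg_array, avg_gap = latency_gap(davg, array)
print(f"mean DAVG:           {avg_davg:.1f} ms")
print(f"mean array response: {avg_array:.1f} ms")
print(f"mean gap:            {avg_gap:.1f} ms")
```

If the array's own response time accounts for most of the DAVG value, the latency is being spent inside the storage. A large gap would instead point at something between the host driver and the array, such as the HBA or the fabric.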