03-11-2013 11:05 AM
I'm troubleshooting a performance problem in a storage system. One component is the SAN fabric, whose ISLs span almost 9 km.
I don't have first-hand access to the switches, nor have I had anything to do with the setup. I did, however, request a portstatsclear on the ISL ports last week, and these numbers were taken a few days after that, on one of the ISL ports:
|stat64_wtx||4 652 451 722 330|
|stat64_wrx||2 440 995 825 961|
|stat64_ftx||14 255 695 052|
|stat64_frx||11 089 451 101|
|stat64_c3_frx||11 089 834 680|
|tim64_rdy_pri||7 226 622|
|tim64_txcrd_z||14 338 091 729|
|stat64_rateTxWord||17 260 152|
|stat64_rateRxWord||25 359 007|
|stat64_rateTxPeakWord||48 297 580|
|stat64_rateRxPeakWord||52 986 889|
If my math is right, the average frame size on the transmit side is about 1305 bytes (stat64_wtx x 4 bytes per word / stat64_ftx). The ISL is configured for a distance of 15 km; the actual distance is 8.7 km. At first glance it would seem I should have enough buffers, but the value for tim64_txcrd_z is 14338091729 x 2.5 us = almost 10 hours?!
Some sanity checks would be appreciated.
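For anyone who wants to re-run the arithmetic from the post above, here is a minimal sketch in Python. It assumes 4 bytes per transmission word and a 2.5 us tick for tim64_txcrd_z, as used in the post; the function names are made up for illustration.

```python
# Re-check of the arithmetic in the post, using the counters quoted above.
# Assumptions: a transmission word is 4 bytes, and tim64_txcrd_z ticks
# every 2.5 us. Function names are hypothetical helpers, not FOS commands.

WORD_BYTES = 4
TICK_SECONDS = 2.5e-6


def avg_bytes_per_frame(words: int, frames: int) -> float:
    """Average bytes per frame from a word counter and a frame counter."""
    return words * WORD_BYTES / frames


def zero_credit_hours(ticks: int, tick_seconds: float = TICK_SECONDS) -> float:
    """Total time the counter saw zero transmit credits, in hours."""
    return ticks * tick_seconds / 3600.0


stat64_wtx = 4_652_451_722_330
stat64_ftx = 14_255_695_052
tim64_txcrd_z = 14_338_091_729

if __name__ == "__main__":
    print(f"avg tx frame size: {avg_bytes_per_frame(stat64_wtx, stat64_ftx):.1f} bytes")
    print(f"time at zero tx credits: {zero_credit_hours(tim64_txcrd_z):.2f} hours")
```

The same helper can be fed stat64_wrx and stat64_frx to get the receive-side average.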
03-12-2013 08:52 AM
Those figures do not look very good indeed. Just comparing the following two counters, we can see that something strange is going on: the zero-credit counter is about as large as the number of frames transmitted.
tim64_txcrd_z 14 338 091 729
stat64_ftx 14 255 695 052
If the Extended Fabrics license is installed, as it appears to be, you can increase the configured distance a little so that the ISLs get more BB credits.
How many ISLs does this fabric have? And how many hosts are using them?
Do you see these figures in all of them?
In the other fabric?
Do you see any discards? If so, where?
03-26-2013 07:42 AM
hi, if you carefully read the description of tim_txcrd_z, the meaning is as follows: there's a process that checks the free buffers on a port 400000 times per second (once every 2.5 us). when there are zero free buffers, the counter is incremented. that's it. in other words, if the counter increases, it means that all buffers were allocated for a fair amount of time. it doesn't mean "i wanted to send a frame but there was no free buffer", so it doesn't necessarily mean that there are not enough buffers. it might, but you can't tell from this counter alone.
i agree that increasing the number of buffers might help to understand the issue better.
i would also advise to configure bottleneckmon stuff.
are there any discards seen in the other parts of the san?
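To put the sampling description above into numbers, here is a rough sketch that turns the counter into a fraction of the measurement window. The 400000 samples per second comes from the post; the 3-day window is only an assumed stand-in for "a few days" since the real interval after portstatsclear was not given, and the function name is hypothetical.

```python
# Fraction of the measurement window in which the port was sampled with
# zero free transmit credits. The window length is an ASSUMPTION: the
# thread only says "a few days" after portstatsclear.

SAMPLES_PER_SECOND = 400_000  # sampling rate quoted in the post above


def zero_credit_fraction(ticks: int, elapsed_seconds: float,
                         samples_per_second: int = SAMPLES_PER_SECOND) -> float:
    """Share of samples that found zero free credits during the window."""
    total_samples = elapsed_seconds * samples_per_second
    return ticks / total_samples


tim64_txcrd_z = 14_338_091_729
assumed_window_seconds = 3 * 24 * 3600  # hypothetical 3-day window

if __name__ == "__main__":
    share = zero_credit_fraction(tim64_txcrd_z, assumed_window_seconds)
    print(f"zero-credit share of window: {share:.1%}")
```

A high share only says the credits were all in flight that often; as noted above, it does not by itself show that frames were waiting.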
03-29-2013 03:07 PM
What port speed is configured? Is QoS also configured? There is not enough information in your post to determine whether enough BB credits are configured. I agree that bottleneck monitoring can help isolate performance problems, as a slow-drain device can starve the BB credit pool. The Managing Long Distance Fabrics section of the FOS Admin Guide has a formula for calculating BB credits based on average frame size. I often find mistakes here: admins configure BB credits for full-size frames when in fact most distance applications use smaller frames. Smaller frames require more BB credits for the same distance.
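As a rough illustration of why smaller frames need more credits, here is a generic back-of-the-envelope estimate. It is not the formula from the FOS Admin Guide. It assumes about 5 us/km propagation delay in fiber, a nominal 100 MB/s of throughput per 1 Gbps of port speed, and that the 1305-byte average from the first post can be treated as the on-wire frame size. The port speed was never stated in the thread, so the sketch simply loops over a few candidate speeds.

```python
import math

# Generic estimate: credits needed to keep the link busy for one round trip.
# NOT the FOS Admin Guide formula. Assumptions: ~5 us/km in fiber and
# ~100 MB/s nominal throughput per 1 Gbps of Fibre Channel port speed.

US_PER_KM = 5e-6                 # assumed one-way propagation delay per km
BYTES_PER_SEC_PER_GBPS = 100e6   # assumed nominal throughput per Gbps
FULL_FRAME_BYTES = 2148          # 2112 payload + header, CRC, SOF, EOF


def credits_needed(distance_km: float, speed_gbps: float, frame_bytes: float) -> int:
    """Frames that fit into one round trip at the given speed and frame size."""
    round_trip_s = 2 * distance_km * US_PER_KM
    frame_time_s = frame_bytes / (speed_gbps * BYTES_PER_SEC_PER_GBPS)
    return math.ceil(round_trip_s / frame_time_s)


if __name__ == "__main__":
    for speed in (2, 4, 8):  # candidate speeds; the real one is unknown
        full = credits_needed(8.7, speed, FULL_FRAME_BYTES)
        avg = credits_needed(8.7, speed, 1305)
        print(f"{speed} Gbps: {full} credits (full frames), {avg} credits (1305-byte frames)")
```

The actual number reserved by the switch for a 15 km setting should be checked against the switch output rather than against this estimate.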