02-28-2017 08:18 AM
Hoping somebody could provide some guidance on what direction I can take to troubleshoot an issue I'm observing.
I have two fabrics, and I've just been focusing on one of them (Fabric A).
In this fabric I have:
6x6510 (Edges) (3 trunk groups on each edge switch going back to core, 4 ISLs per trunk, total of 72 ISLs)
My Servers are all located on the edge switches. My Arrays are located on the Core.
I'm running FOS 7.4.1e on all switches. I am running MAPS with default conservative policy. FPI Enabled.
I'm not seeing any C3 discards.
I'm not seeing any marginal links.
I don't have any Long-Distance ISLs.
What I am seeing:
All the ISLs in my trunk groups are being utilized, but throughput is below 30%. It's my understanding that additional ISLs in a trunk group are only used after the first ISL becomes congested. This led me to believe there was a BB-credit issue.
From the Edge switches I am seeing the tim_txcrd_z counters increment very little on the ISLs.
From the Core I am seeing a steady pace of incrementing tim_txcrd_z counters.
From the Core I am seeing >20% transmitted frames waiting for buffers.
(I calculated this by dividing tim_txcrd_z by stat_ftx.)
I'm seeing tim_txcrd_z_vc incrementing on the following VCs: 0, 2, 3, 4, 5
(The bulk of them are on VCs 2-5)
My Storage ports are a mix of 16Gb & 8Gb N Ports
My server devices are mostly 8Gb N Ports
My only conclusion right now is that the VC credit issue is being caused by the ISLs running at 16Gb while the bulk of the servers are 8Gb.
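The ">20% transmitted frames waiting for buffers" figure above can be reproduced from two counter snapshots. A minimal sketch, assuming the simple tim_txcrd_z / stat_ftx ratio described in the post and using hypothetical counter values:

```python
# Rough "share of transmit activity spent at zero credit" estimate, as
# described above: the delta of tim_txcrd_z divided by the delta of
# stat_ftx between two polls of the same port. Counter values below
# are hypothetical, for illustration only.

def credit_zero_ratio(snap1, snap2):
    """snap1/snap2: dicts of counters from two polls of the same port."""
    d_crd_z = snap2["tim_txcrd_z"] - snap1["tim_txcrd_z"]
    d_ftx = snap2["stat_ftx"] - snap1["stat_ftx"]
    if d_ftx == 0:
        return 0.0
    return d_crd_z / d_ftx

poll_1 = {"tim_txcrd_z": 1_000_000, "stat_ftx": 50_000_000}
poll_2 = {"tim_txcrd_z": 13_000_000, "stat_ftx": 100_000_000}

ratio = credit_zero_ratio(poll_1, poll_2)
print(f"{ratio:.0%} of transmitted frames waited for buffers")  # 24%
```

Note that always diffing two snapshots (rather than reading absolute values) avoids counting history that predates the problem window.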
02-28-2017 08:21 AM
I also have an Analytics Monitoring Platform (AMP) appliance monitoring all switches in the fabric. I'm seeing a lot of alerts on first-read response times. I believe these indicate latency on the arrays.
02-28-2017 09:16 AM
I don't think the tim_txcrd_z counter on ISL ports indicates any issue or high congestion, as you probably have QoS AE on the ISL ports. That means every VC can borrow required credits from other VCs if needed. The high counter value might be due to the 8G hosts in the edge being too slow to accept read traffic, as each is zoned to multiple storage array ports in the core. So if you have an 8G host zoned to 2 or 3 16G storage array ports, and that array is probably a flash array, your traffic will wait on the core ISLs first. I would check whether you see any high tim_txcrd_z increments on any edge host ports. As FPI is not reporting latency, it seems the fabric is able to manage it.
02-28-2017 11:21 AM
Oh, I thought a happy AMP owner should be able to immediately see what's the problem :)
So it looks like you are talking about two different issues.
First is that some of your hosts are acting as slow-drain devices. The link speed going down from 16G to 8G is likely to cause issues like that. Did you try decreasing the speed of your busiest storage ports down to 8G? I mean, if they are 16G right now; I understood you have some. But not only that: what are your hosts? Big physical servers? Or maybe they are not so big and also have some virtual stuff inside?
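The speed-mismatch point above can be put in numbers with a back-of-the-envelope fan-in check. A sketch with hypothetical port counts and speeds:

```python
# Back-of-the-envelope fan-in check: how much read bandwidth can the
# zoned storage ports offer versus what the host port can drain?
# Port counts and speeds below are hypothetical, for illustration.

def oversubscription(host_gbps, storage_ports_gbps):
    """Ratio of potential storage egress to host ingress capacity."""
    return sum(storage_ports_gbps) / host_gbps

# One 8G host zoned to three 16G array ports, as in the scenario above:
factor = oversubscription(8, [16, 16, 16])
print(f"oversubscription factor: {factor:.0f}x")  # 6x
```

Any sustained read burst above the host's 8G line rate has to queue somewhere upstream, which is exactly where the core-side ISL buffers run out of credits.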
The second issue that AMP shows you is first read response. But that's normal for disk arrays, even for flash arrays. The controller needs to find the exact chunk of data across the attached media, including cache, and this takes time. Modern disk arrays also have multiple virtualization layers of their own, like thin provisioning. First read response is the time required for the requested data chunk to get prepared for transfer. This will be seen as additional latency from the host side; BTW, how much is it in your case? But this will never generate anything like tim_txcrd_z just by itself. The receiving side is always responsible for freeing the buffers, which means the initiator (host) does this on reads and the target (storage) does this on writes.
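The last sentence's rule (the frame receiver returns credits, so the side that frees buffers depends on data direction) can be captured in a tiny lookup. A sketch of the rule as stated above:

```python
# On Fibre Channel, the frame *receiver* returns credits (frees buffers).
# Read data flows storage -> host, so the host frees buffers;
# write data flows host -> storage, so the storage frees them.

def buffer_freeing_side(operation):
    """Which end frees receive buffers for the data phase of an I/O."""
    return {
        "read": "initiator (host)",
        "write": "target (storage)",
    }[operation]

print(buffer_freeing_side("read"))   # initiator (host)
print(buffer_freeing_side("write"))  # target (storage)
```

This is why slow hosts show up as credit starvation on reads: the array can be perfectly healthy and the buffers still sit unreturned on the host side of the path.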
03-10-2017 04:08 AM
I agree with Alexey: the 20% buffer-credit-zero seen on the ISLs from the core switch (where the storage sits) towards the edge switches is caused by the speed mismatch between the 16Gb storage and the 8Gb hosts. Check the F_Ports on the edge switches for buffer-credit-zero indications.
For the AMP, do you have vTAP enabled on all switches? And are the first-response-time alerts generated/logged for the vTAP on the core switch (which would indicate the storage is slow, depending on your range) or on the edge switch? If the alert is only logged on the edge switch vTAP, then fabric latency (via the ISLs) is a possible option. If you are monitoring via vTAP on both the edge and core switches, you can look at the fabric latency figure to determine where (if anywhere) the delay is located: host, fabric, or storage.
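The localization logic above can be sketched as: if the first-response time measured at the edge vTAP is much larger than the core vTAP's reading for the same flow, the difference is latency added by the fabric (ISLs); otherwise the delay sits at the storage. A minimal illustration, with made-up millisecond values and a hypothetical threshold:

```python
# Attribute first-read-response latency using two vTAP measurement points.
# edge_frt_ms: first-response time observed at the edge switch (host side)
# core_frt_ms: first-response time observed at the core switch (storage side)
# Their difference approximates the latency added by the fabric between them.
# The threshold and example values are hypothetical.

def locate_latency(edge_frt_ms, core_frt_ms, fabric_threshold_ms=1.0):
    fabric_ms = edge_frt_ms - core_frt_ms
    if fabric_ms > fabric_threshold_ms:
        return ("fabric", fabric_ms)
    return ("storage", core_frt_ms)

# Alerts fire at the edge, but the core sees the storage responding fast:
where, ms = locate_latency(edge_frt_ms=12.0, core_frt_ms=2.5)
print(f"dominant delay: {where} ({ms:.1f} ms)")  # fabric (9.5 ms)
```

With only one vTAP point (edge), storage delay and fabric delay are indistinguishable, which is the reason for monitoring at both edge and core.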