Contributor
Posts: 53
Registered: ‎06-24-2009

Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

Hello,

I am seeing strange behaviour on some ISLs. We have (on each fabric) a (core switch) 4100 running FOS 6.3.0b connected to two 3900 switches running FOS 5.3.2c and other 4100 and 4900 switches with 6.3.0b. The problem only appears on the ISLs between the core 4100 and the 3900 switches.

There is a trunked ISL of 4 Gb between the core 4100 and each 3900. As seen on the core switch with "trunkshow -perf", the trunk is never used at more than 10% of capacity (typically, use is between 3 and 6%).

6: 21-> 1 10:00:00:60:xx:xx:xx:xx 42 deskew 15 MASTER
   17-> 0 10:00:00:60:xx:xx:xx:xx 42 deskew 16
   Tx: Bandwidth 4.00Gbps, Throughput 322.65Mbps (9.39%)
   Rx: Bandwidth 4.00Gbps, Throughput 6.92Mbps (0.20%)

On each of the ports on the 4100 corresponding to the ISLs to the 3900's, the tim_txcrd_z count is very high and grows continually.



dc0_4100_b_52:admin> portstatsshow 17

stat_wtx             1214429676   4-byte words transmitted
stat_wrx             62119400     4-byte words received
stat_ftx             2420907      Frames transmitted
stat_frx             383731       Frames received
stat_c2_frx          0            Class 2 frames received
stat_c3_frx          383702       Class 3 frames received
stat_lc_rx           16           Link control frames received
stat_mc_rx           0            Multicast frames received
stat_mc_to           0            Multicast timeouts
stat_mc_tx           0            Multicast frames transmitted
tim_rdy_pri          0            Time R_RDY high priority
tim_txcrd_z          16682205     Time BB credit zero (2.5Us ticks)
er_enc_in            0            Encoding errors inside of frames
er_crc               0            Frames with CRC errors
er_trunc             0            Frames shorter than minimum
er_toolong           0            Frames longer than maximum
er_bad_eof           0            Frames with bad end-of-frame
er_enc_out           0            Encoding error outside of frames
er_bad_os            0            Invalid ordered set
er_rx_c3_timeout     0            Class 3 receive frames discarded due to timeout
er_c3_dest_unreach   0            Class 3 frames discarded due to destination unreachable
er_other_discard     0            Other discards
er_zone_discard      0            Class 3 frames discarded due to zone mismatch
er_crc_good_eof      0            Crc error with good eof
er_inv_arb           0            Invalid ARB
open                 0            loop_open
transfer             0            loop_transfer
opened               0            FL_Port opened
starve_stop          0            tenancies stopped due to starvation
fl_tenancy           0            number of times FL has the tenancy
nl_tenancy           0            number of times NL has the tenancy
zero_tenancy         0            zero tenancy

On the corresponding ports on the 3900 switches, it is the tim_rdy_pri that is even higher and also grows continually.



dc5_3900_b_42:admin> portstatsshow 0

stat_wtx      62377147    4-byte words transmitted
stat_wrx      1229146632  4-byte words received
stat_ftx      387321      Frames transmitted
stat_frx      2450279     Frames received
stat_c2_frx   0           Class 2 frames received
stat_c3_frx   2450278     Class 3 frames received
stat_lc_rx    1           Link control frames received
stat_mc_rx    0           Multicast frames received
stat_mc_to    0           Multicast timeouts
stat_mc_tx    0           Multicast frames transmitted
tim_rdy_pri   98432349    Time R_RDY high priority
tim_txcrd_z   0           Time BB_credit zero
er_enc_in     0           Encoding errors inside of frames
er_crc        0           Frames with CRC errors
er_trunc      0           Frames shorter than minimum
er_toolong    0           Frames longer than maximum
er_bad_eof    0           Frames with bad end-of-frame
er_enc_out    0           Encoding error outside of frames
er_disc_c3    0           Class 3 frames discarded
open          0           loop_open
transfer      0           loop_transfer
opened        0           FL_Port opened
starve_stop   0           tenancies stopped due to starvation
fl_tenancy    0           number of times FL has the tenancy
nl_tenancy    0           number of times NL has the tenancy

The port configurations are all standard (default) except that the speed is fixed to 2 Gb (as that is all 3900's support). There are 26 buffer credits on each of the ports. The fibre connections between the switches are less than 30 metres long.

Could this indicate slow-drain nodes on the edge 3900's? Any other ideas?

Thanks,

Alastair







Contributor
Posts: 53
Registered: ‎06-24-2009

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

I forgot to mention that porterrshow indicates no other errors or problems on any of the ports.

Frequent Contributor
Posts: 76
Registered: ‎04-17-2010

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

Hi,


That does indeed sound like a slow drain device. Do you actually have a performance issue, or are you just concerned about those counters?

At any rate, the Rx/Tx counters wrap at around 9.9G, so it would probably make sense to reset the counters ("statsclear" or, per port, "portstatsclear") and then monitor the BB_Credit zero values over a short period of time, i.e. take the delta every hour for some time.

In itself, it is quite normal to see that counter increase; the question is at what rate. As long as you don't see discarded class 3 frames due to congestion, it shouldn't really matter too much.

As a ballpark figure: if you see an increase of 400,000 in one second, you _know_ that nothing is getting through, since it counts at most 1 per 2.5 us. So 200,000 per second would mean that the port is spending half of its time waiting for R_RDYs.
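
To put numbers on that ballpark, here is a minimal Python sketch. It assumes the 2.5 us tick shown in the 4100's portstatsshow output; the function name and the sample values are purely illustrative and not part of any Brocade tooling.

```python
TICK_SECONDS = 2.5e-6  # tim_txcrd_z tick length, as printed by portstatsshow on the 4100


def zero_credit_fraction(count_start, count_end, interval_seconds,
                         tick_seconds=TICK_SECONDS):
    """Fraction of the sampling interval spent with zero transmit BB credit."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = count_end - count_start
    if delta < 0:
        # The counter wrapped or was cleared between the two samples.
        raise ValueError("counter went backwards; take a new pair of samples")
    max_ticks = interval_seconds / tick_seconds  # at most one increment per tick
    return min(delta / max_ticks, 1.0)


# The two ballpark cases from the paragraph above, over a one-second interval.
print(f"{zero_credit_fraction(0, 200_000, 1.0):.1%}")
print(f"{zero_credit_fraction(0, 400_000, 1.0):.1%}")
```

Feeding it two counter readings taken an hour apart gives the share of that hour during which the port could not transmit.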

As to the ISL, make sure that attenuation / jitter are not an issue (you can check "sfpshow" Rx and Tx power for example to calculate the cable attenuation), but if there are no hardware errors, then that's likely not it.
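
The attenuation check can be sketched as follows, assuming you read the Tx power from the SFP at one end of the link and the Rx power from the SFP at the other end. The readings below are hypothetical values, not taken from this fabric, and the helper names are made up for illustration.

```python
import math


def microwatts_to_dbm(power_uw):
    """Convert optical power in microwatts to dBm (0 dBm = 1 mW = 1000 uW)."""
    if power_uw <= 0:
        raise ValueError("power must be positive")
    return 10 * math.log10(power_uw / 1000.0)


def link_loss_db(tx_power_dbm, rx_power_dbm):
    """Loss over the cable plant: what the far end sent minus what arrived."""
    return tx_power_dbm - rx_power_dbm


# Hypothetical readings: sender reports 400 uW Tx, receiver reports 250 uW Rx.
tx_dbm = microwatts_to_dbm(400.0)
rx_dbm = microwatts_to_dbm(250.0)
print(f"Tx {tx_dbm:.2f} dBm, Rx {rx_dbm:.2f} dBm, loss {link_loss_db(tx_dbm, rx_dbm):.2f} dB")
```

Each patch panel and connector adds to the loss, so the result is best compared against the link budget of the SFP type in use.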


Another thing is that many small frames can also lead to credit exhaustion. Just a thought though.


Hope this helps a bit, you may want to post the results of observing after having reset the counters here, so we can have a look.


Cheers,


crs


Contributor
Posts: 53
Registered: ‎06-24-2009

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

Hi Christophe,

Thanks for a great reply. Perhaps we should speak French, given your name? :-)

Yes, I had done portstatsclear on both switches before getting the results I posted. They were about five minutes after the clear. I am currently seeing the tim_rdy_pri grow by about 300,000 a second which is far too high. The BB credit zero grows more slowly, about 50,000 a second.

I don't have a performance issue but am seeing a few SCSI timeouts on some of the servers. I should have mentioned that the disk array (on the core switch) is at 4 Gb and some of the server HBAs are at 1 Gb. Before last Friday, the disk array ports were also at 2 Gb and the SAN only had the 3900's. Now, two SANs have been merged, hence the increase in throughput rate of the array ports. I tried forcing the ports connected to the array ports back to 2 Gb but that made no apparent difference.

Could you explain just what the tim_rdy_pri counter measures? I know what it says in the doc, but that doesn't explain when this counter is incremented. Does it mean that the port wants to send an R_RDY but can't because it is waiting for an acknowledgement from the node (server)? As the counter measures the time during which sending R_RDYs has high priority, what is stopping the port from sending them?

I also thought about small frame sizes but I can't change the applications :-(

Thanks again,

Alastair

Frequent Contributor
Posts: 76
Registered: ‎04-17-2010

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

Hi Alastair,

I'm happy to continue in French, but not many people would understand us.

So I'll continue in English: as to "tim_rdy_pri", here's what I pulled out of some Brocade documentation:


--snip--

Amount of time that sending R_RDY or VC_RDY primitive signals is
a higher priority than sending frames, due to diminishing credit reserves
in the transmitter at the other end of the fiber.

--snip--


So that pretty much matches what you've got: on one side you're seeing buffer credit starvation, and on the other side the switch is sending R_RDYs with high priority, since it is aware that its counterpart is running out of credit.


Now there are several things you may want to check:


1) why are we running out of BB_Credits. AFAIK BB_Credit flow control (like any link level flow control) can start pushing back, so you may want to observe whether there's credit starvation on a particular target (or initiator).


2) how exactly are those switches connected, i.e. what's the distance between the sites, what media is used, how many patch panels are there, is there a multiplexer in between etc.


3) what is the actual oversubscription of this ISL?


4) with regards to small frames (if that's the case), there's one way to calculate this:


This is assuming that the counters haven't wrapped yet. The average Tx and Rx frame size can be calculated as stat_wtx / stat_ftx and stat_wrx / stat_frx; the word counters are in 4-byte words, so multiply the result by 4 to get bytes. If the result is not close to 2112 bytes (assuming this was measured during bulk transfer), you may experience performance issues as the switch becomes buffer-credit limited (tim_txcrd_z > 10% or so of stat_ftx). Note that BB_Credits are typically calculated on the assumption that you're actually sending full-sized frames. In reality, depending on your block size, say 4k, the average frame size may be well below 2 kB.
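
As a rough illustration of that calculation, here is a short Python sketch using the 4100 port 17 counters posted at the top of the thread. It assumes the word counters cover only frame content and that neither counter has wrapped; the function name is made up for the example.

```python
WORD_BYTES = 4  # stat_wtx and stat_wrx are reported in 4-byte words


def average_frame_bytes(words, frames):
    """Average frame size in bytes from a word counter and a frame counter."""
    if frames == 0:
        return 0.0
    return words * WORD_BYTES / frames


# Counters from "portstatsshow 17" on the 4100, as posted earlier in the thread.
tx_avg = average_frame_bytes(words=1214429676, frames=2420907)
rx_avg = average_frame_bytes(words=62119400, frames=383731)

print(f"average Tx frame: {tx_avg:.0f} bytes")
print(f"average Rx frame: {rx_avg:.0f} bytes")
```

The same function can be run on deltas between two readings instead of absolute counts, which avoids mixing traffic from different periods.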


One way to improve this is by allocating more buffers to the E_Port, for example:


switch:admin> portcfglongdistance <port#> LS <higher than physical distance>


This is assuming you have an Extended Fabric license; otherwise you can only move from L0 to LE, for example. Check the "portbuffershow" and "portshow <port#>" output to verify that the settings have been applied properly.


Last but not least, 50,000 tim_txcrd_z per second is not good, but it shouldn't cause any disruption either. As I said before, if you don't see discarded class 3 frames resulting from it, I'd say you're doing pretty OK. Maybe it's just time to say goodbye to those old 3900s and get some decent 8GFC switches in there.


Hope this helps.

Super Contributor
Posts: 425
Registered: ‎03-03-2010

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

Here is what the documentation says about these two parameters:

tim_rdy_pri    The number of times that sending R_RDY or VC_RDY primitive
               signals was a higher priority than sending frames, due to
               diminishing credit reserves in the transmitter at the other
               end of the fibre. This parameter is sampled at intervals of
               1.8Us (microseconds), and the counter is incremented by 1 if
               the condition is true.

tim_txcrd_z    The number of times that the port was unable to transmit
               frames because the transmit BB credit was zero. The purpose
               of this statistic is to detect congestion or a slow drain
               device. This parameter is sampled at intervals of 2.5Us
               (microseconds), and the counter is incremented if the
               condition is true.
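
Because the two counters are sampled at different intervals, their raw rates are not directly comparable. The sketch below normalises each rate by its own interval, using the growth rates Alastair reported (about 300,000/s and 50,000/s). It assumes the intervals quoted above apply to both switches, which may not hold across FOS versions; the names are illustrative.

```python
# Sampling intervals in microseconds, as quoted in the counter descriptions above.
SAMPLE_INTERVAL_US = {
    "tim_rdy_pri": 1.8,
    "tim_txcrd_z": 2.5,
}


def condition_true_fraction(counter, increments_per_second):
    """Share of samples in which the counter's condition was true."""
    interval_us = SAMPLE_INTERVAL_US[counter]
    fraction = increments_per_second * interval_us / 1_000_000
    return min(fraction, 1.0)


# Growth rates reported earlier in the thread.
for name, rate in (("tim_rdy_pri", 300_000), ("tim_txcrd_z", 50_000)):
    print(f"{name}: {condition_true_fraction(name, rate):.1%} of samples")
```

A counter sampled every 1.8 us can grow by at most roughly 555,000 per second, so a given raw rate represents a smaller share of time for tim_rdy_pri than the same rate would for tim_txcrd_z.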

If the tim_txcrd_z value is increasing, I am sure it points to a slow drain device. Check on the OS side of the server whether I/O is queueing, and check the queue depth setting on the servers that report the SCSI command timeouts. The SCSI command timeouts are definitely related to the increase of this value. As this is an ISL, please also try moving it to another switch port. I think the switch port speed should be set to AN (auto-negotiate). You can also change the SFPs on both ports. What is the output of portbuffershow? Are there any enc_out errors or other counters that point to the physical media? What E_D_TOV and R_A_TOV values are set?

If buffer credits are the problem, you can try disabling the other ports in that quad, which makes more buffers available to the E_Port. The 3900 is definitely old, but it is still used in many environments as it has 32 ports. And since you say you have 1 Gbps devices connected to the SAN, it would be better to move those to at least 2 Gbps.

Frequent Contributor
Posts: 76
Registered: ‎04-17-2010

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

There's an interesting post here, where you can see how they calculated the average frame size and what amount of buffer credits they may require:

http://blogs.brocade.com/home/thread/2450;jsessionid=6A741B71741FCF782F6700450E56931B

Regular Contributor
Posts: 201
Registered: ‎11-24-2009

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

As the BB_Credit mechanism works on a per-link basis and you mention that the ISL is not heavily utilized, it is likely that the actual congestion point is not that ISL but some other segment. Remember the classic example where an old 1G tape device eats up all the ISL credits because it simply can't handle the frames sent by an 8G host fast enough.

You should really check tim_txcrd_z on the other ports as well to understand which device consumes the credits.
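
One way to do that comparison is to capture the portstatsshow output for each port and rank the ports by tim_txcrd_z. The Python sketch below assumes the "name value description" line layout seen in the outputs posted in this thread. Only the port 17 excerpt is real; the other two excerpts are hypothetical, and the function names are made up for the example.

```python
import re

# Matches lines such as "tim_txcrd_z   16682205   Time BB credit zero ...".
STAT_LINE = re.compile(r"^\s*(\w+)\s+(\d+)\s+\S")


def parse_portstats(text):
    """Return {counter_name: value} for every numeric line in the output."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def rank_by_counter(outputs_by_port, counter="tim_txcrd_z"):
    """Sort ports by the given counter, highest first; skip ports without a value."""
    values = {}
    for port, text in outputs_by_port.items():
        value = parse_portstats(text).get(counter)
        if value is not None:  # non-numeric entries never match the pattern
            values[port] = value
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


captured = {
    17: "tim_rdy_pri          0           Time R_RDY high priority\n"
        "tim_txcrd_z          16682205    Time BB credit zero (2.5Us ticks)\n",
    5: "tim_txcrd_z          120         Time BB credit zero (2.5Us ticks)\n",  # hypothetical
    9: "tim_txcrd_z          N/A         Time BB credit zero\n",  # hypothetical
}

ranking = rank_by_counter(captured)
for port, value in ranking:
    print(f"port {port}: tim_txcrd_z = {value}")
```

Ranking deltas taken after a portstatsclear is more telling than ranking absolute values, since it shows which ports are starved of credit right now.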

Cheers,

Contributor
Posts: 53
Registered: ‎06-24-2009

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

Thanks all for your replies. I will try to address them one by one.

Christophe,

I had seen the definition of the counter but am still not sure how it works. How does the port on the 3900 know that the port on the 4100 is out of buffer credits? I like the idea, but unless there is an ordered set to inform the recipient port (on the 3900 in this case) of the problem, how does it know? Or is it just basing itself on its own use of buffers while waiting for the port to the server to acknowledge?

As to your numbered points:

1) Unfortunately, portstatsshow on the 3900 always says "N/A" for tim_txcrd_z on F-ports. However, I agree with everyone that it would appear to be a slow-drain problem. In which case, increasing the buffer credits would only make matters worse, as the switches would then have to buffer even more data.

2) As I said in my initial post, the cables between the switches are probably no more than 30 metres long. They do, however, pass through two patch panels. No multiplexer involved.

3) Theoretical oversubscription is 6.5:1, which is less than the "accepted" norm of 7:1. Furthermore, a lot of the servers produce extremely little IO. As I said, the ISL appears never to be at more than 10% of capacity.

4) The average frame size looks to be between 300 and 500 bytes, which is hardly spectacular as far as optimisation goes.

Alastair

Contributor
Posts: 53
Registered: ‎06-24-2009

Re: Problem with ISLs (tim_rdy_pri & tim_txcrd_z)

Hemant,

Thanks for the info on these counters. It is interesting to note that the interval frequencies are different.

There are no other errors, as reported by porterrshow. As I said, there are 26 buffers on each of the eight ports in question. The timeout values are all the default ones. I am sure it is not a hardware problem, as it would be an amazing coincidence if all eight ports showed the same symptoms at the same time.

Why do you suggest setting the ports to AN? I normally leave F-ports on AN and hard set ISL ports. From my experience, Brocade's best practice on this matter depends on who you ask.

I don't think increasing buffer credits would be a good idea, as that would mean even more data in transit between the switches, and thus presumably the slow drain problem would get worse.

I have already asked the Unix team to try to find 2 Gb HBAs for their servers. There's not much more I can do on that front.

Alastair
