Fibre Channel (SAN)

Monitoring Brocade Port Load using MRTG/RRDTool


At our site, we run 2 SANs built from 3 x Brocade 200E, 1 x 5300 and 1 x 4020 SAN switches,

all with 4Gbit GBICs and QLogic HBAs, and all switches running FabricOS 6.1.x.

I started monitoring our SAN ports for performance, taking one sample every 5 minutes.

Queried OIDs:

    swFCPortTxWords  Words transmitted

    swFCPortRxWords  Words received

Both count 4-byte words, so to get the data in bytes I simply multiply the gathered values by 4.

Up to about 20-25 MB/s of Rx or Tx port load, MRTG/RRDTool graphs the switch port load correctly.

When the port load is higher than that, MRTG/RRDTool starts reporting figures that do not match

what the storage array and/or the host report:

       Ex: while running dd on a raw disk, with the storage array seeing an I/O rate of 180 MB/s,

       MRTG graphs 12, 16, 30, 20 MB/s.

I see the same behaviour when I monitor the ISLs between SAN switches.

So the problem is not tied to whatever infrastructure is connected to the port.

After investigating at length, I discovered the problem is caused by the switch counters wrapping around.

The swFCPortTxWords and swFCPortRxWords values are stored as Counter32, i.e. modulo 2^32.

Assuming (avg speed in bytes/s * 300 s sampling interval) / 4 = delta of Tx or Rx words,

and Counter32 storing up to 4294967295:

At a rate of 100 MB/s, the 5-minute Rx or Tx word delta is

    (100*1024*1024*300/4) = 7864320000 4-byte words

meaning the counter wraps nearly twice per 5 minutes.

At a rate of 50 MB/s, the 5-minute Rx or Tx word delta is

    (50*1024*1024*300/4) = 3932160000 4-byte words

meaning the counter wraps nearly once per 5 minutes.
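
The arithmetic above can be checked with a short sketch. It assumes the counters count 4-byte words and wrap at 2^32, and that "MB" means 1024*1024 bytes as in the figures above; the function name is purely illustrative.

```python
COUNTER32_MODULUS = 2 ** 32   # a Counter32 goes back to 0 after 4294967295
WORD_BYTES = 4                # swFCPortTxWords/RxWords count 4-byte words


def wraps_per_interval(rate_mb_per_s, interval_s=300):
    """Return (words transferred, counter wraps) for one polling interval."""
    words = rate_mb_per_s * 1024 * 1024 * interval_s // WORD_BYTES
    return words, words / COUNTER32_MODULUS


for rate in (25, 50, 100, 180):
    words, wraps = wraps_per_interval(rate)
    print(f"{rate:>4} MB/s -> {words} words, {wraps:.2f} wraps per 5 min")
```

Once the number of wraps per interval approaches or exceeds 1, a single pair of readings can no longer tell how much data really went through the port.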

As a result, MRTG either sees what looks like a drop in throughput, or treats the

decreased counter as an exception and records it as 0, and the graphs come out wrong as a side effect.

The PortStatsShow command shows the same kind of information, also held in 32-bit counters. See this sample:

SANSWITCH:admin> portstatsshow 11
stat_wtx             4289468676  4-byte words transmitted
stat_wrx             818304572   4-byte words received

... wait 10 seconds ...

SANSWITCH:admin> portstatsshow 11
stat_wtx             25652200    4-byte words transmitted
stat_wrx             818329772   4-byte words received


Any idea how to monitor port performance with reliable data?

Thanks in advance,

Kind regards,


Backup/Storage & System Management

Re: Monitoring Brocade Port Load using MRTG/RRDTool

Hello Thierry,

You're right, 32-bit counters wrap back to zero each time they reach the maximum value.

So you would be better off using "portstats64show":

> portstatsshow 1/10
stat_wtx                2486236280  4-byte words transmitted
stat_wrx                373861032   4-byte words received
stat_ftx                1964791586  Frames transmitted
stat_frx                3432227480  Frames received

> portstats64show 1/10
stat64_wtx      5540        top_int : 4-byte words transmitted
                2492715400  bottom_int : 4-byte words transmitted
stat64_wrx      13021       top_int : 4-byte words received
                373942308   bottom_int : 4-byte words received
stat64_ftx      14          top_int : Frames transmitted
                1964806620  bottom_int : Frames transmitted
stat64_frx      53          top_int : Frames received
                3432231841  bottom_int : Frames received
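
A minimal sketch of how the two halves could be combined into one 64-bit value. It assumes that top_int is the upper 32 bits and bottom_int the lower 32 bits of the counter, which is how the output above reads but is not confirmed here; the parser and its name are hypothetical and only handle the layout shown.

```python
import re


def parse_portstats64(text):
    """Parse portstats64show-style text into {counter name: 64-bit value}.

    Assumption: top_int = upper 32 bits, bottom_int = lower 32 bits.
    """
    stats = {}
    name, top = None, None
    for line in text.splitlines():
        m = re.match(r"\s*(stat64_\w+)\s+(\d+)\s+top_int", line)
        if m:
            name, top = m.group(1), int(m.group(2))
            continue
        m = re.match(r"\s*(\d+)\s+bottom_int", line)
        if m and name is not None:
            stats[name] = (top << 32) | int(m.group(1))
            name = None
    return stats


SAMPLE = """\
stat64_wtx      5540        top_int : 4-byte words transmitted
                2492715400  bottom_int : 4-byte words transmitted
stat64_wrx      13021       top_int : 4-byte words received
                373942308   bottom_int : 4-byte words received
"""

counters = parse_portstats64(SAMPLE)
print(counters["stat64_wtx"] * 4, "bytes transmitted")
```

With 64-bit values, the difference between two readings can be taken directly, without worrying about a wrap at realistic polling intervals.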

Hope this helps.



Re: Monitoring Brocade Port Load using MRTG/RRDTool

Hello Christophe,

Thanks for your reply. In the meantime, I fixed the problem:

Assuming a 32-bit TX/RX counter takes 32 seconds to wrap at 4 Gbit/s,

(2^32 words of 4 bytes * 8 bits) / (4 Gbit/s port speed, taken as 4*2^30 bit/s) = 32 s, in other

words, a theoretical port I/O rate of 257 MB/s, I can safely poll each minute,

comparing each value with the previously gathered one.
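
The time a 32-bit word counter takes to wrap at a given throughput can be sketched as follows. This is only an illustration: it takes 4 Gbit/s as 4*2^30 bit/s, as in the calculation above, and ignores any protocol overhead on the link; the function name is hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32   # number of distinct Counter32 values
WORD_BYTES = 4                # the counters count 4-byte words


def seconds_to_wrap(rate_bytes_per_s):
    """Seconds for a 32-bit word counter to go once around at this rate."""
    return COUNTER32_MODULUS * WORD_BYTES / rate_bytes_per_s


line_rate = 4 * 2**30 / 8                 # 4 Gbit/s expressed in bytes/s
print(seconds_to_wrap(line_rate))         # 32.0
print(seconds_to_wrap(125 * 2**20))       # 131.072
```

As long as the time to wrap is longer than the polling interval, the counter can wrap at most once between two readings, and a single wrap can be corrected for.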

If new >= old: fine, no problem, I compute delta = new - old.

If new < old, the counter has wrapped, so I compute delta = (2^32 - old) + new.

I store the last 5 computed deltas, covering the last 5 minutes of I/O.

Then I can deduce the average speed: (sum of the 5 saved 1-minute deltas) * 4 bytes / 300 s.

As long as my port load stays below 125 MB/s, the counter will never

wrap more than once within a minute, so no problem. In case of throughput higher than 125 MB/s,

I just have to query the 32-bit SNMP counters every 30 seconds instead of every minute and

keep 10 collected deltas instead of 5. And it's in the bag. :-)
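
The scheme described above can be sketched as below. This is an illustration rather than the script actually in use: the class and function names are hypothetical, the SNMP query itself is left out, and the delta is only correct if the counter wrapped at most once between two readings.

```python
from collections import deque

COUNTER32_MODULUS = 2 ** 32
WORD_BYTES = 4


def wrap_delta(old, new):
    """Words counted between two readings, assuming at most one wrap.

    Equivalent to: new - old if new >= old, else (2^32 - old) + new.
    """
    return (new - old) % COUNTER32_MODULUS


class PortRate:
    """Rolling average over the last `window` polls (hypothetical helper)."""

    def __init__(self, poll_interval_s=60, window=5):
        self.poll_interval_s = poll_interval_s
        self.deltas = deque(maxlen=window)   # oldest delta drops out automatically
        self.last = None

    def update(self, counter):
        """Feed one raw counter reading; return average bytes/s, or None."""
        if self.last is not None:
            self.deltas.append(wrap_delta(self.last, counter))
        self.last = counter
        if not self.deltas:
            return None
        elapsed = len(self.deltas) * self.poll_interval_s
        return sum(self.deltas) * WORD_BYTES / elapsed
```

Applied to the two stat_wtx readings from the first post (4289468676, then 25652200 ten seconds later), wrap_delta gives 31150820 words, i.e. about 12.5 MB/s over those 10 seconds. For loads above 125 MB/s, PortRate(poll_interval_s=30, window=10) would match the 30-second variant.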

Thanks again for your answer.


