Ethernet Switches & Routers

Occasional Contributor
Posts: 7
Registered: ‎08-09-2012

BI-4XG high CPU load

Hello All.

I'm having a problem with an RX4 / BI-4XG.  The load on the LP sits at a constant 20-30 with spikes to 70-100, dropping packets and increasing latency when it gets there.  This RX has two 4XGs: one with the connection to the upstream router, and the other with connections to SuperXs handling about 300 VLANs.  The card with the problem is the one connecting to the upstream router.

I've been able to do a "show packet capture" on the LP and narrow down some offending traffic.  Filtering it or null-routing the server helps bring the CPU utilization down to the teens, but then other traffic takes its place.  It's been a constant fight.

The servers on this switch are not new; they've been on BI4000s (JetCore) for a while and had no problems.  We are in the process of replacing our older Foundry equipment with current Brocade models but are, so far, getting hammered by CPU utilization issues.  We actually moved one server that kept showing up back to a BI4000 and it's fine.

Any ideas would be great, as would any diagnostic processes I can use.  I'm thinking maybe I have a bad card.  Has anyone seen this before with bad cards?  The CPU utilization on the card with the VLANs hangs around 3-4; I would expect that to be higher.  Maybe I've got a cache setting that's too small, although I can't find anything that sticks out.

Super Contributor
Posts: 1,087
Registered: ‎12-13-2009

Re: BI-4XG high CPU load

Hi aaron4,

     that card has two packet processors (one for ports 1 and 2, one for ports 3 and 4).  Try moving the ports to different numbers and see if the issue is still there.  Also, you can rconsole into the line card and check stats from there.

Thanks

Michael.

Occasional Contributor
Posts: 7
Registered: ‎08-09-2012

Re: BI-4XG high CPU load

Hi Michael,

Thanks for the reply.  I moved the lines from ports 1 and 2 to ports 3 and 4.  No change.

When I rconsole into the line card, the only command that really gives me any info is a debug packet capture.  Unfortunately the output, while sometimes giving me a clue when there are a lot of packets crossing the card, doesn't normally give me a good idea.  On older Foundry devices I can use a dm raw and get a good picture, but that doesn't seem to work on an RX.

What other commands from the LP would you suggest to get a better idea about what's hitting the CPU?

Aaron

Super Contributor
Posts: 1,087
Registered: ‎12-13-2009

Re: BI-4XG high CPU load

Mate,

     I suggest you look for CRC errors on the ports using show int from inside the rconsole.

Also you might want to check the fabric out using

“show snm-links by-snm all”

“show snm-links by-lp all”

“dm rw-snm all all get-link-status”

“show tm stat all”

“show tm non-empty-queues”


These commands are applicable from 2.600c code onwards; I'm not sure what version you are running. 


“show sysmon log”

“show sysmon counters snm all”

“show sysmon counters lp all”

“show system fabric-errors”


Hope that helps some, however if you have support, I suggest you log a TAC call.


Thanks

Michael.

Occasional Contributor
Posts: 7
Registered: ‎08-09-2012

Re: BI-4XG high CPU load

Here is my sh int.  I don't see any errors.

telnet@DIST-1.1#sh int e 1/1

10GigabitEthernet1/1 is up, line protocol is up

  Hardware is 10GigabitEthernet, address is 000c.dbf4.4300 (bia 000c.dbf4.4300)

  Configured speed 10Gbit, actual 10Gbit, configured duplex fdx, actual fdx

  Configured mdi mode AUTO, actual MDI

  Member of L2 VLAN ID 4000, port is untagged, port state is Forwarding

  STP configured to ON, Priority is level0, flow control enabled

  Force-DSCP disabled

  mirror disabled, monitor disabled

  Member of active trunk ports 1/1-1/2, primary port

  Member of configured trunk ports 1/1-1/2, primary port

  No port name

  MTU 1518 bytes, encapsulation ethernet

  300 second input rate: 66481166 bits/sec, 21413 packets/sec, 0.68% utilization

  300 second output rate: 169062220 bits/sec, 23677 packets/sec, 1.71% utilization

  36930037964 packets input, 9822716554193 bytes, 0 no buffer

  Received 269270 broadcasts, 585 multicasts, 36929768109 unicasts

  0 input errors, 0 CRC, 0 frame, 0 ignored

  0 runts, 0 giants, DMA received 36930037964 packets

  44180008867 packets output, 38569744198087 bytes, 0 underruns

  Transmitted 88 broadcasts, 12116 multicasts, 44179996663 unicasts

  0 output errors, 0 collisions, DMA transmitted 44180008867 packets

telnet@DIST-1.1#sh int e 1/2

10GigabitEthernet1/2 is up, line protocol is up

  Hardware is 10GigabitEthernet, address is 000c.dbf4.4300 (bia 000c.dbf4.4300)

  Configured speed 10Gbit, actual 10Gbit, configured duplex fdx, actual fdx

  Configured mdi mode AUTO, actual MDI

  Member of L2 VLAN ID 4000, port is untagged, port state is Forwarding

  STP configured to ON, Priority is level0, flow control enabled

  Force-DSCP disabled

  mirror disabled, monitor disabled

  Member of active trunk ports 1/1-1/2, secondary port, primary port is 1/1

  Member of configured trunk ports 1/1-1/2, secondary port, primary port is 1/1

  No port name

  MTU 1518 bytes, encapsulation ethernet

  300 second input rate: 73650750 bits/sec, 20534 packets/sec, 0.75% utilization

  300 second output rate: 138829976 bits/sec, 20964 packets/sec, 1.40% utilization

  37762810593 packets input, 10921842269090 bytes, 0 no buffer

  Received 8546494 broadcasts, 23490 multicasts, 37754240609 unicasts

  0 input errors, 0 CRC, 0 frame, 0 ignored

  0 runts, 0 giants, DMA received 37762810593 packets

  39336117431 packets output, 32937875113997 bytes, 0 underruns

  Transmitted 5 broadcasts, 14 multicasts, 39336117412 unicasts

  0 output errors, 0 collisions, DMA transmitted 39336117431 packets
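Rather than eyeballing those counters port by port, a small script can flag any nonzero error field in pasted show int output. This is just a sketch of mine (not a Brocade tool), and it assumes the field labels appear exactly as in the output above:

```python
import re

# Error counters worth flagging; names match the field labels
# in the "show interface" output above.
ERROR_FIELDS = ["input errors", "CRC", "frame", "runts", "giants",
                "output errors", "collisions"]

def find_errors(show_int_text):
    """Return {field: count} for every nonzero error counter."""
    errors = {}
    for field in ERROR_FIELDS:
        # Matches e.g. "0 input errors" or "12 CRC"
        m = re.search(r"(\d+) " + re.escape(field), show_int_text)
        if m and int(m.group(1)) > 0:
            errors[field] = int(m.group(1))
    return errors

sample = """
  0 input errors, 0 CRC, 0 frame, 0 ignored
  3 output errors, 0 collisions
"""
print(find_errors(sample))  # {'output errors': 3}
```

Paste the full show int output for each port in and anything nonzero jumps out immediately.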

Here are the results of the fabric commands:

telnet@DIST-1.1#sh snm-links by-snm all

SNM 1/FE1/Link 2 -- Slot 1/FAP3/Link 8 (N): up

SNM 1/FE1/Link 4 -- Slot 1/FAP4/Link 8 (N): up

SNM 1/FE1/Link 6 -- Slot 2/FAP3/Link 8 (N): up

SNM 1/FE1/Link 8 -- Slot 2/FAP4/Link 8 (N): up

SNM 1/FE1/Link10 -- Slot 1/FAP1/Link 8 (N): up

SNM 1/FE1/Link12 -- Slot 1/FAP2/Link 8 (N): up

SNM 1/FE1/Link14 -- Slot 2/FAP1/Link 8 (N): up

SNM 1/FE1/Link16 -- Slot 2/FAP2/Link 8 (N): up

SNM 1/FE1/Link17 -- Slot 2/FAP2/Link 9 (N): up

SNM 1/FE1/Link19 -- Slot 2/FAP1/Link 9 (N): up

SNM 1/FE1/Link21 -- Slot 1/FAP2/Link 9 (N): up

SNM 1/FE1/Link23 -- Slot 1/FAP1/Link 9 (N): up

SNM 1/FE1/Link25 -- Slot 2/FAP4/Link 9 (N): up

SNM 1/FE1/Link27 -- Slot 2/FAP3/Link 9 (N): up

SNM 1/FE1/Link29 -- Slot 1/FAP4/Link 9 (N): up

SNM 1/FE1/Link31 -- Slot 1/FAP3/Link 9 (N): up

SNM 1/FE1/Link34 -- Slot 1/FAP3/Link 4 (A): up

SNM 1/FE1/Link36 -- Slot 1/FAP4/Link 4 (A): up

SNM 1/FE1/Link38 -- Slot 2/FAP3/Link 4 (A): up

SNM 1/FE1/Link40 -- Slot 2/FAP4/Link 4 (A): up

SNM 1/FE1/Link42 -- Slot 1/FAP1/Link 4 (A): up

SNM 1/FE1/Link44 -- Slot 1/FAP2/Link 4 (A): up

SNM 1/FE1/Link46 -- Slot 2/FAP1/Link 4 (A): up

SNM 1/FE1/Link48 -- Slot 2/FAP2/Link 4 (A): up

SNM 1/FE1/Link49 -- Slot 2/FAP2/Link 5 (A): up

SNM 1/FE1/Link51 -- Slot 2/FAP1/Link 5 (A): up

SNM 1/FE1/Link53 -- Slot 1/FAP2/Link 5 (A): up

SNM 1/FE1/Link55 -- Slot 1/FAP1/Link 5 (A): up

SNM 1/FE1/Link57 -- Slot 2/FAP4/Link 5 (A): up

SNM 1/FE1/Link59 -- Slot 2/FAP3/Link 5 (A): up

SNM 1/FE1/Link61 -- Slot 1/FAP4/Link 5 (A): up

SNM 1/FE1/Link63 -- Slot 1/FAP3/Link 5 (A): up

SNM 2/FE1/Link 2 -- Slot 1/FAP3/Link 6 (N): up

SNM 2/FE1/Link 4 -- Slot 1/FAP4/Link 6 (N): up

SNM 2/FE1/Link 6 -- Slot 2/FAP3/Link 6 (N): up

SNM 2/FE1/Link 8 -- Slot 2/FAP4/Link 6 (N): up

SNM 2/FE1/Link10 -- Slot 1/FAP1/Link 6 (N): up

SNM 2/FE1/Link12 -- Slot 1/FAP2/Link 6 (N): up

SNM 2/FE1/Link14 -- Slot 2/FAP1/Link 6 (N): up

SNM 2/FE1/Link16 -- Slot 2/FAP2/Link 6 (N): up

SNM 2/FE1/Link17 -- Slot 2/FAP2/Link 7 (N): up

SNM 2/FE1/Link19 -- Slot 2/FAP1/Link 7 (N): up

SNM 2/FE1/Link21 -- Slot 1/FAP2/Link 7 (N): up

SNM 2/FE1/Link23 -- Slot 1/FAP1/Link 7 (N): up

SNM 2/FE1/Link25 -- Slot 2/FAP4/Link 7 (N): up

SNM 2/FE1/Link27 -- Slot 2/FAP3/Link 7 (N): up

SNM 2/FE1/Link29 -- Slot 1/FAP4/Link 7 (N): up

SNM 2/FE1/Link31 -- Slot 1/FAP3/Link 7 (N): up

SNM 2/FE1/Link34 -- Slot 1/FAP3/Link 3 (A): up

SNM 2/FE1/Link36 -- Slot 1/FAP4/Link 3 (A): up

SNM 2/FE1/Link38 -- Slot 2/FAP3/Link 3 (A): up

SNM 2/FE1/Link40 -- Slot 2/FAP4/Link 3 (A): up

SNM 2/FE1/Link42 -- Slot 1/FAP1/Link 3 (A): up

SNM 2/FE1/Link44 -- Slot 1/FAP2/Link 3 (A): up

SNM 2/FE1/Link46 -- Slot 2/FAP1/Link 3 (A): up

SNM 2/FE1/Link48 -- Slot 2/FAP2/Link 3 (A): up

telnet@DIST-1.1#sh snm-links by-lp all
Slot 1/FAP1/Link 3 (A)-- SNM2/FE1/Link42 : up
Slot 1/FAP1/Link 4 (A)-- SNM1/FE1/Link42 : up
Slot 1/FAP1/Link 5 (A)-- SNM1/FE1/Link55 : up
Slot 1/FAP1/Link 6 (N)-- SNM2/FE1/Link10 : up
Slot 1/FAP1/Link 7 (N)-- SNM2/FE1/Link23 : up
Slot 1/FAP1/Link 8 (N)-- SNM1/FE1/Link10 : up
Slot 1/FAP1/Link 9 (N)-- SNM1/FE1/Link23 : up

Slot 1/FAP2/Link 3 (A)-- SNM2/FE1/Link44 : up
Slot 1/FAP2/Link 4 (A)-- SNM1/FE1/Link44 : up
Slot 1/FAP2/Link 5 (A)-- SNM1/FE1/Link53 : up
Slot 1/FAP2/Link 6 (N)-- SNM2/FE1/Link12 : up
Slot 1/FAP2/Link 7 (N)-- SNM2/FE1/Link21 : up
Slot 1/FAP2/Link 8 (N)-- SNM1/FE1/Link12 : up
Slot 1/FAP2/Link 9 (N)-- SNM1/FE1/Link21 : up

Slot 1/FAP3/Link 3 (A)-- SNM2/FE1/Link34 : up
Slot 1/FAP3/Link 4 (A)-- SNM1/FE1/Link34 : up
Slot 1/FAP3/Link 5 (A)-- SNM1/FE1/Link63 : up
Slot 1/FAP3/Link 6 (N)-- SNM2/FE1/Link 2 : up
Slot 1/FAP3/Link 7 (N)-- SNM2/FE1/Link31 : up
Slot 1/FAP3/Link 8 (N)-- SNM1/FE1/Link 2 : up
Slot 1/FAP3/Link 9 (N)-- SNM1/FE1/Link31 : up

Slot 1/FAP4/Link 3 (A)-- SNM2/FE1/Link36 : up
Slot 1/FAP4/Link 4 (A)-- SNM1/FE1/Link36 : up
Slot 1/FAP4/Link 5 (A)-- SNM1/FE1/Link61 : up
Slot 1/FAP4/Link 6 (N)-- SNM2/FE1/Link 4 : up
Slot 1/FAP4/Link 7 (N)-- SNM2/FE1/Link29 : up
Slot 1/FAP4/Link 8 (N)-- SNM1/FE1/Link 4 : up
Slot 1/FAP4/Link 9 (N)-- SNM1/FE1/Link29 : up

Slot 2/FAP1/Link 3 (A)-- SNM2/FE1/Link46 : up
Slot 2/FAP1/Link 4 (A)-- SNM1/FE1/Link46 : up
Slot 2/FAP1/Link 5 (A)-- SNM1/FE1/Link51 : up
Slot 2/FAP1/Link 6 (N)-- SNM2/FE1/Link14 : up
Slot 2/FAP1/Link 7 (N)-- SNM2/FE1/Link19 : up
Slot 2/FAP1/Link 8 (N)-- SNM1/FE1/Link14 : up
Slot 2/FAP1/Link 9 (N)-- SNM1/FE1/Link19 : up

Slot 2/FAP2/Link 3 (A)-- SNM2/FE1/Link48 : up
Slot 2/FAP2/Link 4 (A)-- SNM1/FE1/Link48 : up
Slot 2/FAP2/Link 5 (A)-- SNM1/FE1/Link49 : up
Slot 2/FAP2/Link 6 (N)-- SNM2/FE1/Link16 : up
Slot 2/FAP2/Link 7 (N)-- SNM2/FE1/Link17 : up
Slot 2/FAP2/Link 8 (N)-- SNM1/FE1/Link16 : up
Slot 2/FAP2/Link 9 (N)-- SNM1/FE1/Link17 : up

Slot 2/FAP3/Link 3 (A)-- SNM2/FE1/Link38 : up
Slot 2/FAP3/Link 4 (A)-- SNM1/FE1/Link38 : up
Slot 2/FAP3/Link 5 (A)-- SNM1/FE1/Link59 : up
Slot 2/FAP3/Link 6 (N)-- SNM2/FE1/Link 6 : up
Slot 2/FAP3/Link 7 (N)-- SNM2/FE1/Link27 : up
Slot 2/FAP3/Link 8 (N)-- SNM1/FE1/Link 6 : up
Slot 2/FAP3/Link 9 (N)-- SNM1/FE1/Link27 : up

Slot 2/FAP4/Link 3 (A)-- SNM2/FE1/Link40 : up
Slot 2/FAP4/Link 4 (A)-- SNM1/FE1/Link40 : up
Slot 2/FAP4/Link 5 (A)-- SNM1/FE1/Link57 : up
Slot 2/FAP4/Link 6 (N)-- SNM2/FE1/Link 8 : up
Slot 2/FAP4/Link 7 (N)-- SNM2/FE1/Link25 : up
Slot 2/FAP4/Link 8 (N)-- SNM1/FE1/Link 8 : up
Slot 2/FAP4/Link 9 (N)-- SNM1/FE1/Link25 : up

telnet@DIST-1.1#dm rw-snm all all get-link-status

SNM0/FE0:Link1: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link3: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link5: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link7: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link9: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link11: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link13: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link15: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link16: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link18: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link20: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link22: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link24: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link26: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link28: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link30: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link33: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link35: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link37: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link39: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link41: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link43: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link45: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link47: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link48: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link50: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link52: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link54: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link56: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link58: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link60: Sig Lock Yes, Leaky Bucket 63

SNM0/FE0:Link62: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link1: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link3: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link5: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link7: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link9: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link11: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link13: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link15: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link16: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link18: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link20: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link22: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link24: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link26: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link28: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link30: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link33: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link35: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link37: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link39: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link41: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link43: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link45: Sig Lock Yes, Leaky Bucket 63

SNM1/FE0:Link47: Sig Lock Yes, Leaky Bucket 63

I'm not seeing anything abnormal.

I don't have a support contract.  Years ago I was an Extreme Networks shop, and if you didn't have a support contract with them you could still open a ticket and they would charge you on a per-ticket basis.  It wasn't cheap; I did it a couple of times and it was $500 per ticket.  Does Brocade have anything like that?

Occasional Contributor
Posts: 7
Registered: ‎08-09-2012

Re: BI-4XG high CPU load

In my travels I found this:

LP-1#sh ip nexthop

Paths  Total  Free  In-use
  1    2816   2      2814
  2    512    0      512
  4    512    0      512
  8    256    0      256

All my other RX's look like this or similar:

LP-2#sh ip nexthop

Paths  Total  Free  In-use
  1    2816   1169   1647
  2    512    512    0
  4    512    512    0
  8    256    240    16

It looks to me like I'm running out of space on the LP.  There is a cam-partition command to change it, but I'm not familiar with it or its effects.  Any idea if this might be my cause?
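As a sanity check, the output above can be reduced to utilization percentages with a few lines of Python. This is just my own sketch, and it assumes the four-column Paths/Total/Free/In-use layout shown:

```python
def nexthop_utilization(output):
    """Parse 'show ip nexthop' style output into {paths: % in use}."""
    util = {}
    for line in output.splitlines():
        cols = line.split()
        # Data rows have exactly four numeric columns; skip the header.
        if len(cols) == 4 and cols[0].isdigit():
            paths, total, free, in_use = (int(c) for c in cols)
            util[paths] = 100.0 * in_use / total
    return util

lp1 = """Paths  Total  Free  In-use
  1    2816   2      2814
  2    512    0      512
  4    512    0      512
  8    256    0      256"""

for paths, pct in nexthop_utilization(lp1).items():
    print(f"{paths}-path table: {pct:.1f}% used")
```

On LP-1 every table comes out at or near 100%, while the healthy LP-2 numbers show the 2- and 4-path tables completely free.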

Occasional Contributor
Posts: 7
Registered: ‎08-09-2012

Re: BI-4XG high CPU load

I rebooted the router (tough, since there are several hundred customers on it) and watched the next-hop counters.  After about 5 minutes they looked like this:

LP-1#sh ip nexthop

Paths  Total  Free  In-use
  1    2816   0      2816
  2    512    0      512
  4    512    0      512
  8    256    0      256

With 0 free, the CPU started to climb slightly, but not significantly.  However, I'd like to find out what's eating that space, since it's not happening on any other RX I have and I'm having issues with this box anyway.
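To put a rough number on the churn, here is a back-of-the-envelope calculation from the counts above (the 2816-entry single-path table consumed in roughly 5 minutes; the numbers and the window are from my observation, not anything exact):

```python
def fill_rate(entries_used, minutes):
    # Entries consumed per minute since reboot -- a crude churn estimate
    # for comparing this LP against a healthy one.
    return entries_used / minutes

# The 2816-entry single-path table filled in about 5 minutes:
print(f"~{fill_rate(2816, 5):.0f} entries/min")  # ~563 entries/min
```

If other RX boxes fill far more slowly after a reboot, whatever is creating those entries is specific to this one.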

Occasional Contributor
Posts: 7
Registered: ‎08-09-2012

Re: BI-4XG high CPU load

So, just to update this thread for anyone else having these issues:

I was running an older software version, 2.4.x.  Upgrading to 2.7.x helped some.  It seems the next-hop entries on the card weren't aging out fast enough in 2.4, causing them to pile up and the LP CPU to work harder.

2.7 ages them out quicker; I don't know if it's a shorter timer or if it's based on how full the table is getting.  Brocade's online manuals also say these tables are adjustable, which they are.  The manual states the upper limit is 4096 total; however, the command help on the router gives me a number over 8000.  Upon trying to max it out I got errors about other limits, so I set the 1-path table to 5500, which is as high as I could go without a warning.

After doing this, rebooting the router, and leaving it for a few days, the average LP CPU load is at a comfortable 4-5, where it had been 30-70.

I am very interested in hearing anyone else's experience and tips for optimizing the RX platform for larger numbers of directly connected networks.  By larger I mean 3,000-4,000 VLANs with 50,000+ IP addresses.
