GK
Contributor
Posts: 28
Registered: ‎04-12-2010

CRC Errors on E_Port

I have two 4G FC DWDM circuits connected to ports on the same ASIC and trunked together. They connect two data centers (DCs) that share resources between them.

I am seeing a large and increasing number of CRC and Invalid_words errors on the E_Ports (the interconnects between the DCs) where the DWDM link is connected.

Actions taken :

Replaced the FC cables everywhere

Replaced the SFPs on the switch end

Checked the DWDM link; it looks fine

Checked hardware compatibility between the SAN switch and the MUX

Are there any other options to stop these CRC and Invalid_words errors at the port end?

Frequent Contributor
Posts: 140
Registered: ‎02-27-2008

Re: CRC Errors on E_Port

Hi GK

What you have done looks good. Did you also clear the port stats on those E_Ports?

portstatsclear "portnumber"

Regards

David

GK
Contributor
Posts: 28
Registered: ‎04-12-2010

Re: CRC Errors on E_Port

Yes, I did clear portstats on all those ports.

Super Contributor
Posts: 260
Registered: ‎04-09-2008

Re: CRC Errors on E_Port

CRC errors can traverse the fabric and even propagate through the DWDM link between the DCs. So a port generating CRC errors in one DC can cause errors to be observed on ports across all ISLs in the other DC.

You will have to look at incrementing counters on all switches, clear stats and keep looking for the culprit port.
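
A minimal sketch of that "clear, wait, compare" loop, assuming you have already collected the counters into dictionaries yourself (for example by copying values from porterrshow). The snapshot shape, the function name `incrementing_ports` and the switch names are hypothetical, and nothing here parses real switch output:

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot shape: {(switch_name, port_number): {counter_name: value}}
Snapshot = Dict[Tuple[str, int], Dict[str, int]]


def incrementing_ports(
    before: Snapshot,
    after: Snapshot,
    counters: Tuple[str, ...] = ("crc_err", "enc_in"),
) -> List[Tuple[str, int, str, int]]:
    """Return (switch, port, counter, delta) rows for counters that grew,
    largest delta first. Counters missing from a snapshot count as 0."""
    grown = []
    for (switch, port), new_vals in after.items():
        old_vals = before.get((switch, port), {})
        for name in counters:
            delta = new_vals.get(name, 0) - old_vals.get(name, 0)
            if delta > 0:
                grown.append((switch, port, name, delta))
    return sorted(grown, key=lambda row: row[3], reverse=True)


# Made-up example values, for illustration only.
before = {("dc1_sw1", 3): {"crc_err": 0, "enc_in": 0},
          ("dc2_sw1", 7): {"crc_err": 5, "enc_in": 0}}
after = {("dc1_sw1", 3): {"crc_err": 12, "enc_in": 0},
         ("dc2_sw1", 7): {"crc_err": 5, "enc_in": 0}}
suspects = incrementing_ports(before, after)
```

Ports that show a growing enc_in together with crc_err are the ones worth checking first, since enc_in is only counted on the ingress port where the error is first seen.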

Since this is a lot of effort, take the switches in batches and observe the stats on the ISLs. Excerpts from a recent post of mine might help.

This tip was compiled from various sources, combined with my own understanding. Prefer the CLI for troubleshooting port errors.
The output of porterrshow can be split into two areas:
1. Physical layer issues. These originate at the source and can propagate through the fabric.
enc_in: This counter increments when 8b/10b encoding errors are detected within a frame. enc_in errors are always detected on the ingress port.
crc_err: Indicates corruption within the frame. Always seen on ingress port but will be passed by the switch unaltered through the fabric.
enc_in and/or crc_err = Possible bad media (SFP, cable, patch panel)
bad_eof: After a loss of synchronization error, continuous-mode alignment allows the receiver to re-establish word alignment at any point in the incoming bit stream while the receiver is operational. If such a realignment occurs, detection of the resulting error condition depends on higher-level functions (e.g. invalid CRC, missing EOF).
My take: if you see bad_eof and crc_err incrementing together, replace the SFP as soon as possible.
too_long or too_short: These errors indicate an unreliable link.
enc_out: 8b/10b encoding errors NOT associated with frames (IDLE, R_RDY, and various other primitives). This counter increments during speed negotiation prior to login. Locking a port to a speed supported by the end device can be used to isolate issues.
– Possible bad media (SFP, cable, patch panel)
– Can cause a performance problem due to buffer recovery
disc_c3: Class 3 frame has been discarded because it is not routable to a destination address
– Corrupted or not-online Destination ID (DID)
– Timeout exceeded (Condor ASIC hold time exceeded)
– Counter may increment when FC nodes and/or switches rapidly transition between online and offline; look at fabriclog -s output
2. Link errors. These are point-to-point and do not traverse the fabric.
Link failures - error conditions that cause a port to drop out of an active state
– Requires the reconnecting device to FLOGI back into fabric (No speed negotiation required, since the device does not lose synchronization)
Loss of sync - occurs when bit and word synchronization on the link is lost
Loss of signal – occurs when light or an electrical signal is lost on a link
– Require connected device to renegotiate speed and FLOGI back into fabric
If you experience device connectivity and/or performance issues together with rising link counters, look for:
– bad cables/SFPs/patch-panel connections
– repeating cycles of online/offline states in fabriclog -s output
Once you identify the suspects, use portstats64show and portstatsshow to zero in on the culprit.
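The two groups described above can be written down as a small lookup, which helps when sorting through many ports. This is only a sketch of the grouping given in this post: the key names `link_fail`, `loss_sync` and `loss_sig`, and the function `group_nonzero`, are hypothetical labels chosen for the example, not guaranteed to match the column names your firmware prints.

```python
from typing import Dict, List

# Group 1 in the post: physical layer counters.
PHYSICAL_LAYER = ("enc_in", "crc_err", "bad_eof", "too_long",
                  "too_short", "enc_out", "disc_c3")
# Group 2 in the post: point-to-point link counters (hypothetical key names).
LINK_LEVEL = ("link_fail", "loss_sync", "loss_sig")


def group_nonzero(counters: Dict[str, int]) -> Dict[str, List[str]]:
    """Split the non-zero counters of one port into the post's two groups.
    Counters that belong to neither group are ignored."""
    return {
        "physical_layer": [n for n in PHYSICAL_LAYER if counters.get(n, 0) > 0],
        "link_level": [n for n in LINK_LEVEL if counters.get(n, 0) > 0],
    }


# Made-up counter values for one port, for illustration only.
port_counters = {"crc_err": 40, "enc_in": 0, "bad_eof": 3, "loss_sync": 2}
groups = group_nonzero(port_counters)
```

A port with entries only in the first group and no enc_in may simply be forwarding frames that were corrupted elsewhere, while entries in the second group point at that port's own link.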
If you see errors on an ISL port and want to determine whether the source or the destination SFP is causing them, check the output of "portshow x", where x is the port number.
If the values in the pair "Lr_in" and "Ols_out", and likewise in the pair "Lr_out" and "Ols_in", are roughly equal, that is the normal case.
If one counter is significantly higher than the other, the link problems either reached the switch from outside ("in" > "out") or are caused by the switch ("out" > "in").
Note: If the "Ols_in" value is higher than the "Lr_out" one, then the source of the problem is, in most cases, the attached device (which sends those offline sequences), and the switch responds to them with a link reset.
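
The comparison above can be sketched as a small helper. The post does not say what "significantly higher" means, so the `ratio` and `min_gap` thresholds below are assumptions you should tune, and the function names are hypothetical. The four counter values are expected to be read by hand from the portshow output.

```python
from typing import Dict


def compare_pair(in_count: int, out_count: int,
                 ratio: float = 2.0, min_gap: int = 5) -> str:
    """Classify an in/out counter pair as 'balanced', 'in_higher' or 'out_higher'.
    Assumed rule: the pair is balanced unless the gap is at least min_gap
    and the larger value exceeds ratio times the smaller one."""
    gap = abs(in_count - out_count)
    smaller = min(in_count, out_count)
    if gap < min_gap or gap <= (ratio - 1) * smaller:
        return "balanced"
    return "in_higher" if in_count > out_count else "out_higher"


def assess_link(stats: Dict[str, int]) -> Dict[str, str]:
    """Apply compare_pair to the two pairs named in the post."""
    return {
        "Lr_in/Ols_out": compare_pair(stats["Lr_in"], stats["Ols_out"]),
        "Ols_in/Lr_out": compare_pair(stats["Ols_in"], stats["Lr_out"]),
    }


# Made-up values, for illustration only.
verdict = assess_link({"Lr_in": 40, "Ols_out": 41, "Lr_out": 3, "Ols_in": 60})
```

Following the post, `in_higher` suggests the problem reached the switch from the attached side, `out_higher` suggests the switch side is the cause, and `balanced` is the normal case.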