Fibre Channel (SAN)

Occasional Contributor
Posts: 8
Registered: ‎01-10-2014

enc_out errors on Inter-site SAN connections

 

We have two DCX8510-4 SAN switches running FOS 7.1.0c at both the local and the remote datacentre sites.

 

All DCX8510-4 SAN switches have FC16-48 Blades fitted.

 

The intersite connections are via Ciena DWDM and ADVA DWDM chassis. Both are managed services, from Vodafone and BT respectively.

 

The intersite links are routed, i.e. we have Integrated Routing configured at the local site on both DCX switches.

 

Although we are using FC16-48 Blades they are fitted with 8Gbps SFP+ transceivers. In the ports used for the inter site connections these are fixed at 2Gbps.

 

The Ciena circuit is OK with no errors.

 

The ADVA circuit has incrementing enc_out errors.

 

We had BT check their ADVA chassis and circuits and they reported no errors.

 

We have inspected and cleaned all cable connections at both sites and the errors persist.

 

I noticed the following in the Brocade SAN Admin Best practice paper (Page 8, fabric Configuration) which seems to fit our scenario:

 

Traffic outside of frame traffic is made up of fill words: IDLEs or ARB (F0) or ARB (FF). Encoding errors on fill words
are generally not considered impactful. This is why you may see very high counts of enc_out (encoding outside of the
frame) and not have customer traffic affected. If many fill words are lost at once, the link may lose synchronization.

 

So my question is "Is it safe to run with these enc_out errors?"

 

External Moderator
Posts: 4,857
Registered: ‎02-23-2004

Re: enc_out errors on Inter-site SAN connections

Thomas,

 

The fillword setting is no longer available on Gen 5 platforms.

 

Have you tried setting the port speed to FIXED instead of AN (auto-negotiate)?

TechHelp24
Valued Contributor
Posts: 536
Registered: ‎03-20-2011

Re: enc_out errors on Inter-site SAN connections

as Antonio said, there's no fillword setting on the 16G ASIC. but the fillword itself is always being used in communications, of course. however, if we suspect some issue with the fillword, then the error counter that would increase should normally be er_bad_os.

in a plain SAN the enc_out counter is considered a bad sign, and we usually try to investigate and fix the source of these errors. do you see these counters increasing across both long distance connections? in both directions? or maybe you could spot some imbalance between the paths?

another idea that i have is related to some DWDM specifics. these devices are known to break up the data stream between the frames in order to compress it. when the arb(ff) fillword became mainstream at 8G speeds, some of these devices failed to operate because they expected idle primitives to separate the frames. the workaround was to set the fillword to idle. i'm sure that all modern DWDM devices are capable of operating with both kinds of fillword.

but anyway, this makes me think that the DWDM extracts the data frames (leaving out all the fillwords, since obviously you don't want to use expensive long distance equipment to transmit something of no value), performs the compression and sends the resulting data to the opposite device. the DWDM over there receives the compressed data, extracts the data frames, and in order to place them further down the link, it has to insert the fillwords. the FC standard requires at least two fillwords between the frames. who knows, maybe Brocade expects three of them? or maybe the DWDM only inserts one of them, and therefore Brocade detects some inconsistency? i think it would be interesting to insert an FC analyser and look at what actually happens between the DWDM and DCX ports...
Occasional Contributor
Posts: 8
Registered: ‎01-10-2014

Re: enc_out errors on Inter-site SAN connections

Antonio,

 

ports are fixed to 2Gbps at both ends on both circuits

 

rgds

 

Tom

Occasional Contributor
Posts: 8
Registered: ‎01-10-2014

Re: enc_out errors on Inter-site SAN connections

Alexey,

 

the incrementing enc_out errors only appear on the ADVA circuit and only in one direction.

 

Also, we don't have access to an FC analyzer.

 

rgds

 

Tom

Occasional Contributor
Posts: 8
Registered: ‎01-10-2014

Re: enc_out errors on Inter-site SAN connections

Alexey,

 

here is some additional info that we collected when we were testing the end-to-end circuit using porttest and loopback connectors. (I have already sent this to Antonio by email.)

 

We have done some work testing the connections using porttest and breaking into the circuit and inserting loopback connectors.

We have interpreted the results as indicating that the cause of the enc_out errors lies within the DWDM Circuit between both sites.

I have included our results below. The production location is called Cathcart and the remote location is called Kirkintilloch.

I would be interested to know if you concur with our conclusion that the DWDM circuit is the cause of the problem.

Test Results

Here's a quick summary of the tests carried out on the BT circuit and the results observed during the tests.

Test 1:

Port 2/27 persistently disabled on switch FSWCATD51 and a loopback connector plugged into the fibre cable that connects to port 2/27 on switch FSWCATD51 (i.e. the farthest end of the link from Kirkintilloch)
          
     All error counters were cleared on switch FSWKIRKD53 prior to executing the test

     The command 'porttest -ports 2/27' was executed on FSWKIRKD53. This command sends 20 test frames to the port and an extract of the error counters for the port (the port index is 91) is displayed below
        
    FSWKIRKD53:e400022> porterrshow| grep 91
 91:   20     20      0      0      0      0      0      0     15      0      0      0      0      0      0      0      0      0

FSWKIRKD53:e400022> porterrshow| grep 91
 91:   20     20      0      0      0      0      0      0     21      0      0      0      0      0      0      0      0      0

As can be seen from the 9th column along, enc_out errors were seen to increment as a result of executing the 'porttest' command
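The counter comparison described above can be sketched in Python. This is only an illustration, not a Brocade tool: the function names are made up for this example, the enc_out position (9th counter after the port index) is taken from the statement above, and it assumes the counters appear as plain integers as in the extract (abbreviated values such as "1.2k" are not handled).

```python
ENC_OUT_COLUMN = 9  # 1-based position after the port index, as stated in the post


def parse_porterrshow_row(line):
    """Split one porterrshow data row into (port_index, list_of_counters)."""
    index_part, _, counters_part = line.partition(":")
    counters = [int(token) for token in counters_part.split()]
    return int(index_part.strip()), counters


def enc_out_delta(first_line, second_line):
    """Return how much enc_out grew between two samples of the same port."""
    port_a, counters_a = parse_porterrshow_row(first_line)
    port_b, counters_b = parse_porterrshow_row(second_line)
    if port_a != port_b:
        raise ValueError("samples belong to different ports")
    return counters_b[ENC_OUT_COLUMN - 1] - counters_a[ENC_OUT_COLUMN - 1]


# The two Test 1 samples for port index 91, copied from the output above.
sample_1 = " 91:   20     20      0      0      0      0      0      0     15      0      0      0      0      0      0      0      0      0"
sample_2 = " 91:   20     20      0      0      0      0      0      0     21      0      0      0      0      0      0      0      0      0"

print(enc_out_delta(sample_1, sample_2))  # prints 6
```

Between the two samples the frame counters stay at 20 while enc_out rises from 15 to 21, which is the increment referred to above.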

'porttestshow' was also executed and this reported that the test had passed with no errors as below

FSWKIRKD53:e400022> porttestshow -ports 2/27
Port 91 : PASS
PortType: LOOPBACK PORT            PortState: TEST DONE
PortInternalState: INIT                    PortTypeToTest: NO_TEST
Pattern: 0xb            Seed: 0xaa           UserDelay: 10
TotalIteration: 20                 CurrentIteration: 20
TotalFail: 0                       ConsecutiveFail: 0
StartTime: Mon Feb 02 13:29:00 2015
StopTime:  Mon Feb 02 13:29:06 2015
Timeout: 0                         ErrorCode: 0

Test 2:

All error counters were cleared on switch FSWKIRKD53 and port 2/27 was disabled.

The loopback connector was plugged into the sfp in the ADVA patch panel in Cathcart and port 2/27 was enabled on switch FSWKIRKD53.

The 'porttest -ports 2/27' command was executed on switch FSWKIRKD53 and, again, enc_out errors were observed to increment on switch FSWKIRKD53

FSWKIRKD53:e400022> porterrshow| grep 91
 91:   24     24      0      0      0      0      0      0     22      0      0      0      0      0      0      0      0      0

FSWKIRKD53:e400022> porterrshow| grep 91
 91:   24     24      0      0      0      0      0      0     25      0      0      0      0      0      0      0      0      0

The 'porttestshow' command once again indicated that the test passed with no errors

FSWKIRKD53:e400022> porttestshow -ports 2/27
Port 91 : PASS
PortType: LOOPBACK PORT            PortState: TEST DONE
PortInternalState: INIT                    PortTypeToTest: NO_TEST
Pattern: 0xb            Seed: 0xaa           UserDelay: 10
TotalIteration: 20                 CurrentIteration: 20
TotalFail: 0                       ConsecutiveFail: 0
StartTime: Mon Feb 02 13:36:34 2015
StopTime:  Mon Feb 02 13:36:42 2015
Timeout: 0                         ErrorCode: 0

Test 3:

All error counters were cleared on switch  FSWKIRKD53. Port 2/27 was disabled on switch FSWKIRKD53.

The loopback connector was plugged into the 'attenuating' fibre cable that connects to the sfp in the ADVA patch panel in Kirkintilloch and port 2/27 was enabled on switch FSWKIRKD53.

The 'porttest -ports 2/27' was executed on switch FSWKIRKD53. No 'enc_out' errors were observed on this occasion, as per the extract below

FSWKIRKD53:e400022> porterrshow | grep 91
91:   13     13      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0

FSWKIRKD53:e400022> porterrshow | grep 91
91:   24     24      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0

The 'porttestshow' output indicated the test had also passed.

FSWKIRKD53:e400022> porttestshow -ports 2/27
Port 91 : PASS
PortType: LOOPBACK PORT            PortState: TEST DONE
PortInternalState: INIT                    PortTypeToTest: NO_TEST
Pattern: 0xb            Seed: 0xaa           UserDelay: 10
TotalIteration: 20                 CurrentIteration: 20
TotalFail: 0                       ConsecutiveFail: 0
StartTime: Tue Feb 03 10:20:15 2015
StopTime:  Tue Feb 03 10:20:23 2015
Timeout: 0                         ErrorCode: 0

Given that no errors were observed testing from just before the ADVA kit back to the switch port in Kirkintilloch and errors were observed when testing from the ADVA kit in Cathcart back to the switch port in Kirkintilloch, this would seem to suggest that the issue lies somewhere in the BT ADVA circuit between the sites.
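The reasoning behind this conclusion can be written down as a small Python sketch. It is an illustration only: the function name and the location labels are invented for this example, and the counts are the final enc_out values shown in the three test extracts (counters were cleared before each test). The loopback points are listed from nearest to farthest from the Kirkintilloch switch port, and the fault is placed between the last clean point and the first point that shows errors.

```python
def first_faulty_segment(observations, start="Kirkintilloch switch port 2/27"):
    """Locate the segment where enc_out errors first appear.

    observations: list of (loopback_location, enc_out_count) ordered from
    nearest to farthest from the port running the test.
    Returns (last_clean_location, first_erroring_location), or None if
    no loopback point showed errors.
    """
    last_clean = start
    for location, enc_out_count in observations:
        if enc_out_count > 0:
            return (last_clean, location)
        last_clean = location
    return None


# Final enc_out values from the extracts above (labels are illustrative).
results = [
    ("fibre to ADVA patch panel, Kirkintilloch", 0),   # Test 3
    ("sfp in ADVA patch panel, Cathcart", 25),         # Test 2
    ("fibre to port 2/27 on FSWCATD51, Cathcart", 21), # Test 1
]

print(first_faulty_segment(results))
```

With these values the function returns the pair of ADVA patch panel locations, i.e. the segment that crosses the BT circuit.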

thanks for your help

best regards

 

Tom

Valued Contributor
Posts: 536
Registered: ‎03-20-2011

Re: enc_out errors on Inter-site SAN connections

i totally agree, in this case you have some kind of error condition between the ADVA devices. what's interesting is why the errors only appear between the frames. i'd think that this is something logical rather than hardware.

regarding dport tests that show success while error counters increase: we had a case like that ~6 months ago, and the outcome was that brocade confirmed some defects in the dport test code and committed to fixing them in 7.2. we are still on 7.1 and couldn't confirm whether this was really fixed or not.
