Brocade Fibre Channel Networking Community

enc_out errors on Inter-site SAN connections

  • 1.  enc_out errors on Inter-site SAN connections

    Posted 02-02-2015 07:32 AM

     

    We have two DCX8510-4 SAN switches running FOS 7.1.0c at both the local and the remote datacentre sites.

     

    All DCX8510-4 SAN switches have FC16-48 Blades fitted.

     

    The inter-site connections run over Ciena DWDM and ADVA DWDM chassis. Both are managed services, from Vodafone and BT respectively.

     

    The inter-site links are routed, i.e. we have Integrated Routing configured at the local site on both DCX switches.

     

    Although we are using FC16-48 blades, they are fitted with 8Gbps SFP+ transceivers. The ports used for the inter-site connections are fixed at 2Gbps.

     

    The Ciena circuit is OK with no errors.

     

    The ADVA circuit has incrementing enc_out errors.

     

    We had BT check their ADVA chassis and circuits, and they reported no errors.

     

    We have inspected and cleaned all cable connections at both sites and the errors persist.

     

    I noticed the following in the Brocade SAN Admin Best Practice paper (page 8, Fabric Configuration), which seems to fit our scenario:

     

    Traffic outside of frame traffic is made up of fill words: IDLEs or ARB (F0) or ARB (FF). Encoding errors on fill words
    are generally not considered impactful. This is why you may see very high counts of enc_out (encoding outside of the
    frame) and not have customer traffic affected. If many fill words are lost at once, the link may lose synchronization.

     

    So my question is "Is it safe to run with these enc_out errors?"

     


    #BrocadeFibreChannelNetworkingCommunity


  • 2.  Re: enc_out errors on Inter-site SAN connections

    Posted 02-02-2015 09:22 AM

    Thomas,

     

    the fillword setting is no longer available on Gen5 platforms.

     

    have you tried setting the port speed to fixed instead of auto-negotiate (AN)?




  • 3.  Re: enc_out errors on Inter-site SAN connections

    Posted 02-03-2015 01:59 AM

    Antonio,

     

    ports are fixed to 2Gbps at both ends on both circuits

     

    rgds

     

    Tom




  • 4.  Re: enc_out errors on Inter-site SAN connections

    Posted 02-02-2015 09:56 AM
    as Antonio said, there's no fillword setting on the 16G ASIC, but the fillword itself is of course still used on the link. however, if we suspected an issue with the fillword, the error counter that would normally increase is er_bad_os.

    in a plain SAN an increasing enc_out counter is considered a bad sign, and we usually try to investigate and fix the source of these errors. do you see these counters increasing across both long distance connections? in both directions? or maybe you can spot some imbalance between the paths?

    another idea that i have is related to some DWDM specifics. these devices are known to break up the data stream between the frames in order to compress it. when the arb(ff) fillword became mainstream at 8G speeds, some of these devices failed to operate because they expected idle primitives to separate the frames. the workaround was to set the fillword to idle. i'm sure that all modern DWDM devices are capable of operating with both kinds of fillword. but anyway, this makes me think that the DWDM extracts the data frames (leaving out all the fillwords, since you obviously don't want to spend expensive long-distance capacity on transmitting something of no value), performs the compression, and sends the resulting data to the opposite device. the DWDM over there receives the compressed data, extracts the data frames, and in order to send them further down the link it has to insert the fillwords again. the FC standard requires at least two fillwords between frames. who knows, maybe Brocade expects three of them? or maybe the DWDM only inserts one? and therefore Brocade detects some inconsistency. i think it would be interesting to insert an FC analyser and look at what actually happens between the DWDM and DCX ports...


  • 5.  Re: enc_out errors on Inter-site SAN connections

    Posted 02-03-2015 02:02 AM

    Alexey,

     

    the incrementing enc_out errors only appear on the ADVA circuit and only in one direction.

     

    Also, we don't have access to an FC analyser.

     

    rgds

     

    Tom




  • 6.  Re: enc_out errors on Inter-site SAN connections

    Posted 02-04-2015 03:32 AM

    Alexey,

     

    here is some additional info that we collected when we were testing the end-to-end circuit using porttest and loopback connectors. (I have already sent this to Antonio by email.)

     

    We have done some work testing the connections using porttest and breaking into the circuit and inserting loopback connectors.

    We have interpreted the results as indicating that the cause of the enc_out errors lies within the DWDM Circuit between both sites.

    I have included our results below. The production location is called Cathcart and the remote location is called Kirkintilloch.

    I would be interested to know if you concur with our conclusion that the DWDM circuit is the cause of the problem.

    Test Results

    Here's a quick summary of the tests carried out on the BT circuit and the results observed during the tests.

    Test 1:

    Port 2/27 persistently disabled on switch FSWCATD51 and a loopback connector plugged into the fibre cable that connects to port 2/27 on switch FSWCATD51 (i.e. the farthest end of the link from Kirkintilloch)
              
         All error counters were cleared on switch FSWKIRKD53 prior to executing the test

         The command 'porttest -ports 2/27' was executed on FSWKIRKD53. This command sends 20 test frames to the port and an extract of the error counters for the port (the port index is 91) is displayed below
            
        FSWKIRKD53:e400022> porterrshow| grep 91
     91:   20     20      0      0      0      0      0      0     15      0      0      0      0      0      0      0      0      0

    FSWKIRKD53:e400022> porterrshow| grep 91
     91:   20     20      0      0      0      0      0      0     21      0      0      0      0      0      0      0      0      0

    As can be seen from the 9th column along, enc_out errors were seen to increment as a result of executing the 'porttest' command
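
    The column reading above can be automated. What follows is a minimal sketch, not a Brocade tool: it assumes, as stated in the post, that enc_out is the 9th counter after the port index, and the helper names (parse_counters, enc_out) are hypothetical. It only handles plain integer counters; if larger counters are shown in abbreviated form on your FOS release, the parsing would need extending.

```python
# Sketch: pull the enc_out counter out of a 'porterrshow | grep <index>' line.
# Assumption (taken from the post): enc_out is the 9th counter after the index.
ENC_OUT_POSITION = 9  # 1-based position among the counters


def parse_counters(line):
    """Return (port_index, [counters]) for one porterrshow data line."""
    index_part, _, counter_part = line.partition(":")
    return int(index_part.strip()), [int(value) for value in counter_part.split()]


def enc_out(line):
    """Return the enc_out counter from one porterrshow data line."""
    _, counters = parse_counters(line)
    return counters[ENC_OUT_POSITION - 1]


# The two samples quoted in Test 1 for port index 91.
sample_1 = " 91:   20     20      0      0      0      0      0      0     15      0      0      0      0      0      0      0      0      0"
sample_2 = " 91:   20     20      0      0      0      0      0      0     21      0      0      0      0      0      0      0      0      0"

delta = enc_out(sample_2) - enc_out(sample_1)
print(f"port index 91: enc_out rose by {delta} between the two samples")
```

    Comparing two samples taken a known time apart gives a rate rather than a raw count, which is easier to compare between the two circuits.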

    'porttestshow' was also executed and this reported that the test had passed with no errors as below

    FSWKIRKD53:e400022> porttestshow -ports 2/27
    Port 91 : PASS
    PortType: LOOPBACK PORT            PortState: TEST DONE
    PortInternalState: INIT                    PortTypeToTest: NO_TEST
    Pattern: 0xb            Seed: 0xaa           UserDelay: 10
    TotalIteration: 20                 CurrentIteration: 20
    TotalFail: 0                       ConsecutiveFail: 0
    StartTime: Mon Feb 02 13:29:00 2015
    StopTime:  Mon Feb 02 13:29:06 2015
    Timeout: 0                         ErrorCode: 0

    Test 2:

    All error counters were cleared on switch FSWKIRKD53 and port 2/27 was disabled.

    The loopback connector was plugged into the SFP in the ADVA patch panel in Cathcart and port 2/27 was enabled on switch FSWKIRKD53.

    The 'porttest -ports 2/27' command was executed on switch FSWKIRKD53 and, again, enc_out errors were observed to increment on switch FSWKIRKD53

    FSWKIRKD53:e400022> porterrshow| grep 91
     91:   24     24      0      0      0      0      0      0     22      0      0      0      0      0      0      0      0      0

    FSWKIRKD53:e400022> porterrshow| grep 91
     91:   24     24      0      0      0      0      0      0     25      0      0      0      0      0      0      0      0      0

    The 'porttestshow' command once again indicated that the test passed with no errors

    FSWKIRKD53:e400022> porttestshow -ports 2/27
    Port 91 : PASS
    PortType: LOOPBACK PORT            PortState: TEST DONE
    PortInternalState: INIT                    PortTypeToTest: NO_TEST
    Pattern: 0xb            Seed: 0xaa           UserDelay: 10
    TotalIteration: 20                 CurrentIteration: 20
    TotalFail: 0                       ConsecutiveFail: 0
    StartTime: Mon Feb 02 13:36:34 2015
    StopTime:  Mon Feb 02 13:36:42 2015
    Timeout: 0                         ErrorCode: 0

    Test 3:

    All error counters were cleared on switch FSWKIRKD53. Port 2/27 was disabled on switch FSWKIRKD53.

    The loopback connector was plugged into the 'attenuating' fibre cable that connects to the SFP in the ADVA patch panel in Kirkintilloch and port 2/27 was enabled on switch FSWKIRKD53.

    The 'porttest -ports 2/27' command was executed on switch FSWKIRKD53. No 'enc_out' errors were observed on this occasion, as per the extract below

    FSWKIRKD53:e400022> porterrshow | grep 91
    91:   13     13      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0

    FSWKIRKD53:e400022> porterrshow | grep 91
    91:   24     24      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0      0

    The 'porttestshow' output indicated the test had also passed.

    FSWKIRKD53:e400022> porttestshow -ports 2/27
    Port 91 : PASS
    PortType: LOOPBACK PORT            PortState: TEST DONE
    PortInternalState: INIT                    PortTypeToTest: NO_TEST
    Pattern: 0xb            Seed: 0xaa           UserDelay: 10
    TotalIteration: 20                 CurrentIteration: 20
    TotalFail: 0                       ConsecutiveFail: 0
    StartTime: Tue Feb 03 10:20:15 2015
    StopTime:  Tue Feb 03 10:20:23 2015
    Timeout: 0                         ErrorCode: 0

    Given that no errors were observed testing from just before the ADVA kit back to the switch port in Kirkintilloch and errors were observed when testing from the ADVA kit in Cathcart back to the switch port in Kirkintilloch, this would seem to suggest that the issue lies somewhere in the BT ADVA circuit between the sites.
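
    The reasoning above is a simple segment-isolation walk: order the loopback points by distance from the testing port and find the first one at which errors appear. A minimal sketch of that logic follows; the function name and the location labels are my own, and the counts are the last enc_out readings quoted for each test.

```python
# Sketch: find the first loopback point at which enc_out errors appear.
# Points are ordered by distance from FSWKIRKD53 port 2/27 (nearest first).
# Counts are the final enc_out readings quoted in the three tests.
loopback_results = [
    ("Kirkintilloch: fibre that plugs into the ADVA patch panel (Test 3)", 0),
    ("Cathcart: SFP in the ADVA patch panel (Test 2)", 25),
    ("Cathcart: fibre that plugs into FSWCATD51 port 2/27 (Test 1)", 21),
]


def suspect_segment(results, origin="FSWKIRKD53 port 2/27"):
    """Return (last_clean_point, first_failing_point), or None if all are clean."""
    last_clean = origin
    for location, enc_out_count in results:
        if enc_out_count > 0:
            return last_clean, location
        last_clean = location
    return None


segment = suspect_segment(loopback_results)
if segment is None:
    print("no enc_out errors seen at any loopback point")
else:
    print("errors first appear between:")
    print("  ", segment[0])
    print("  ", segment[1])
```

    With these readings the suspect segment starts at the Kirkintilloch side of the ADVA patch panel and ends at the Cathcart ADVA patch panel, i.e. it spans the ADVA equipment and the circuit between the sites.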

    thanks for your help

    best regards

     

    Tom




  • 7.  Re: enc_out errors on Inter-site SAN connections

    Posted 02-06-2015 05:46 AM
    i totally agree, in this case you have some kind of error condition between the ADVA devices. what is interesting is why the errors only appear between the frames. i'd think that this is a logical problem rather than a hardware one.

    regarding dport tests that show success while error counters increase: we had a case like that ~6 months ago, and the outcome was that brocade confirmed some defects in the dport test code and committed to fixing them in 7.2. we are still on 7.1 and couldn't confirm whether this was really fixed or not.