Brocade Fibre Channel Networking Community


FCIP and TI Zones - Failure in an FCIP circuit did not lead to Failover

  • 1.  FCIP and TI Zones - Failure in an FCIP circuit did not lead to Failover

    Posted 06-04-2013 02:30 AM

    We have an FCIP configuration between two 7800 switches.

    We do not currently have the Advanced Extension license, so we are using two FCIP circuits and managing availability via Traffic Isolation (TI) zones.

    We experienced a brief failure in one of the FCIP circuits the other day; it lasted less than 3 seconds. During this time, the traffic did not fail over to a non-dedicated path.

    As a result, the failure was seen by the VPLEX disk arrays that use the FCIP link:


    >> (A2-FC01): In the past minute, 2.20% percent IO failed on I 0x0x500014426049c821FS64x" T 0x0x5000144260614521FS64x" (nportid 0x0e0200) due to IO timeout, there might be faulty hardware in the IO path.
    >> Director{2.1.40.84.0}] RCA: IO failed due to exchanges timed out. There might be faulty hardware on the IO path.
    >> IO exchange timeout threshold exceeded.

    My questions are: at what point will the Traffic Isolation zone fail over to the non-dedicated path? Is there a timeout parameter? Is it tuneable?

    The configuration details are quite simple and are as follows:

    Kernel:     2.6.14.2

    Fabric OS:  v6.4.2a

    Made on:    Mon Jul 18 22:35:02 2011

    Flash:      Sat Apr 6 23:35:08 2013

    BootProm:   1.0.9

    The switchshow output:

    hl-fcsw-slo-03:admin> switchshow
    switchName:     hl-fcsw-slo-03
    switchType:     83.3
    switchState:    Online
    switchMode:     Native
    switchRole:     Principal
    switchDomain:   3
    switchId:       fffc03
    switchWwn:      10:00:00:05:33:54:a3:7a
    zoning:         ON (ZS_STRETCHED_FABRIC_C)
    switchBeacon:   OFF
    FC Router:      ON
    FC Router BB Fabric ID: 1
    Address Mode:   0

    Index Port Address Media Speed State     Proto
    ==============================================
      0   0   030000   id    N8   No_Light    FC  Disabled (Persistent)
      1   1   030100   id    N8   No_Light    FC  Disabled (Persistent)
      2   2   030200   id    N8   Online      FC  F-Port  50:00:14:42:60:49:c8:20
      3   3   030300   id    N8   Online      FC  F-Port  50:00:14:42:70:49:c8:20
      4   4   030400   --    N8   No_Module   FC  Disabled (Persistent)
      5   5   030500   --    N8   No_Module   FC  Disabled (Persistent)
      6   6   030600   --    N8   No_Module   FC  Disabled (Persistent)
      7   7   030700   --    N8   No_Module   FC  Disabled (Persistent)
      8   8   030800   --    N8   No_Module   FC  Disabled (Persistent)
      9   9   030900   --    N8   No_Module   FC  Disabled (Persistent)
     10  10   030a00   --    N8   No_Module   FC  Disabled (Persistent)
     11  11   030b00   --    N8   No_Module   FC  Disabled (Persistent)
     12  12   030c00   --    N8   No_Module   FC  Disabled (Persistent)
     13  13   030d00   --    N8   No_Module   FC  Disabled (Persistent)
     14  14   030e00   --    N8   No_Module   FC  Disabled (Persistent)
     15  15   030f00   --    N8   No_Module   FC  Disabled (Persistent)
     16  16   031000   --    --   Online      VE  VE-Port  10:00:00:05:33:69:3d:d4 "hl-fcsw-lhr-03"
     17  17   031100   --    --   Online      VE  VE-Port  10:00:00:05:33:69:3d:d4 "hl-fcsw-lhr-03" (downstream)
     18  18   031200   --    --   Offline     VE
     19  19   031300   --    --   Offline     VE
     20  20   031400   --    --   Offline     VE
     21  21   031500   --    --   Offline     VE
     22  22   031600   --    --   Offline     VE
     23  23   031700   --    --   Offline     VE
         ge0  cu    1G   Online    FCIP  Copper
         ge1  cu    1G   Online    FCIP  Copper
         ge2  --    1G   No_Module FCIP  Disabled
         ge3  --    1G   No_Module FCIP  Disabled
         ge4  --    1G   No_Module FCIP  Disabled
         ge5  --    1G   No_Module FCIP  Disabled


    As recommended by EMC/Brocade, we're using TI zoning:

    hl-fcsw-slo-03:admin> zone --show
    Defined TI zone configuration:

    TI Zone Name:   VPLEX_WAN_0

    Port List:      13,2; 13,3; 13,16; 3,16; 3,2; 3,3

    Configured Status: Activated / Failover-Enabled
    Enabled Status: Activated / Failover-Enabled

    So the TI zone is activated, with failover enabled.
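
    For anyone reading the port list above, here is a minimal Python sketch that splits the TI zone members into (domain, port) pairs and checks which of the two online VE ports belongs to the dedicated path. It assumes only the zone --show and switchshow output in this post; the variable and function names are illustrative, not anything Fabric OS provides.

```python
# Port list copied from the `zone --show` output in this post.
PORT_LIST = "13,2; 13,3; 13,16; 3,16; 3,2; 3,3"

# VE ports shown as Online in the switchshow output in this post.
ONLINE_VE_PORTS = {16, 17}


def parse_port_list(text):
    """Turn 'domain,port; domain,port' into a set of (domain, port) tuples."""
    members = set()
    for item in text.split(";"):
        domain, port = item.strip().split(",")
        members.add((int(domain), int(port)))
    return members


members = parse_port_list(PORT_LIST)

# A VE port is on the dedicated path if it appears in the TI zone.
dedicated_ve = {port for _, port in members if port in ONLINE_VE_PORTS}
non_dedicated_ve = ONLINE_VE_PORTS - dedicated_ve

print("VE ports inside the TI zone: ", sorted(dedicated_ve))
print("VE ports outside the TI zone:", sorted(non_dedicated_ve))
```

    On this data, VE port 16 is inside the zone and VE port 17 is outside it, so tunnel 17 is the only candidate non-dedicated path.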

    On the FCIP Circuit side, we have:

    hl-fcsw-slo-03:admin> portshow fcipcircuit all
    -------------------------------------------------------------------------------
    Tunnel Circuit  OpStatus Flags   Uptime   TxMBps   RxMBps ConnCnt CommRt  Met
    -------------------------------------------------------------------------------
    16     0 ge0     Up      ----s   1d9h9m     1.29     0.20    2    40/40    0
    17     0 ge1     Up      ----s   57d23h     0.00     0.00    1    40/40    0
    -------------------------------------------------------------------------------
      Flags: circuit: s=sack


    hl-fcsw-slo-03:admin> fabricshow
    Switch ID   Worldwide Name           Enet IP Addr    FC IP Addr      Name
    -------------------------------------------------------------------------
      3: fffc03 10:00:00:05:33:54:a3:7a 172.32.90.54    0.0.0.0        >"hl-fcsw-slo-03"
     13: fffc0d 10:00:00:05:33:69:3d:d4 172.31.90.54    0.0.0.0         "hl-fcsw-lhr-03"


    Brief failure in Tunnel 16:

    2013/06/02-23:42:51, , 68, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 High-Pri QoS UP.
    2013/06/02-23:42:48, , 67, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 Low-Pri QoS UP.
    2013/06/02-23:42:48, , 66, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 Med-Pri QoS UP.
    2013/06/02-23:42:48, , 65, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 UP.
    2013/06/02-23:42:48, , 64, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 Circuit 0 UP.
    2013/06/02-23:42:45, , 63, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 DOWN (Network/Remote/Other).
    2013/06/02-23:42:45, , 62, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 Low-Pri QoS DOWN (Internal Close).
    2013/06/02-23:42:45, , 61, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 Circuit 0 DOWN (Keepalive Timeout).
    2013/06/02-23:42:45, , 60, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 Med-Pri QoS DOWN (Keepalive Timeout).
    2013/06/02-23:42:45, , 59, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 High-Pri QoS DOWN (Keepalive Timeout).

    Since, however, tunnel 17 was unaffected, I'm curious as to why it wasn't used. That is, why didn't the TI zone fail over to the non-dedicated path?

    Furthermore, if we go down the route of spending a load of money to get the Advanced Extension license in order to roll the two FCIP circuits into a single tunnel (and therefore dispense with the TI Zones), would this issue be masked from the disk array in future?
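
    To put numbers on the Tunnel 16 log above, here is a minimal Python sketch that reads the timestamps and measures the gap between the first DOWN event and the later UP events. It assumes the comma-separated layout seen in those log lines; the function and variable names are illustrative. Note that the DOWN events were logged on a keepalive timeout, so the path may already have stopped passing traffic before the first logged timestamp. The log alone does not show for how long.

```python
from datetime import datetime

# Event lines copied from the Tunnel 16 log in this post (newest first).
LOG = """\
2013/06/02-23:42:51, , 68, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 High-Pri QoS UP.
2013/06/02-23:42:48, , 67, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 Low-Pri QoS UP.
2013/06/02-23:42:48, , 66, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 Med-Pri QoS UP.
2013/06/02-23:42:48, , 65, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 UP.
2013/06/02-23:42:48, , 64, CHASSIS, INFO, MP_7800B, FCIP Tunnel 16 Circuit 0 UP.
2013/06/02-23:42:45, , 63, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 DOWN (Network/Remote/Other).
2013/06/02-23:42:45, , 62, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 Low-Pri QoS DOWN (Internal Close).
2013/06/02-23:42:45, , 61, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 Circuit 0 DOWN (Keepalive Timeout).
2013/06/02-23:42:45, , 60, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 Med-Pri QoS DOWN (Keepalive Timeout).
2013/06/02-23:42:45, , 59, CHASSIS, ERROR, MP_7800B, FCIP Tunnel 16 High-Pri QoS DOWN (Keepalive Timeout).
"""


def parse_event(line):
    """Return (timestamp, severity, message) for one log line."""
    fields = [field.strip() for field in line.split(",", 6)]
    stamp = datetime.strptime(fields[0], "%Y/%m/%d-%H:%M:%S")
    return stamp, fields[4], fields[6]


events = [parse_event(line) for line in LOG.splitlines() if line.strip()]

# Earliest logged DOWN event, the tunnel-level UP event, and the last UP event.
first_down = min(stamp for stamp, severity, _ in events if severity == "ERROR")
tunnel_up = min(stamp for stamp, _, msg in events if msg == "FCIP Tunnel 16 UP.")
last_up = max(stamp for stamp, severity, _ in events if severity == "INFO")

tunnel_outage = (tunnel_up - first_down).total_seconds()
full_recovery = (last_up - first_down).total_seconds()

print(f"Tunnel logged DOWN for {tunnel_outage:.0f} s")
print(f"All QoS classes back UP after {full_recovery:.0f} s")
```

    With the one-second resolution of the log, the tunnel is reported down for 3 seconds, and the High-Pri QoS class comes back 6 seconds after the first DOWN event.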




  • 2.  Re: FCIP and TI Zones - Failure in an FCIP circuit did not lead to Failover

    Posted 06-04-2013 04:26 AM

    Hi Grant,

    Assuming you have no traffic other than those VPLEX relationships, I would try creating two TI zones with failover disabled. You would then have two dedicated paths:

    13,2; 13,16; 3,16; 3,2

    and

    13,3; 13,17; 3,17; 3,3

    You would use both FCIP circuits all the time, and if one of them fails, VPLEX multipathing should be able to keep service available on the remaining path. That removes the dependency on TI zone failover.
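
    As a sanity check on the two port lists above, here is a minimal Python sketch that confirms the zones share no ports and that each one holds one VE port and one FC port per domain. The zone names are hypothetical, and the VE port range 16-23 is taken from the switchshow output earlier in the thread and assumed to apply to both 7800s.

```python
# The two port lists proposed above; the zone names are hypothetical.
ZONES = {
    "path_via_ve16": "13,2; 13,16; 3,16; 3,2",
    "path_via_ve17": "13,3; 13,17; 3,17; 3,3",
}

# VE ports 16-23, as listed in the switchshow output (assumed for both switches).
VE_PORTS = range(16, 24)


def parse_port_list(text):
    """Turn 'domain,port; domain,port' into a set of (domain, port) tuples."""
    members = set()
    for item in text.split(";"):
        domain, port = item.strip().split(",")
        members.add((int(domain), int(port)))
    return members


def summarize(members):
    """Group a zone's ports by domain and by kind (VE or FC)."""
    summary = {}
    for domain, port in sorted(members):
        kind = "ve" if port in VE_PORTS else "fc"
        summary.setdefault(domain, {"ve": [], "fc": []})[kind].append(port)
    return summary


parsed = {name: parse_port_list(ports) for name, ports in ZONES.items()}

# Ports that appear in both zones; an empty set means the paths are disjoint.
shared = parsed["path_via_ve16"] & parsed["path_via_ve17"]
print("Ports shared by both zones:", sorted(shared))

for name, members in parsed.items():
    print(name, summarize(members))
```

    If the shared set is empty and every domain shows exactly one VE port and one FC port per zone, each VPLEX port is pinned to its own tunnel, which is the layout described above.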

    BTW, if you have the money, I can tell you that FCIP trunking works very well.

    Hope this helps

    --

    david




  • 3.  Re: FCIP and TI Zones - Failure in an FCIP circuit did not lead to Failover

    Posted 06-04-2013 04:03 PM

    David,

    Many thanks for your response. I will investigate your suggestion with EMC. I believe that, without the AE license, what we implemented was as per EMC's recommendation. I have a feeling (although I don't have the documentation at my fingertips) that the firmware version we're running on the VPLEX arrays limited us to this configuration. I seem to recall that something to do with Fast Write limited us to shunting traffic down only one tunnel at a time.

    I have recommended the AE license to avoid these issues going forward, so hopefully that will get taken up!

    Once again, many thanks.

    Best regards,

    Grant

