Brocade Fibre Channel Networking Community


Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

  • 1.  Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-03-2016 02:59 AM

    Multi-location long distance fabrics

    Setup
    Three data centres
    2 primary sites (A & B) each with an IBM SAN768B-2 (DCX 8510-8)
    1 secondary site ( C ) with an IBM SAN40B-4 (5100)
    DWDM connectivity between all sites

    LISL configured between A & B (AtoB=74km)

    Question
    Can the LISL also be connected via site C (AtoC=60km, BtoC=13km) to provide a diverse path for fault tolerance?


    #san
    #fibrechannel
    #BrocadeFibreChannelNetworkingCommunity
    #storagenetworking


  • 2.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-04-2016 03:33 PM
    Not sure why you call it LISL... But yes, I would definitely connect all three sites together. If A-B fails, traffic will start going A-C-B. Moreover, if all links are good and stable, and given that the 5100 is a good single-ASIC switch, you could decrease the cost of the A-C and C-B links (or increase the cost of A-B) so that A-B = A-C + C-B. This way your traffic will use both paths at once!
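
    To illustrate the cost tuning described here, below is a minimal Python sketch (not Brocade tooling) that lists the lowest-cost paths between two sites in a small topology. The cost values 500 and 1000 and the helper names `link`, `path_cost` and `lowest_cost_paths` are assumptions made up for the example; they are not taken from the fabric in this thread.

```python
from itertools import permutations


def link(a, b):
    """Undirected link key: link("A", "B") == link("B", "A")."""
    return frozenset((a, b))


def path_cost(path, costs):
    """Sum the link costs along a path such as ("A", "C", "B")."""
    return sum(costs[link(a, b)] for a, b in zip(path, path[1:]))


def lowest_cost_paths(src, dst, costs):
    """Return every loop-free path from src to dst that has the minimum total cost."""
    nodes = set().union(*costs)
    transit = sorted(nodes - {src, dst})
    candidates = []
    for hops in range(len(transit) + 1):
        for middle in permutations(transit, hops):
            path = (src, *middle, dst)
            # Keep the path only if every hop is an existing link.
            if all(link(a, b) in costs for a, b in zip(path, path[1:])):
                candidates.append(path)
    if not candidates:
        return []
    best = min(path_cost(p, costs) for p in candidates)
    return [p for p in candidates if path_cost(p, costs) == best]


# Hypothetical costs: all three links equal, so the direct A-B path wins.
equal_costs = {link("A", "B"): 500, link("A", "C"): 500, link("C", "B"): 500}

# Hypothetical costs after tuning so that A-B = A-C + C-B.
tuned_costs = {link("A", "B"): 1000, link("A", "C"): 500, link("C", "B"): 500}

print(lowest_cost_paths("A", "B", equal_costs))  # [('A', 'B')]
print(lowest_cost_paths("A", "B", tuned_costs))  # [('A', 'B'), ('A', 'C', 'B')]
```

    With equal costs only the direct path is selected. Once the A-B cost equals the sum of the other two, both paths tie, which is the condition for traffic to be shared across them.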




  • 3.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-08-2016 05:56 AM

    Hi Alexey, thank you for the response.

    I think I oversimplified the solution in my original post.

    Currently the base switches in the two DCX switches form an XISL link. Virtual switches at both sites are enabled for XISL, and Logical ISL (LISL) connections exist between virtual switches with the same FID at each site.

    I have attached a drawing to try to illustrate the solution I am trying to achieve.

    If we created a base switch on the 5100 and connected it to the base switches at A and B, would the path still fail over from A to B via C if the A-to-B link failed?




  • 4.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-08-2016 02:03 PM
    Andy, that's a good clarification, and the picture looks different now. However, the answer is still the same. Yes, all the base FID traffic (including XISL and LISL) will reroute to A-C-B if the A-B link fails. Moreover, if your C site is just a transit point and doesn't have any non-base FID devices, then you don't have to partition that switch at all. You can just leave it VF-disabled and it will happily join the base FID99 and keep running all the required traffic between the sites. However, configuring it as VF-enabled is a good idea, because conversion from non-VF to VF requires a reboot.


  • 5.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-10-2016 06:15 AM

    Thanks Alexey, that is looking promising. The final piece of our puzzle is the location of resources.

    Resources at A and B need to communicate with one another, and resources at A and B each also need to access a resource at C (to support a distributed cluster environment with quorum at site C).

    Assumption:

    Based on what you have told me so far, link cost means that resources at A and B will communicate directly with one another, and for the same reason A and B will each access C directly.

    Questions:

    If the link between A and C is lost, will access from A to the resource at C automatically route via B?

    Would the selection of the alternate path be instantaneous?

    I have updated my drawing (attached) to hopefully illustrate this clearly.




  • 6.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-10-2016 02:47 PM
    Andy, that's not difficult. Every time the number of E-Ports in the fabric changes (i.e. when an ISL goes online or offline), FSPF recalculates the routes between domains. So when ANY of the links in your triangle disappears, FSPF will figure out that the traffic for the now-missing link should be rerouted over the two-hop connection via the third location. When the lost link recovers, FSPF immediately brings back all the single-hop routes.

    So, in short - yes, it will reroute, and yes, it will be almost instantaneous. Almost - because it will still require a fabric rebuild. Also bear in mind that a number of frames - those that are "in flight" at the time - will be lost. This will trigger some recovery at the upper layer, most likely SCSI timeouts and retries. But that is unavoidable in long-distance implementations.
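
    The rerouting behaviour can be sketched in a few lines of Python. This is only an illustration of a shortest-path recalculation over a three-site triangle, in the spirit of what a link-state protocol such as FSPF does; it is not how a switch implements it. The uniform link cost of 500 and the helper names `shortest_route` and `without_link` are assumptions for the example.

```python
import heapq


def shortest_route(links, src, dst):
    """Dijkstra over undirected links; returns (total_cost, path) or None if unreachable."""
    adjacency = {}
    for pair, cost in links.items():
        a, b = tuple(pair)
        adjacency.setdefault(a, []).append((b, cost))
        adjacency.setdefault(b, []).append((a, cost))

    queue = [(0, src, (src,))]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in adjacency.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + (neighbour,)))
    return None


def without_link(links, a, b):
    """Copy of the topology with the a-b link removed (simulates an ISL going offline)."""
    return {pair: cost for pair, cost in links.items() if pair != frozenset((a, b))}


# Hypothetical triangle with the same cost on every inter-site link.
triangle = {
    frozenset(("A", "B")): 500,
    frozenset(("A", "C")): 500,
    frozenset(("B", "C")): 500,
}

print(shortest_route(triangle, "A", "C"))                          # (500, ('A', 'C'))
print(shortest_route(without_link(triangle, "A", "C"), "A", "C"))  # (1000, ('A', 'B', 'C'))
```

    Running the calculation again on the original `triangle` gives the single-hop route back, which mirrors the recovery case described above. The sketch says nothing about how long the recalculation takes or about frames lost in flight.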

    I'm just curious what kind of cluster with quorum are you deploying?


  • 7.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-11-2016 08:00 AM

    This is an IBM SVC Stretched Cluster implementation supporting a VMware Metro Storage Cluster.

    In the event of a path loss in the above environment we would need to be certain that the fabric rebuild will happen quickly. Does a fabric rebuild mean that all fabrics will be affected, and that node-to-quorum as well as node-to-node access will be disrupted for a short while? How long might this process take?

    Any significant delay in the rebuild would mean the cluster nodes lose access to each other and to the quorum at the third site. They would go offline to protect against split brain, and all hosts would be disconnected until the links are re-established (the complete opposite of what the VMware HA solution requires).




  • 8.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran
    Best Answer

    Posted 02-12-2016 01:37 PM
    In my experience a fabric rebuild takes up to 10 seconds, and usually 2-3 seconds for a small three-domain fabric like yours. You are right about the impact on all the devices, including those that do not have any access to the remote sites. In a config like yours, I would not use any stretched FIDs with XISLs. I'd rather create an FCR backbone in the base FID and leave all other FIDs unique and separate. All the devices accessing the remote sites would be put in LSAN zones. Should there be any turbulence in the long-distance links, the rebuild will only happen in the FCR backbone. All the affected LSAN-zoned devices will receive RSCNs about their partner devices going offline, but the edge fabrics will stay stable, so at least all the local traffic will keep running without any trouble.


  • 9.  Re: Three site: A.DCX8510-8 B.DCX8510-8 C.5100 can LISL between A&B be connected via C for fault toleran

    Posted 02-17-2016 02:25 AM

    Thanks Alexey, plenty for me to work with.

