Brocade Fibre Channel Networking Community


increased response time while neighbour port is saturated

  • 1.  increased response time while neighbour port is saturated

    Posted 03-08-2019 06:07 AM

    I would like to understand how the traffic of one server in our environment affects another, so that I can find an appropriate way to remediate it.

     

    What we have:

    Two servers (in our focus) are connected to an edge switch.


    Server1's application is very sensitive to response time. It should be below 3ms. It uses a flash storage array connected to the core.

    Server1 is connected to the edge DCX with 3 ports (one port in each of 3 different slots).

    Server2 is connected with one port to another blade. It starts its r/w activity to another storage array connected somewhere beyond the core.

    Server2 starts to saturate all the available port throughput (4Gb/s), and that is when server1 sees a higher latency of 4-5ms.

     

    There are 8 trunk groups of 4 ISLs each to the core switch. Each ISL is 16Gb/s. They are utilised at 10% at most.

    The tim_txcrd_z counter ticks around 7000 times on one of server1's ports during a 30-minute interval, 5000 times on another and 4000 times on the third port, or thereabouts. This does not seem enough to affect response time so significantly.

     

    On 4 ISLs I see tim_txcrd_z increase by 80000, but I cannot determine whether these ISLs are used for transmitting server1's frames.

     

    So I wonder what causes the increase in server1's response time.

    Is it a shortage of some resource, and if so, which one exactly? How could I determine it?

     


    #BrocadeFibreChannelNetworkingCommunity


  • 2.  Re: increased response time while neighbour port is saturated

    Posted 03-08-2019 08:42 AM
    Hello,

    Do you see any c3discards on the switches?


  • 3.  Re: increased response time while neighbour port is saturated

    Posted 03-11-2019 02:49 AM
    Marian, no discards. We are a long way from discards. It's just an increase of a few milliseconds.


  • 4.  Re: increased response time while neighbour port is saturated

    Posted 03-11-2019 07:37 AM

    In the tim_txcrd_z output have you checked the virtual circuits (VCs) within the ISL? Is there a specific VC that is clocking most of the wait counts?




  • 5.  Re: increased response time while neighbour port is saturated

    Posted 03-13-2019 06:57 AM


  • 6.  Re: increased response time while neighbour port is saturated

    Posted 03-20-2019 08:39 PM

    1. How long has it been since statsclear?

    2. For how long after statsclear did Server2 saturate all the available port throughput?

    3. Have you checked the trunk masters of all 8 trunks for the tim_txcrd_z counter?

     

    My own points:

    1. An increase of tim_txcrd_z by 80000 means that the total latency cost is 80000*2.5us=200ms, so how long it lasted is important.

    2. Server2 runs at 4Gb/s, which makes it a lower-performance device compared to the 16G ISL. So it would have some effect on latency on the ISL.




  • 7.  Re: increased response time while neighbour port is saturated

    Posted 03-26-2019 04:37 AM
    Yulong Lu, thank you for your interest in this case.

    1. The tim_txcrd_z values I gave are taken from the OnCommand Insight monitoring software; they are the increments since the last poll.
    So it does not matter when the counters were last reset. Just consider that they increased by these values over half an hour.
    2. Around 15 minutes.
    3. Yes, I have checked tim_txcrd_z on all of them. Only 1 trunk group has counters at such a high level; all the others are negligible.
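
    To illustrate what "increments since last poll" means for a cumulative counter such as tim_txcrd_z, here is a minimal sketch. It is not how OnCommand Insight computes its values; the function name `poll_deltas`, the sample numbers, and the rule that a decreasing value is treated as a counter reset are all assumptions made for illustration.

    ```python
    def poll_deltas(samples):
        """Turn cumulative counter readings into per-poll increments.

        Assumption: if a reading is lower than the previous one, the counter
        was cleared in between, so the new reading itself is the increment.
        """
        deltas = []
        for prev, cur in zip(samples, samples[1:]):
            deltas.append(cur - prev if cur >= prev else cur)
        return deltas


    # Hypothetical cumulative readings, with a reset before the third poll.
    example_deltas = poll_deltas([1000, 8000, 3000, 7000])
    print(example_deltas)
    ```

    With increments like these, the absolute value of the counter and the time of the last reset drop out of the picture, which is the point made in item 1.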

    ===
    >>an increase of tim_txcrd_z by 80000 means that the total latency cost is 80000*2.5us=200ms, so how long it lasted is important

    The 80000 accumulated over 15 minutes, and 15 minutes is 900000 ms, so 200 ms out of 900000 ms is nothing. To see response time increase by 1 ms for every IO, the counter would need to increase by 900000 ms / 2.5 us = 360 000 000.
    >> So it would have some effect on latency on the ISL.
    Here I do not agree. It should not affect the speed of frames on the ISL. This was discussed in this topic: https://community.broadcom.com/t5/Fibre-Channel-SAN-Forums/frame-speed-in-a-fabric/m-p/99278#M29604
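
    The tim_txcrd_z arithmetic in this exchange can be checked with a short sketch. It assumes, as stated in the thread, that one tick corresponds to 2.5 us spent at zero transmit credit; the function names are hypothetical and only restate the calculations quoted above.

    ```python
    TICK_US = 2.5  # microseconds per tim_txcrd_z tick, as assumed in the thread


    def zero_credit_ms(ticks):
        """Total time at zero transmit credit, in milliseconds."""
        return ticks * TICK_US / 1000.0


    def zero_credit_fraction(ticks, interval_s):
        """Share of the polling interval spent at zero transmit credit."""
        return zero_credit_ms(ticks) / (interval_s * 1000.0)


    def ticks_to_fill(interval_s):
        """Tick count whose accumulated wait equals the whole interval."""
        return round(interval_s * 1_000_000 / TICK_US)


    # ISL trunk: 80000 ticks during the 15 minutes of saturation.
    isl_ms = zero_credit_ms(80_000)
    isl_fraction = zero_credit_fraction(80_000, 15 * 60)
    full_interval_ticks = ticks_to_fill(15 * 60)

    # Busiest server1 port: about 7000 ticks over 30 minutes.
    server1_ms = zero_credit_ms(7_000)

    print(isl_ms, isl_fraction, full_interval_ticks, server1_ms)
    ```

    This reproduces the 200 ms and 360 000 000 figures, and puts the ISL's zero-credit time at roughly 0.02% of the 15-minute window.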