Brocade Fibre Channel Networking Community

Class 3 transmit frames discarded due to timeout

  • 1.  Class 3 transmit frames discarded due to timeout

    Posted 12-20-2017 12:07 AM
    Dear all,

    Looking for help here! One of the ports where our SAN storage is connected is showing c3timeout tx errors. I understand that c3timeout tx means the I/O is not being processed fast enough, so more is being received than can be transmitted, but I want to know which side is too slow: the connected storage, or the servers zoned to it?

    Below is the port stat output:

    ---------------

    FID128:admin> portstatsshow 18
    stat_wtx 7151696221951 4-byte words transmitted
    stat_wrx 50195120110957 4-byte words received
    stat_ftx 3724140231 Frames transmitted
    stat_frx 1118044856 Frames received
    stat_c2_frx 0 Class 2 frames received
    stat_c3_frx 1118062508 Class 3 frames received
    stat_lc_rx 0 Link control frames received
    stat_mc_rx 0 Multicast frames received
    stat_mc_to 0 Multicast timeouts
    stat_mc_tx 0 Multicast frames transmitted
    tim_rdy_pri 0 Time R_RDY high priority
    tim_txcrd_z 69884976 Time TX Credit Zero (2.5Us ticks)
    tim_txcrd_z_vc 0- 3: 0 0 0 0
    tim_txcrd_z_vc 4- 7: 69884976 0 0 0
    tim_txcrd_z_vc 8-11: 0 0 0 0
    tim_txcrd_z_vc 12-15: 0 0 0 0
    tim_latency_vc 0- 3: 1 1 1 1
    tim_latency_vc 4- 7: 1 1 1 1
    tim_latency_vc 8-11: 1 1 1 1
    tim_latency_vc 12-15: 1 1 1 1
    fec_cor_detected 0 Count of blocks that were corrected by FEC
    fec_uncor_detected 0 Count of blocks that were left uncorrected by FEC
    er_enc_in 0 Encoding errors inside of frames
    er_crc 0 Frames with CRC errors
    er_trunc 0 Frames shorter than minimum
    er_toolong 0 Frames longer than maximum
    er_bad_eof 0 Frames with bad end-of-frame
    er_enc_out 0 Encoding error outside of frames
    er_bad_os 0 Invalid ordered set
    er_pcs_blk 0 PCS block errors
    er_rx_c3_timeout 0 Class 3 receive frames discarded due to timeout
    er_tx_c3_timeout 23211 Class 3 transmit frames discarded due to timeout
    er_unroutable 0 Frames that are unroutable
    er_unreachable 0 Frames with unreachable destination
    er_other_discard 1 Other discards
    er_type1_miss 0 frames with FTB type 1 miss
    er_type2_miss 0 frames with FTB type 2 miss
    er_type6_miss 0 frames with FTB type 6 miss
    er_zone_miss 1 frames with hard zoning miss
    er_lun_zone_miss 0 frames with LUN zoning miss
    er_crc_good_eof 0 Crc error with good eof
    er_inv_arb 0 Invalid ARB
    er_single_credit_loss 0 Single vcrdy/frame loss on link
    er_multi_credit_loss 0 Multiple vcrdy/frame loss on link
    phy_stats_clear_ts 11-18-2017 UTC Sat 11:13:55 Timestamp of phy_port stats clear
    lgc_stats_clear_ts 11-18-2017 UTC Sat 11:13:55 Timestamp of lgc_port stats clear

    ---------------
    #BrocadeFibreChannelNetworkingCommunity
    #timeout
    #discard


  • 2.  Re: Class 3 transmit frames discarded due to timeout

    Posted 12-20-2017 12:46 AM

    Hello,

     

    Yes, it looks like the storage port is not able to handle the frames coming from the servers.

    What are the port speeds of the servers and the storage?

    This might be a queue depth issue, an issue on the storage itself, or a wrong optimization setting on it.




  • 3.  Re: Class 3 transmit frames discarded due to timeout

    Posted 12-20-2017 04:17 AM
    Dear Thierry / all,

    The storage port connected to switch port 18 is zoned with multiple servers, some running at 8 Gbps and others at 16 Gbps. The switch port speed is 16 Gbps and the storage port speed is 8 Gbps. Over the last 15 days, the maximum data transfer rate recorded on the storage port was 750 MB/s.

    Running mapsdb --show produced the following health-related output:

    ------------

    3.2 Rules Affecting Health:
    ===========================

    Category(Rule Count)|RepeatCount|Rule Name                  |Execution Time   |Object           |Triggered Value(Units)|
    ------------------------------------------------------------------------------------------------------------------------
    Fru Health(4)       |2          |defALL_PORTSSFP_STATE_IN   |12/20/17 15:23:26|U-Port 77        |IN                    |
                        |           |                           |                 |U-Port 77        |IN                    |
                        |2          |defALL_PORTSSFP_STATE_OUT  |12/20/17 15:23:04|U-Port 77        |OUT                   |
                        |           |                           |                 |U-Port 77        |OUT                   |
    Fabric Performance I|1          |defALL_PORTS_IO_LATENCY_CLE|12/20/17 10:55:45|F-Port 18        |IO_LATENCY_CLEAR      |
    mpact(52)           |           |AR                         |                 |                 |                      |
                        |1          |defALL_PORTS_IO_FRAME_LOSS |12/20/17 10:54:45|F-Port 18        |IO_FRAME_LOSS         |
                        |1          |defALL_PORTS_IO_LATENCY_CLE|12/19/17 16:04:46|F-Port 18        |IO_LATENCY_CLEAR      |
                        |           |AR                         |                 |                 |                      |
                        |1          |defALL_PORTS_IO_FRAME_LOSS |12/19/17 16:03:46|F-Port 18        |IO_FRAME_LOSS         |
                        |18         |defALL_PORTS_IO_LATENCY_CLE|12/18/17 11:02:48|F-Port 18        |IO_LATENCY_CLEAR      |
                        |           |AR                         |                 |                 |                      |
                        |           |                           |                 |F-Port 18        |IO_LATENCY_CLEAR      |
                        |           |                           |                 |F-Port 18        |IO_LATENCY_CLEAR      |
                        |           |                           |                 |F-Port 18        |IO_LATENCY_CLEAR      |
                        |           |                           |                 |F-Port 18        |IO_LATENCY_CLEAR      |

    --------------------------

    Could you please suggest any further troubleshooting? Thanks.


  • 4.  Re: Class 3 transmit frames discarded due to timeout

    Posted 12-20-2017 07:06 AM

    Troubleshooting class 3 discards and device latency issues is a large discussion and can involve many aspects of a SAN. A slow-draining device can cause this; a poorly performing HBA or a fan-in ratio issue can contribute; and ISL oversubscription can also affect latency. Please note that port 18 in the report is the port that is BEING affected, and is not likely the port that is the cause of the problem.

    First things to do: run the commands statsclear; slotstatsclear on each switch in the fabric. After 24-48 hours, run porterrshow on each switch in the fabric. Review those outputs for errors and let us know what you find. Also provide the output of firmwareshow and fabricshow, so that we know what kind of equipment we are looking at.

    I would advise you to run the SAN Health report and gather some information on your connections and throughput. Without a complete picture of the fabric and all F_Port and E_Port connections, it will be impossible to diagnose. There are also records in the log file that may be useful in determining which ports are causing latency within the fabric.

    Start here:

    https://my.brocade.com/wps/myportal/myb/tools/sanhealth/

    You will get a report emailed to you showing the switch connections and many attributes of the fabric. Once that report is checked, we can proceed with some options. As a first guess, try to find legacy devices that may be running at 4 Gbps and are traversing the fabric via ISLs (E_Ports). This is a common cause of class 3 discards, but it is only one possibility. There are many other things that affect throughput and congestion.
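
    To make the "review those outputs" step less manual, here is a minimal Python sketch (not a Brocade tool) that parses portstatsshow text in the layout pasted in this thread and lists the er_ counters that are non-zero, i.e. whatever has accumulated since the last statsclear. The function names and the SAMPLE excerpt are illustrative assumptions; porterrshow uses a different columnar layout and would need its own parser.

```python
import re

# Excerpt of the `portstatsshow 18` output posted in this thread.
SAMPLE = """\
admin> portstatsshow 18
stat_ftx 1681883768 Frames transmitted
tim_txcrd_z 146519310 Time TX Credit Zero (2.5Us ticks)
tim_txcrd_z_vc 4- 7: 146519310 0 0 0
er_crc 0 Frames with CRC errors
er_enc_out 0 Encoding error outside of frames
er_tx_c3_timeout 154586 Class 3 transmit frames discarded due to timeout
phy_stats_clear_ts 07-06-2018 IST Fri 16:39:55 Timestamp of phy_port stats clear
"""

# Scalar counters look like "<name> <integer> <description>".
# Per-VC rows ("0- 3: ...") and timestamp rows do not match and are skipped.
COUNTER_LINE = re.compile(r"\s*(\w+)\s+(\d+)(?:\s|$)")


def parse_portstats(text):
    """Return {counter_name: value} for the scalar counters in the text."""
    stats = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def nonzero_errors(stats):
    """Keep only the er_* counters that have counted something."""
    return {
        name: value
        for name, value in stats.items()
        if name.startswith("er_") and value > 0
    }


stats = parse_portstats(SAMPLE)
errors = nonzero_errors(stats)
print(errors)
```

    Running the same check on output collected from every switch, after a clear and a 24-48 hour wait, gives a quick list of the ports worth a closer look.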




  • 5.  Re: Class 3 transmit frames discarded due to timeout

    Posted 08-07-2018 03:31 AM

    Hi,

    The errors are still popping up, as follows:

    F-Port 18, Condition=ALL_PORTS(DEV_LATENCY_IMPACT==IO_FRAME_LOSS), Current Value:[ DEV_LATENCY_IMPACT,IO_FRAME_LOSS, (1408 C3TX Timeouts) ], RuleName=defALL_PORTS_IO_FRAME_LOSS, Dashboard Category=Fabric Performance Impact.

    ---------------------

    The port error output (porterrshow) shows a disc c3 and c3timeout tx value of 154.5k.

    -----------------------

    admin> portstatsshow 18
    stat_wtx 10212000744426 4-byte words transmitted
    stat_wrx 63486130803777 4-byte words received
    stat_ftx 1681883768 Frames transmitted
    stat_frx 1658633119 Frames received
    stat_c2_frx 0 Class 2 frames received
    stat_c3_frx 1658728827 Class 3 frames received
    stat_lc_rx 0 Link control frames received
    stat_mc_rx 0 Multicast frames received
    stat_mc_to 0 Multicast timeouts
    stat_mc_tx 0 Multicast frames transmitted
    tim_rdy_pri 0 Time R_RDY high priority
    tim_txcrd_z 146519310 Time TX Credit Zero (2.5Us ticks)
    tim_txcrd_z_vc 0- 3: 0 0 0 0
    tim_txcrd_z_vc 4- 7: 146519310 0 0 0
    tim_txcrd_z_vc 8-11: 0 0 0 0
    tim_txcrd_z_vc 12-15: 0 0 0 0
    tim_latency_vc 0- 3: 1 1 1 1
    tim_latency_vc 4- 7: 1 1 1 1
    tim_latency_vc 8-11: 1 1 1 1
    tim_latency_vc 12-15: 1 1 1 1

    fec_cor_detected 0 Count of blocks that were corrected by FEC
    fec_uncor_detected 0 Count of blocks that were left uncorrected by FEC
    er_enc_in 0 Encoding errors inside of frames
    er_crc 0 Frames with CRC errors
    er_trunc 0 Frames shorter than minimum
    er_toolong 0 Frames longer than maximum
    er_bad_eof 0 Frames with bad end-of-frame
    er_enc_out 0 Encoding error outside of frames
    er_bad_os 0 Invalid ordered set
    er_pcs_blk 0 PCS block errors
    er_rx_c3_timeout 0 Class 3 receive frames discarded due to timeout
    er_tx_c3_timeout 154586 Class 3 transmit frames discarded due to timeout
    er_unroutable 0 Frames that are unroutable
    er_unreachable 0 Frames with unreachable destination
    er_other_discard 0 Other discards
    er_type1_miss 0 frames with FTB type 1 miss
    er_type2_miss 0 frames with FTB type 2 miss
    er_type6_miss 0 frames with FTB type 6 miss
    er_zone_miss 0 frames with hard zoning miss
    er_lun_zone_miss 0 frames with LUN zoning miss
    er_crc_good_eof 0 Crc error with good eof
    er_inv_arb 0 Invalid ARB
    er_single_credit_loss 0 Single vcrdy/frame loss on link
    er_multi_credit_loss 0 Multiple vcrdy/frame loss on link
    phy_stats_clear_ts 07-06-2018 IST Fri 16:39:55 Timestamp of phy_port stats clear
    lgc_stats_clear_ts 07-06-2018 IST Fri 16:39:55 Timestamp of lgc_port stats clear

    ---------------------------

     

    Could it be an FC cable fault?

     

    Please help!




  • 6.  Re: Class 3 transmit frames discarded due to timeout

    Posted 08-07-2018 04:36 AM

    Hi,

    If the cable were faulty, you would see enc_out errors.

    If you are seeing 750 MB/s on the storage port, it looks like the FE port is overloaded (I have never seen a value greater than 750 MB/s on any 8Gb FC port). Try remapping some of the heaviest servers to another pair of storage ports to reduce the pressure on this one.

    I had a similar issue with a cluster where a huge database used 2 pairs of FE ports (both at 750 MB/s), so the timeouts occurred.

    We solved it with another pair of HBAs and different, less utilized FE ports.




  • 7.  Re: Class 3 transmit frames discarded due to timeout

    Posted 08-07-2018 07:29 AM

    There's also this:

     

    tim_txcrd_z 146519310 Time TX Credit Zero (2.5Us ticks)

     

    You might have a slow drain device in play.
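
    To put that counter in perspective, here is a small Python sketch that converts the ticks to wall-clock time, using the 2.5 µs tick size stated in the counter description. The capture time is an assumption (the posting time of the output, time zones ignored), so the window length and the derived averages are only approximate.

```python
from datetime import datetime

TICK_SECONDS = 2.5e-6      # tick size from the counter description
ticks = 146_519_310        # tim_txcrd_z from the posted output
c3_tx_timeouts = 154_586   # er_tx_c3_timeout from the same output

seconds_at_zero = ticks * TICK_SECONDS

# Stats were cleared at phy_stats_clear_ts (07-06-2018 16:39:55).
cleared = datetime(2018, 7, 6, 16, 39, 55)
# ASSUMPTION: output captured around the posting time of the message.
captured = datetime(2018, 8, 7, 3, 31, 0)
window_s = (captured - cleared).total_seconds()

share = seconds_at_zero / window_s
discards_per_day = c3_tx_timeouts / (window_s / 86400)

print(f"time at zero TX credit: {seconds_at_zero:.1f} s")
print(f"share of the window:    {share:.4%}")
print(f"C3 TX discards per day: {discards_per_day:.0f}")
```

    The average share is small, but an average hides bursts: every C3 TX timeout means a frame sat in the switch past its hold time while the port had no credit to send it, so the starvation is concentrated in short, severe episodes.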

