Brocade Fibre Channel Networking Community


Bottleneckmon not alerting

  • 1.  Bottleneckmon not alerting

    Posted 01-11-2013 08:32 AM

    Hi All,

    I have 6 switches (2x 5100s, 4x 300s) that I recently upgraded to 6.4.3c in order to help troubleshoot some performance issues in our SAN, using bottleneckmon to identify any slow-drain devices. I have enabled bottleneckmon:

    bottleneckmon --status
    Bottleneck detection - Enabled
    ==============================

    Switch-wide alerting parameters:
    ============================
    Alerts                         - Yes
    Latency threshold for alert    - 0.100
    Congestion threshold for alert - 0.800
    Averaging time for alert       - 300 seconds
    Quiet time for alert           - 300 seconds

    When I run 'bottleneckmon --show -span 10800' I can see that events are happening, but nothing is being logged in the syslog. If I run 'errdump' I only see ', 4736, FID 128, INFO,  The last device change happened at' entries, going back over the last few months. There is no other logging on these switches: no SNMP forwarding and no syslogging to a server.

    Where have my alerts gone? Is the system log broken, given that it is only reporting these device-change messages?
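
To make the status output above easier to reason about, here is a minimal Python sketch that turns the "name - value" lines into a dictionary and restates the latency threshold as a percentage of the averaging window. The helper names (`parse_status`, `describe`) are made up for this illustration and are not part of Fabric OS. It assumes the threshold is a fraction of affected seconds within the averaging time, which matches the "10% for a 5 minute interval" reading given later in this thread.

```python
import re

# Status lines copied from the post above.
STATUS_TEXT = """\
Alerts                         - Yes
Latency threshold for alert    - 0.100
Congestion threshold for alert - 0.800
Averaging time for alert       - 300 seconds
Quiet time for alert           - 300 seconds
"""


def parse_status(text):
    """Return {parameter name: value string} for each 'name - value' line."""
    params = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s+-\s+(.+?)\s*$", line)
        if match:
            params[match.group(1)] = match.group(2)
    return params


def describe(params):
    """Return (threshold in percent, affected seconds needed, window in seconds)."""
    latency = float(params["Latency threshold for alert"])
    window = int(params["Averaging time for alert"].split()[0])
    return latency * 100, latency * window, window


if __name__ == "__main__":
    pct, secs, window = describe(parse_status(STATUS_TEXT))
    print(f"Alert needs about {secs:.0f} affected seconds "
          f"({pct:.0f}%) within a {window}-second window")
```

Read this way, a short burst of latency inside an otherwise quiet five-minute window may stay under the alert threshold even though it shows up in `bottleneckmon --show`.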


    Regards,
    Lee


    #BrocadeFibreChannelNetworkingCommunity


  • 2.  Re: Bottleneckmon not alerting

    Posted 01-14-2013 03:41 AM

    Hi Lee,

    Do you really see something other than 0 under "Number of bottlenecked ports" in the list? Here is example output from the "bottleneck-detection-best-practices-guide", available from MyBrocade, that shows genuinely bottlenecked ports:

    switch:admin> bottleneckmon --show
    ==================================================================
    Fri Feb 26 22:00:00 UTC 2010
    ==================================================================
    List of bottlenecked ports in most recent interval:
    13 16
    ==================================================================
                                                    Number of
    From                    To                      bottlenecked ports
    ==================================================================
    Feb 26 21:59:50         Feb 26 22:00:00           2
    Feb 26 21:59:40         Feb 26 21:59:50           0
    Feb 26 21:59:30         Feb 26 21:59:40           0
    Feb 26 21:59:20         Feb 26 21:59:30           0
    Feb 26 21:59:10         Feb 26 21:59:20           0
    Feb 26 21:59:00         Feb 26 21:59:10           0
    Feb 26 21:58:50         Feb 26 21:59:00           0
    Feb 26 21:58:40         Feb 26 21:58:50           0
    Feb 26 21:58:30         Feb 26 21:58:40           0
    Feb 26 21:58:20         Feb 26 21:58:30           2
    Feb 26 21:58:10         Feb 26 21:58:20           3
    Feb 26 21:58:00         Feb 26 21:58:10           3

    Note that the guide also recommends changing the default settings to get earlier warnings. You can check which ports are logging out of and into the fabric with "fabriclog -s" (the SNMP-1008 messages).
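
If the history is long, the question "is anything other than 0 reported?" can be answered by filtering saved `bottleneckmon --show` output. The following is only a sketch: the row layout is assumed from the example above, the function name is invented for illustration, and the sample text is an excerpt of that same example.

```python
import re

# One history row: "<Mon DD HH:MM:SS> <Mon DD HH:MM:SS> <count>".
ROW = re.compile(
    r"^\s*(\w{3} +\d+ \d\d:\d\d:\d\d)\s+(\w{3} +\d+ \d\d:\d\d:\d\d)\s+(\d+)\s*$"
)


def bottlenecked_intervals(show_output):
    """Return (from, to, count) for every interval with a non-zero port count."""
    hits = []
    for line in show_output.splitlines():
        match = ROW.match(line)
        if match and int(match.group(3)) > 0:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits


# Excerpt of the example output quoted above.
SAMPLE = """\
Fri Feb 26 22:00:00 UTC 2010
List of bottlenecked ports in most recent interval:
13 16
Feb 26 21:59:50 Feb 26 22:00:00 2
Feb 26 21:59:40 Feb 26 21:59:50 0
Feb 26 21:58:20 Feb 26 21:58:30 2
Feb 26 21:58:10 Feb 26 21:58:20 3
"""

if __name__ == "__main__":
    for start, end, count in bottlenecked_intervals(SAMPLE):
        print(f"{start} -> {end}: {count} bottlenecked port(s)")
```

Header lines and the port list are skipped because they do not match the two-timestamp row pattern.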

    --filiph




  • 3.  Re: Bottleneckmon not alerting

    Posted 01-14-2013 04:03 AM

    Hi filiph,

    Sorry, I might not have explained myself clearly. I have read the "Bottleneck Detection Best Practice Guide". I am seeing bottlenecked ports detected when I run "bottleneckmon --show", but from the Best Practice Guide I understood that a more detailed error would be logged in the system log, specifying which port the error was against, so that I could identify the problem device and investigate further.

    Nothing but SNMP-1008 errors are shown when I run "errdump", so I was wondering whether my system logs are broken or whether I have configured something wrong.




  • 4.  Re: Bottleneckmon not alerting

    Posted 01-15-2013 04:45 AM

    Hi,

    If you have bottleneck alerts, you should see corresponding AN-10xx events reported in errdump, for example:

    2012/12/27-17:33:19, , 1199, SLOT 7 | FID 128, WARNING, SW1, Latency bottleneck at slot 1, port 11. 13.33 percent of last 30 seconds were affected. Avg. time b/w transmits 55.2090 us.

    2012/12/27-17:33:49, , 1200, SLOT 7 | FID 128, WARNING, SW1, Slot 1, port 11 has Latency bottleneck cleared.

    And when you say events are occurring, I assume they are not just entries with zero ports affected? This is an example from a healthy switch:

    switch:admin> bottleneckmon --show
    ==================================================================
            Tue Jan 15 13:36:37 CET 2013
    ==================================================================
    List of bottlenecked ports in most recent interval:
    None
    ==================================================================
                                                    Number of
    From                    To                      bottlenecked ports
    ==================================================================
    Jan 15 13:36:27         Jan 15 13:36:37           0
    Jan 15 13:36:17         Jan 15 13:36:27           0
    ... removed to save space...
    Jan 15 13:31:47         Jan 15 13:31:57           0
    Jan 15 13:31:37         Jan 15 13:31:47           0

    --filiph




  • 5.  Re: Bottleneckmon not alerting

    Posted 01-16-2013 12:56 AM

    Hi,

    I am seeing ports reported as bottlenecked when I do a manual search as below.

    admin> bottleneckmon --show -interval 60 -span 3600 15
    =============================================================
    Tue Jan 15 16:21:36 UTC 2013
    =============================================================
                                                    Percentage of
    From                    To                      affected secs
    =============================================================
    Jan 15 16:07:36         Jan 15 16:08:36           0.00%
    Jan 15 16:06:36         Jan 15 16:07:36           0.00%
    Jan 15 16:05:36         Jan 15 16:06:36           0.00%
    Jan 15 16:04:36         Jan 15 16:05:36           35.00%
    Jan 15 16:03:36         Jan 15 16:04:36           0.00%
    Jan 15 16:02:36         Jan 15 16:03:36           16.67%
    Jan 15 16:01:36         Jan 15 16:02:36           0.00%
    Jan 15 16:00:36         Jan 15 16:01:36           15.00%
    Jan 15 15:59:36         Jan 15 16:00:36           3.33%
    Jan 15 15:58:36         Jan 15 15:59:36           0.00%
    Jan 15 15:57:36         Jan 15 15:58:36           0.00%

    However, there are no AN-10xx events in errdump, only SNMP-1008 errors.

    Rgds,

    Lee




  • 6.  Re: Bottleneckmon not alerting

    Posted 01-17-2013 12:33 AM

    Hi

    With an interval of 60 seconds, you saw up to 35% affected seconds in a single interval. But remember that the default setting is to log an alert only when affected seconds reach 10% averaged over a 5-minute interval. If you had configured a custom latency threshold of 10% with a 30- or 60-second averaging time, you should have crossed the threshold and received alerts. Have you tried a more aggressive latency threshold?
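
As a rough way to relate the 60-second figures from the previous post to a 300-second averaging time, the per-minute percentages can be averaged over five consecutive intervals. This is only a sketch: the sliding-window approach and the function name are assumptions made for illustration, and how the switch actually aligns its averaging window is not modelled here.

```python
# Per-minute "affected secs" percentages from the previous post, oldest first.
SAMPLES = [0.00, 0.00, 3.33, 15.00, 0.00, 16.67, 0.00, 35.00, 0.00, 0.00, 0.00]


def window_averages(samples, intervals_per_window):
    """Average each run of consecutive equal-length intervals (sliding window)."""
    n = intervals_per_window
    return [
        sum(samples[i:i + n]) / n
        for i in range(len(samples) - n + 1)
    ]


if __name__ == "__main__":
    threshold = 10.0  # percent, i.e. a latency threshold of 0.100
    # 300-second averaging time / 60-second intervals = 5 intervals per window.
    for average in window_averages(SAMPLES, 5):
        state = "above" if average >= threshold else "below"
        print(f"{average:5.2f}% is {state} the {threshold:.0f}% threshold")
```

For this sample the sliding averages range from roughly 3.7% to 13.3%, so some five-minute windows land above 10% and others below. Whether an alert fires would therefore depend on how the switch positions its window, which is one more reason to test with a shorter averaging time.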

    --filiph




  • 7.  Re: Bottleneckmon not alerting

    Posted 01-22-2013 03:26 AM


    Hi,

    Did you try "bottleneckmon --enable -alert"? If -alert is not included, no events will be logged to errdump.

    Rgds

