DX NetOps

  • 1.  How to triage alerts for F5 Active/Standby status changes

    Posted Jun 04, 2020 09:11 AM
    I have modified the existing F5 vendor cert to include an F5 OID; the expressions are shown in the attached screenshot.
    The OID 1.3.6.1.4.1.3375.2.1.14.3.1 returns values from 0 to 4. Only 4 means ACTIVE; I treat all other values as standby.
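
    As a rough illustration of the mapping described above (not the actual vendor cert expression, which is only in the screenshot), here is a minimal Python sketch. The function name `to_percent_metrics` and the keys `pct_active` / `pct_standby` are hypothetical; the only facts taken from the post are that the raw value ranges from 0 to 4 and that 4 means ACTIVE.

```python
ACTIVE_VALUE = 4  # per the post: only 4 means ACTIVE, everything else is treated as standby


def to_percent_metrics(raw_status: int) -> dict:
    """Turn one polled status value into 0/100 'percent' metrics (hypothetical names)."""
    if not 0 <= raw_status <= 4:
        raise ValueError(f"unexpected status value: {raw_status}")
    pct_active = 100 if raw_status == ACTIVE_VALUE else 0
    return {"pct_active": pct_active, "pct_standby": 100 - pct_active}


if __name__ == "__main__":
    for value in range(5):
        print(value, to_percent_metrics(value))
```

    With a single poll per interval, each sample can only ever be 0 or 100, never something in between.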

    After that, the dashboards are showing the correct Active/Standby status although it only shows the past 300 seconds.

    My question is: how can I set up a threshold profile to generate an alert whenever the F5 status changes (from Active to Standby or vice versa)?
    I have set up a number of CPU/Memory/Interface threshold profiles in the past. They raise events based on utilization percentages, number of packets, etc.

    But for this one, when I tried to create a threshold profile, it only gave me options such as "PCT of time in Active" or "PCT of time in Standby", compared against a pct value with an operator such as "equal or more than" or "less than". If I set "equal or more than 100%" to raise an event, I have to set "less than 100%" as the required condition to clear the event. So I guess I have to create another threshold profile for the case "equal or more than 0%" to raise and "more than 0%" to clear. (Configuring it this way already feels strange to me.)

    In most cases, F5 status switches happen all the time, and normally we just ignore them if the devices are working fine after the change. But these threshold profiles might still send a lot of alerts, correct? Is there a decent way to send an alert only when the status changes?

    Thanks!

    Tao





  • 2.  RE: How to triage alerts for F5 Active/Standby status changes

    Broadcom Employee
    Posted Jun 04, 2020 10:58 AM
    Can you share a sample report that only shows the metrics involved? What does the data look like in a simple Trend chart for those metrics, over maybe the Last 4 Hours time frame?

    ------------------------------
    Technical Support Engineer IV
    Broadcom
    ------------------------------



  • 3.  RE: How to triage alerts for F5 Active/Standby status changes

    Posted Jun 04, 2020 11:55 AM
    Thanks a lot, Mike! I attached an on-demand report covering 8 hours and a screenshot from a dashboard (the first few pages of this trend chart are all 100% when sorted descending, so their trend lines overlap):




  • 4.  RE: How to triage alerts for F5 Active/Standby status changes
    Best Answer

    Broadcom Employee
    Posted Jun 04, 2020 12:06 PM
    So the metric family calls it a percent, but it's really just storing 0 or 100 depending on whether the device is active or standby.

    So for a threshold, it sounds like you want a threshold profile for Standby mode, with a window/duration of 300 secs. It would raise when % in Active goes below 100 AND % in Standby goes above 0.
    The clear would be the opposite: Active is 100 and Standby is 0.

    Thresholds run on the polled data, so for each 5-min poll the % active and % standby will be either 0 and 100 or 100 and 0.
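
    The raise and clear conditions described in this reply can be sketched as two predicates. This is only an illustration of the logic, not DX NetOps configuration; the function and parameter names are hypothetical.

```python
def standby_raise(pct_active: float, pct_standby: float) -> bool:
    """Raise condition: % in Active below 100 AND % in Standby above 0."""
    return pct_active < 100 and pct_standby > 0


def standby_clear(pct_active: float, pct_standby: float) -> bool:
    """Clear condition: the opposite, Active is 100 and Standby is 0."""
    return pct_active == 100 and pct_standby == 0


if __name__ == "__main__":
    # Each 5-min poll yields either (100, 0) or (0, 100).
    for sample in [(100, 0), (0, 100)]:
        print(sample, "raise:", standby_raise(*sample), "clear:", standby_clear(*sample))
```

    Because each polled sample is either (100, 0) or (0, 100), exactly one of the two conditions holds for any given poll.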


  • 5.  RE: How to triage alerts for F5 Active/Standby status changes

    Broadcom Employee
    Posted Jun 04, 2020 12:21 PM
    Yeah, Jeff's suggestions are where I was headed. That should solve this.

    ------------------------------
    Technical Support Engineer IV
    Broadcom
    ------------------------------



  • 6.  RE: How to triage alerts for F5 Active/Standby status changes

    Posted Jun 04, 2020 12:22 PM
    Edited by Tao Yang Jun 04, 2020 12:23 PM
    Hi Mike and Jeff, yes, exactly: it is just 100% or 0%.
    I am wondering if there's a way to compare the previous 5-min polling value to the most recent 5-min polling value and, if they differ, raise an event from the threshold profile.

    Suppose I define two threshold profiles: one that raises when "PCT active time" goes from 100% to 0% (switch from active to standby), and another that handles 0% to 100% (standby to active). If nothing is wrong (normally we don't have to fix anything after a failover), these alerts will be sent again and again for each polling cycle and generate tons of noise.

    Any suggestion?


  • 7.  RE: How to triage alerts for F5 Active/Standby status changes

    Broadcom Employee
    Posted Jun 04, 2020 12:26 PM
    That's where you use a window/duration of 300, the same as the polling rate. If you are polling every 60 secs, then I suggest a window/duration of 60.

    When the values change from one 5-min poll to the next 5-min poll, it will trigger the event.

    Note: once an event has been triggered, it will NOT send a new SET event over and over. The event needs to CLEAR first before we send a new SET event.
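
    The SET/CLEAR behaviour described in the note can be sketched as a simple latch over a series of polled values. This is a minimal illustration under the assumption of one sample per window and a raise condition of "% in Active below 100"; it is not the product's implementation, and `latch_events` is a hypothetical name.

```python
from typing import Iterable, List, Tuple


def latch_events(pct_active_samples: Iterable[float]) -> List[Tuple[int, str]]:
    """Emit SET once when the condition starts, and CLEAR once when it ends."""
    events: List[Tuple[int, str]] = []
    raised = False  # True while an event is open and waiting for CLEAR
    for index, pct_active in enumerate(pct_active_samples):
        violating = pct_active < 100  # assumed raise condition for this sketch
        if violating and not raised:
            events.append((index, "SET"))
            raised = True
        elif not violating and raised:
            events.append((index, "CLEAR"))
            raised = False
        # Otherwise the state is unchanged, so no new event is sent.
    return events


if __name__ == "__main__":
    polls = [100, 100, 0, 0, 0, 100, 100]
    print(latch_events(polls))  # [(2, 'SET'), (5, 'CLEAR')]
```

    In this sketch the device stays in standby for three consecutive polls, but only one SET is produced, followed by one CLEAR when it returns to active.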


  • 8.  RE: How to triage alerts for F5 Active/Standby status changes

    Posted Jun 04, 2020 12:42 PM
    Again, many thanks to both Mike and Jeff :)
    Tao