DX NetOps

  • 1.  Dual CPU alarm using Watches

    Posted Mar 08, 2017 05:56 AM

    Hi

    We monitor CPU utilisation on all Windows servers with the default threshold of 85% raising a critical alarm (using Component Details - Threshold and Watches - Thresholds). One of our clients now wants a second threshold of 75% as a major alarm, alongside the existing 85% critical alarm, for one critical server.

    We know this can be done with watches, but we are running into problems setting it up.

    We set the watch expression to NRM_DeviceCPUUtilization.# with a threshold of 75. The problem is that this server has 32 CPUs, and the alert triggers as soon as any one of them goes above 75. Our client does not want an alarm when a single CPU exceeds 75; he wants it only when the average across all CPUs exceeds 75 (that is, the sum of all CPU values divided by 32), which is called high aggregate CPU. Can someone please guide us on how to configure this in watches?

     

    regards

    Roopesh



  • 2.  Re: Dual CPU alarm using Watches

    Broadcom Employee
    Posted Mar 09, 2017 10:04 AM

    Maybe someone else has a better idea, but at the very least you should be able to specify each individual instance and divide by 32.

     

    I.e. (NRM_DeviceCPUUtilization.1 + NRM_DeviceCPUUtilization.2 + NRM_DeviceCPUUtilization.3 …..) / 32
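
    Typing 32 terms by hand is error-prone, so the expression text can be generated and then pasted into the watch. The sketch below is only a string builder, not part of the product. The helper name build_average_expression is hypothetical, and it assumes the CPU instances are numbered 1 to N, as they are in this thread.

```python
def build_average_expression(attribute: str, instance_count: int) -> str:
    """Return '(attr.1 + attr.2 + ... + attr.N) / N' as plain text.

    Hypothetical helper: it only builds the expression string and
    assumes instances are numbered 1..N.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    terms = [f"{attribute}.{i}" for i in range(1, instance_count + 1)]
    # Parentheses around the whole sum so the division applies to the total.
    return f"({' + '.join(terms)}) / {instance_count}"


expression = build_average_expression("NRM_DeviceCPUUtilization", 32)
print(expression)
```

    The outer parentheses are added by the helper so that the division applies to the whole sum rather than to the last term only.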

     

    Cheers

    Jay



  • 3.  Re: Dual CPU alarm using Watches

    Posted Mar 13, 2017 03:38 AM

    Hi Jason,

     

    When we tried that, we got a different value. We created the watch with the expression below:

     

    NRM_DeviceCPUUtilization.1 + NRM_DeviceCPUUtilization.2  + NRM_DeviceCPUUtilization.3  + NRM_DeviceCPUUtilization.4  + NRM_DeviceCPUUtilization.5  + NRM_DeviceCPUUtilization.6  + NRM_DeviceCPUUtilization.7  + NRM_DeviceCPUUtilization.8  + NRM_DeviceCPUUtilization.9  + NRM_DeviceCPUUtilization.10  + NRM_DeviceCPUUtilization.11  + NRM_DeviceCPUUtilization.12  + NRM_DeviceCPUUtilization.13  + NRM_DeviceCPUUtilization.14  + NRM_DeviceCPUUtilization.15  + NRM_DeviceCPUUtilization.16  + NRM_DeviceCPUUtilization.17  + NRM_DeviceCPUUtilization.18  + NRM_DeviceCPUUtilization.19  + NRM_DeviceCPUUtilization.20  + NRM_DeviceCPUUtilization.21  + NRM_DeviceCPUUtilization.22  + NRM_DeviceCPUUtilization.23  + NRM_DeviceCPUUtilization.24  + NRM_DeviceCPUUtilization.25  + NRM_DeviceCPUUtilization.26  + NRM_DeviceCPUUtilization.27  + NRM_DeviceCPUUtilization.28  + NRM_DeviceCPUUtilization.29  + NRM_DeviceCPUUtilization.30  + NRM_DeviceCPUUtilization.31  +  NRM_DeviceCPUUtilization.32 / 32  

     

    After saving, the expression is displayed as follows:

     

    ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( NRM_DeviceCPUUtilization.1 + NRM_DeviceCPUUtilization.2 ) + NRM_DeviceCPUUtilization.3 ) + NRM_DeviceCPUUtilization.4 ) + NRM_DeviceCPUUtilization.5 ) + NRM_DeviceCPUUtilization.6 ) + NRM_DeviceCPUUtilization.7 ) + NRM_DeviceCPUUtilization.8 ) + NRM_DeviceCPUUtilization.9 ) + NRM_DeviceCPUUtilization.10 ) + NRM_DeviceCPUUtilization.11 ) + NRM_DeviceCPUUtilization.12 ) + NRM_DeviceCPUUtilization.13 ) + NRM_DeviceCPUUtilization.14 ) + NRM_DeviceCPUUtilization.15 ) + NRM_DeviceCPUUtilization.16 ) + NRM_DeviceCPUUtilization.17 ) + NRM_DeviceCPUUtilization.18 ) + NRM_DeviceCPUUtilization.19 ) + NRM_DeviceCPUUtilization.20 ) + NRM_DeviceCPUUtilization.21 ) + NRM_DeviceCPUUtilization.22 ) + NRM_DeviceCPUUtilization.23 ) + NRM_DeviceCPUUtilization.24 ) + NRM_DeviceCPUUtilization.25 ) + NRM_DeviceCPUUtilization.26 ) + NRM_DeviceCPUUtilization.27 ) + NRM_DeviceCPUUtilization.28 ) + NRM_DeviceCPUUtilization.29 ) + NRM_DeviceCPUUtilization.30 ) + NRM_DeviceCPUUtilization.31 ) + ( NRM_DeviceCPUUtilization.32 / 32 ) )

     

    But the value of this expression differs from NRM_DeviceCPUUtilization (that is, the average CPU utilisation). The watch value currently shows as 30, while NRM_DeviceCPUUtilization is just 1. We don't know where the value 30 is coming from.

     

    Regards

    Roopesh



  • 4.  Re: Dual CPU alarm using Watches
    Best Answer

    Broadcom Employee
    Posted Mar 13, 2017 09:59 AM

    This is mathematical order of operations. All the additions need to happen first, and then the division by 32. In your expression, only NRM_DeviceCPUUtilization.32 is divided by 32, and that quotient is then added to the sum of all the other NRM_DeviceCPUUtilization.# values, which does not produce a useful result. What Jason had above is what you should have:

     

    (NRM_DeviceCPUUtilization.1 + NRM_DeviceCPUUtilization.2  + NRM_DeviceCPUUtilization.3  + NRM_DeviceCPUUtilization.4  + NRM_DeviceCPUUtilization.5  + NRM_DeviceCPUUtilization.6  + NRM_DeviceCPUUtilization.7  + NRM_DeviceCPUUtilization.8  + NRM_DeviceCPUUtilization.9  + NRM_DeviceCPUUtilization.10  + NRM_DeviceCPUUtilization.11  + NRM_DeviceCPUUtilization.12  + NRM_DeviceCPUUtilization.13  + NRM_DeviceCPUUtilization.14  + NRM_DeviceCPUUtilization.15  + NRM_DeviceCPUUtilization.16  + NRM_DeviceCPUUtilization.17  + NRM_DeviceCPUUtilization.18  + NRM_DeviceCPUUtilization.19  + NRM_DeviceCPUUtilization.20  + NRM_DeviceCPUUtilization.21  + NRM_DeviceCPUUtilization.22  + NRM_DeviceCPUUtilization.23  + NRM_DeviceCPUUtilization.24  + NRM_DeviceCPUUtilization.25  + NRM_DeviceCPUUtilization.26  + NRM_DeviceCPUUtilization.27  + NRM_DeviceCPUUtilization.28  + NRM_DeviceCPUUtilization.29  + NRM_DeviceCPUUtilization.30  + NRM_DeviceCPUUtilization.31  +  NRM_DeviceCPUUtilization.32) / 32

     

    The parentheses ensure that the addition happens before the division.
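
    To make the difference concrete, here is a small Python sketch of the same arithmetic. The CPU readings are made up (every CPU at 1%), not taken from the server in this thread. It only illustrates the precedence rule and does not reproduce how the watch itself evaluates.

```python
# Hypothetical readings: 32 CPUs, each at 1% utilisation.
cpu_values = [1.0] * 32

# Without outer parentheses, only the last term is divided by 32:
# v1 + v2 + ... + v31 + (v32 / 32)
without_parentheses = sum(cpu_values[:-1]) + cpu_values[-1] / 32

# With parentheses, the whole sum is divided by 32:
# (v1 + v2 + ... + v32) / 32
with_parentheses = sum(cpu_values) / 32

print(without_parentheses)  # 31.03125
print(with_parentheses)     # 1.0
```

    With these made-up readings the unparenthesised form comes out near 31 while the true average is 1, which is the same kind of gap as the 30 versus 1 reported above.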

     

    -Rob