Hyperic Availability metric alert

  • 1.  Hyperic Availability metric alert

    Posted Mar 07, 2013 05:33 AM
    Hi,
    I have a problem with Hyperic event alerts on Availability.
    I simply want to alert and email when a server becomes unavailable.  I am using the standard Availability metric that Hyperic collects, and the alert I set up triggers when Availability equals 0%.  I am getting many false positives: I am alerted that Availability is 0%, but when I look there are no problems on the monitored server, and even the Availability graph shows no evidence of 0% Availability.
    I have tried configuring the alert to trigger only when the 0% event occurs 20 times in 1 hour, but even that still triggers falsely sometimes.  I have tried this in different environments and have the same issue everywhere.
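
    To make the "20 times in 1 hour" condition concrete, here is a minimal sketch of a sliding-window count. It only illustrates the idea; it is not how Hyperic implements alert conditions, and the class name, method name, and parameters are all hypothetical.

```python
from collections import deque


class WindowedAlert:
    """Fire only when `threshold` zero-availability samples fall inside
    the last `window_s` seconds. Hypothetical helper, not a Hyperic API."""

    def __init__(self, threshold=20, window_s=3600):
        self.threshold = threshold
        self.window_s = window_s
        self.down_times = deque()  # timestamps of samples reporting 0%

    def record(self, timestamp_s, availability):
        """Record one sample and return True if the alert should fire."""
        if availability == 0.0:
            self.down_times.append(timestamp_s)
        # Drop zero-availability samples that have aged out of the window.
        while self.down_times and timestamp_s - self.down_times[0] >= self.window_s:
            self.down_times.popleft()
        return len(self.down_times) >= self.threshold
```

    For example, `WindowedAlert(threshold=3, window_s=60)` fires on the third 0% sample seen within 60 seconds and stops firing once those samples age out.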
    Is there an easier way to detect when a server has gone down and/or become unavailable?

    Thanks



  • 2.  RE: Hyperic Availability metric alert
    Best Answer

    Posted Mar 07, 2013 03:34 PM

    Check the time synchronisation between the server and the agent.

    Hyperic is very (IMHO overly) sensitive to badly synchronised clocks, and false alerts on Availability metrics without any evidence for an outage on the server are the most frequent symptom.



  • 3.  RE: Hyperic Availability metric alert

    Posted Mar 08, 2013 04:33 PM

    True about clocks.
    You can check the offset under Administration -> HQ Health -> Agents (tab) -> Time Offset (column).

    You can also enable debug logging on the agent and check the error messages it reports.



  • 4.  RE: Hyperic Availability metric alert

    Posted Mar 11, 2013 10:59 PM

    Thanks for the responses.  But my Hyperic server is, and needs to be, in a different time zone than the monitored client servers.  All these servers use the ntpd time sync daemon.

    If the Availability metric will not work in this scenario, is there a different way of alerting when a server is not reachable? For example, when the server has crashed, the Hyperic client software has crashed, or the network has gone down.

    Thanks



  • 5.  RE: Hyperic Availability metric alert

    Posted Mar 14, 2013 01:03 AM

    Hi Peter,

    What sort of time offsets should I be looking for?



  • 6.  RE: Hyperic Availability metric alert

    Posted Mar 12, 2013 07:14 AM

    Hi

    That sounds strange, as this simple scenario normally works.

    Please provide me with the following details:

    - Which Hyperic version are you using?

    - Is it EE or OS?

    - I understand that your Hyperic server is in time zone X while your monitored applications are in time zone Y, correct?

    - Are they all using the same NTP server?

    - Which O/S is the Hyperic server installed on, and which O/S are the monitored apps running on?

    Thanks

    Yoav, Hyperic QE



  • 7.  RE: Hyperic Availability metric alert

    Posted Mar 12, 2013 01:13 PM

    Hi

    If possible, please provide 2 screenshots:

    1) the resource you are creating the alert for

    Image:1.jpg

    2) the alert definition page

    Image:2.jpg

    I attached similar screenshots from my server so you can see what to capture.

    Thanks

    Nimrod



  • 8.  RE: Hyperic Availability metric alert

    Posted Mar 12, 2013 01:15 PM

    Hi


    If possible, please provide 3 screenshots:

    1) the resource you are creating the alert for

    2) the alert definition page

    3) the triggered alert screen

    I attached similar screenshots from my server so you can see what to capture.

    Thanks

    Nimrod



  • 9.  RE: Hyperic Availability metric alert

    Posted Mar 14, 2013 03:43 AM

    Time zones don't matter here.
    I believe Hyperic stores all timestamps internally in UTC.
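
    To illustrate why the time zone itself is harmless while clock skew is not, here is a small sketch using only the Python standard library. The two fixed UTC offsets and the 90-second skew are made-up values for illustration; nothing here is Hyperic code.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zones standing in for the server and agent locations.
server_tz = timezone(timedelta(hours=-5))
agent_tz = timezone(timedelta(hours=10))


def epoch_ms(dt):
    """Milliseconds since the Unix epoch (UTC) for an aware datetime."""
    return int(dt.timestamp() * 1000)


# One instant, viewed from two different time zones.
instant = datetime(2013, 3, 14, 3, 43, tzinfo=timezone.utc)
server_local = instant.astimezone(server_tz)
agent_local = instant.astimezone(agent_tz)

# Wall-clock hours differ, but the underlying instant is identical.
same_instant = epoch_ms(server_local) == epoch_ms(agent_local)

# A badly synchronised agent clock shifts the instant itself.
skewed_agent = agent_local + timedelta(seconds=90)
skew_ms = epoch_ms(skewed_agent) - epoch_ms(server_local)
```

    Here `same_instant` is true even though the two wall clocks read different hours, while the skewed clock is 90000 ms away from the server regardless of zone.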

    If you see values greater than 1 minute (60000 ms) in the Time Offset column, you should be worried.

    I try to keep them under 500 ms. Anything greater than that indicates that something is going wrong or is about to.
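
    As a rough sketch, those two rules of thumb can be applied to offsets read by hand from the Time Offset column. The function name, labels, host names, and sample values below are hypothetical; the 500 ms and 60000 ms thresholds are just the ones mentioned above, not official limits.

```python
def classify_offset(offset_ms, target_ms=500, worry_ms=60_000):
    """Classify an agent time offset (in ms) against rule-of-thumb thresholds."""
    magnitude = abs(offset_ms)  # an offset may be negative; only its size matters
    if magnitude > worry_ms:
        return "critical"
    if magnitude > target_ms:
        return "warning"
    return "ok"


# Hypothetical offsets copied from the HQ Health page, in milliseconds.
offsets = {"web01": 120, "db01": -750, "win2003": 95_000}
report = {host: classify_offset(ms) for host, ms in offsets.items()}

for host, status in report.items():
    print(f"{host}: {offsets[host]} ms -> {status}")
```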

    More info can be found in official Hyperic documentation - http://pubs.vmware.com/vfabric5/index.jsp?topic=/com.vmware.vfabric.hyperic.4.6/Troubleshoot_Agent_and_Server_Problems.html

    One more thing: in my experience, Windows servers (especially Windows Server 2003) tend to have larger offsets.

    Installing and configuring NTP or the Windows Time service usually solves these time-sync problems.

    Here is a sample of offset values you should see -



  • 10.  RE: Hyperic Availability metric alert

    Posted Mar 14, 2013 03:47 AM

    Thank you Amurty, my Hyperic clients and servers were not in sync. I've fixed the problem by configuring NTP on both the servers and the clients.

    My time offsets are much more reasonable now.



  • 11.  RE: Hyperic Availability metric alert

    Posted Apr 15, 2013 12:26 AM

    Thanks everyone,

    It seems I did have the time sync issue.  I had not installed NTP on the actual Hyperic server; I only had it installed on the clients.  All good now.