DX Infrastructure Manager


Server Hung State -What is the monitoring Solution ?

  • 1.  Server Hung State -What is the monitoring Solution ?

    Posted 04-03-2018 11:28 AM

    Hi All,

     

    What is the solution for proactively monitoring a server hung state (mostly Windows)? So far I can't find any solution. The server responds to ICMP, all services are running fine, and there is no spike in performance either. How do we overcome this? Please share your experience with this.



  • 2.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-03-2018 11:38 AM

    What do you mean by "hung state"? What is the observable state? What happens? If services are still up, that means service manager is still responding, and ICMP up means the network interface is still responding...



  • 3.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-03-2018 11:43 AM

    Sometimes Windows sits in a loading state, and users can't log in to the server.



  • 4.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-04-2018 02:04 AM

    I may not be right about this, but how about net_connect-based port monitoring over TCP 3389 for Windows?
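
To illustrate what a TCP check on port 3389 actually tests, here is a minimal Python sketch. It is not the net_connect probe; the function name `tcp_port_open` and its parameters are hypothetical. It only attempts a TCP connect with a timeout, so a success shows that something is accepting connections on the port, not that a user can complete a login.

```python
import socket


def tcp_port_open(host: str, port: int = 3389, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. A successful connect proves
    that a listener accepted the connection, nothing more.
    """
    try:
        # create_connection resolves the host and connects, honoring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A check like this can catch a dead RDP listener, but a server that still accepts connections while the logon process hangs would pass it.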



  • 5.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-04-2018 03:15 AM

    No, Yu, it doesn't help, because ping and the services stay active the whole time.



  • 6.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-04-2018 07:19 AM

    There might be something in the event log, so that is worth monitoring with ntevl.

    Is the Terminal Services/Remote Desktop service running? You can check that with ntservices.

    Monitor C: drive disk usage with cdm to check for zero free space.

    Once you find out what goes wrong when this happens, you will be in a better position to set up the monitoring.

    Hope this helps
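
As a rough illustration of the free-space check suggested above, here is a small Python sketch. It is not the cdm probe; the function names and the 5% default threshold are assumptions chosen for the example. It uses only the standard library call `shutil.disk_usage`.

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the fraction of free space (0.0 to 1.0) on the volume holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Guard against pseudo file systems that report a zero total size.
        return 0.0
    return usage.free / usage.total


def is_low_on_space(path: str, min_free_fraction: float = 0.05) -> bool:
    """Return True if free space is below min_free_fraction (assumed threshold)."""
    return free_fraction(path) < min_free_fraction
```

On a Windows server the path would be something like `"C:\\"`; the threshold is a judgment call and would need tuning per system.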



  • 7.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-04-2018 08:48 AM

    The approach we take is to monitor for the lack of something expected happening in a QoS table. 

     

    Essentially it's a fancy heartbeat, and there are a couple of ways of going about it.

     

    The approach is to use logmon (or one of the other probes) to run a script/test that exercises one of the functions that your end user would perceive as "hanging": a change in log file size, a free disk space measurement, a URL GET, etc. Or even just a batch file that posts a QoS record periodically.

     

    Then on your central hub, have a scheduled job query that QoS table for robots that haven't reported a QoS update recently.

     

    It's not especially elegant but it does work.

     

    -Garin 
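
The "who has not reported recently" half of the heartbeat approach described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the robot names, the `last_seen` mapping, and the function `find_stale_robots` are hypothetical, and the step that reads the latest sample times out of the QoS table is deliberately left out because the thread does not describe the schema.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List


def find_stale_robots(
    expected: Iterable[str],
    last_seen: Dict[str, datetime],
    now: datetime,
    max_age: timedelta,
) -> List[str]:
    """Return expected robots with no heartbeat newer than max_age.

    last_seen maps robot name -> time of its most recent heartbeat record
    (assumed to have been queried from the QoS data beforehand).
    Robots that never reported at all are treated as stale too.
    """
    stale = []
    for robot in expected:
        ts = last_seen.get(robot)
        if ts is None or now - ts > max_age:
            stale.append(robot)
    return stale


if __name__ == "__main__":
    now = datetime(2018, 4, 4, 12, 0, tzinfo=timezone.utc)
    seen = {
        "web01": now - timedelta(minutes=2),   # reported recently
        "db01": now - timedelta(minutes=30),   # heartbeat too old
    }
    # "app01" is expected but has never reported.
    print(find_stale_robots(["web01", "db01", "app01"], seen, now, timedelta(minutes=10)))
    # prints ['db01', 'app01']
```

Passing an explicit list of expected robots matters here: a robot that hung before ever posting a record would otherwise be invisible to the check.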



  • 8.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-04-2018 10:44 AM

    Thanks for the suggestion, Garin! But again, this means we have to build a custom solution for something that should be addressed as a default capability.



  • 9.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-04-2018 10:38 AM

    Another way could be to have a probe which issues a certain callback on a couple of probes installed on the robot, to see if you get a result. If you don't, that might mean it's frozen.
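
The callback idea above boils down to "call several things with a timeout and see whether any of them answer". A generic Python sketch of that pattern follows. It does not use any real probe API: each callback here is just a hypothetical zero-argument callable standing in for a request to a probe on the robot, and the names `check_callbacks` and `looks_frozen` are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Dict


def check_callbacks(
    callbacks: Dict[str, Callable[[], object]], timeout_s: float
) -> Dict[str, bool]:
    """Run each callback in its own thread; True means it returned in time.

    All callbacks start at once, so each one gets at least timeout_s
    before it is marked as unresponsive.
    """
    if not callbacks:
        return {}
    pool = ThreadPoolExecutor(max_workers=len(callbacks))
    futures = {name: pool.submit(fn) for name, fn in callbacks.items()}
    results = {}
    for name, future in futures.items():
        try:
            future.result(timeout=timeout_s)
            results[name] = True
        except FutureTimeout:
            results[name] = False  # no answer within the timeout
        except Exception:
            results[name] = False  # the callback itself failed
    # Do not block on callbacks that are still hanging.
    pool.shutdown(wait=False)
    return results


def looks_frozen(results: Dict[str, bool]) -> bool:
    """Treat the robot as possibly frozen only if every callback failed."""
    return bool(results) and not any(results.values())
```

Requiring all callbacks to fail before raising an alarm is a design choice to reduce false positives from a single busy probe; a stricter policy could alert on any failure.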



  • 10.  Re: Server Hung State -What is the monitoring Solution ?

    Posted 04-16-2018 09:52 AM

    Hi Issac08 Praveen Venkatesan, has your question been answered?