So I have a question. We were having issues with both the cdm probe and the ntevl probe detecting whether a machine was rebooted. UIM support stated that yes, there were 'defects' in each probe, and we are still not 100% sure that we can rely on either of these probes to detect reboot scenarios.
Yesterday a request introduced me to the QoS_Computer_Uptime metric that is collected by the cdm probe. That got me thinking that we could use this value to determine whether a machine has been rebooted. Since we have cdm deployed everywhere and are already collecting this value, I wanted to know if, and how exactly, I would set up a watcher that looks at ALL new QOS_COMPUTER_UPTIME entries being added to the database, for each robot, and have the relevant part of UIM (I'm assuming the SLM manager) generate an alert if the value is < 300s. A value that low means the machine was rebooted within the last 5 minutes, which would give us an accurate indication that the machine rebooted.
I don't know the SLM tool very well. I primarily use it to check QoS metrics and delete QoS data.
I was playing around in it all morning and can't figure out if this is possible. Is there a way to define a global QoS watcher on this one specific metric and set up a rule that says: if QoS_Computer_Uptime < 300, then trigger an alert?
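To make the rule concrete, here is a minimal sketch of the check in Python. It is not a UIM or SLM feature and does not call any UIM API; the `UptimeSample` type, the robot names and the sample values are all made up for illustration, and it assumes the uptime value is reported in seconds.

```python
from dataclasses import dataclass

# Threshold from the rule above: 300 seconds = 5 minutes.
REBOOT_THRESHOLD_S = 300


@dataclass
class UptimeSample:
    """Hypothetical stand-in for one QOS_COMPUTER_UPTIME entry."""
    robot: str
    uptime_s: float  # assumed to be in seconds


def recently_rebooted(samples, threshold_s=REBOOT_THRESHOLD_S):
    """Return the sorted names of robots with an uptime sample below the threshold."""
    return sorted({s.robot for s in samples if s.uptime_s < threshold_s})


# Made-up sample data, one entry per robot.
samples = [
    UptimeSample("web01", 86400.0),  # up for a day: no alert
    UptimeSample("db01", 120.0),     # up for 2 minutes: alert
    UptimeSample("app01", 299.0),    # just under the threshold: alert
    UptimeSample("app02", 300.0),    # exactly at the threshold: no alert
]

for robot in recently_rebooted(samples):
    print(f"ALERT: {robot} rebooted within the last {REBOOT_THRESHOLD_S}s")
```

One thing the sketch makes visible: a rule like this can only catch a reboot if an uptime sample is actually collected within 300 seconds of the machine coming back up, so the QoS collection interval would need to be 5 minutes or shorter, or the threshold would need to be raised to match the interval.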
I was trying to set this rule up using the old SLM fat client, but when I looked at the help, some of the examples refer to features or menu entries that no longer exist.
1. Is this possible?
2. If yes, can this be set up at the Domain level rather than on each and every Hub or Robot?
3. If yes, how and where exactly do I do this? Does anyone have any tutorials on setting something like this up?
4. What exactly is this task called? Would this be considered an SLA, a QoS Monitor, or something else?
We use ntperf to collect the systemuptime (among other values).
We use it only for reporting, but the probe has an alarm option.
You could enter a small value as the alarm threshold; an uptime below it would be an indicator that the computer has rebooted.
CDM also has the option to alarm on reboot, and this works for us. We also have a low hub check interval, which also helps detect reboots.
Thanks for the suggestions everyone.