DX Application Performance Management

Field Pack: Script that checks for dropped agents 

10-07-2014 01:15 PM

**************************** What is this? ***************************

Created by: Fayaz Ghiasy, Alex York, Alex Schmid

 

This is a Python script that queries agents through the CLW to check for dropped agents. It compares a generated CurrentAgentList.txt against a MasterAgentList.txt and sends out an email listing the agents that are in MasterAgentList.txt but not in CurrentAgentList.txt. MasterAgentList.txt has to be updated manually.
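The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the attached script: the function names are hypothetical, and it assumes each list file holds one agent name per line.

```python
def read_agent_list(path):
    """Return the set of non-empty, stripped lines in an agent list file.

    Assumption: one agent name per line.
    """
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def find_dropped_agents(master_agents, current_agents):
    """Return, sorted, the agents in the master list that are missing
    from the current list."""
    return sorted(set(master_agents) - set(current_agents))


# Possible usage (file names taken from the description above):
#   master = read_agent_list("MasterAgentList.txt")
#   current = read_agent_list("CurrentAgentList.txt")
#   dropped = find_dropped_agents(master, current)
```

Using a set difference means agents that appear only in the current list (newly added agents) are ignored, which is why the master list still has to be updated by hand.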

 

***************************** Install instructions ****************************

Install Python version 3+

Edit the Python script to match your environment details:

  • CLW command
  • Email to/from
  • SMTP server

Run the CLW command from the script to generate CurrentAgentList.txt, then rename that file to MasterAgentList.txt to create the initial master list.
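As an illustration of the email settings listed above, the sketch below shows one way the to/from addresses and SMTP server could be wired up with Python's standard library. The server name, addresses, and function names are placeholders assumed for this example and are not taken from the attached script.

```python
import smtplib
from email.message import EmailMessage

# Placeholder values (assumptions): replace with your environment details.
SMTP_SERVER = "smtp.example.com"
MAIL_FROM = "apm-alerts@example.com"
MAIL_TO = "apm-team@example.com"


def build_dropped_agents_email(dropped_agents, sender=MAIL_FROM, recipient=MAIL_TO):
    """Build a plain-text email listing the dropped agents."""
    message = EmailMessage()
    message["Subject"] = f"{len(dropped_agents)} dropped agent(s) detected"
    message["From"] = sender
    message["To"] = recipient
    body = "The following agents are in the master list but not connected:\n"
    message.set_content(body + "\n".join(dropped_agents))
    return message


def send_email(message, smtp_server=SMTP_SERVER):
    """Send the message through an unauthenticated SMTP relay (assumed)."""
    with smtplib.SMTP(smtp_server) as connection:
        connection.send_message(message)
```

If your SMTP server requires TLS or authentication, the send step would need the corresponding smtplib calls added.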

 

Example usage:

Run the script daily using cron or Windows Task Scheduler to receive emails about agents that have dropped in the past 24 hours.

 

***************************** Support policy ****************************

Field pack - Unsupported

Attachment(s)
zip file
DroppedAgents.py.zip   864 B   1 version
Uploaded - 05-29-2019

Comments

11-19-2014 01:53 PM

Fred.K

That was my concern too, which is why the JS calc was created to address it.

 

The use case is that, because agents are load balanced, the ConnectionStatus metric could in fact trigger a false positive on the status of an agent.

The calculator mitigates that: it looks across the cluster to find the agent and reports back a single status value that you can alert on.

11-19-2014 01:26 PM

For a multi-collector JVM-down alert, I suggest following the thread "Java Agent Availability (up/down) status metric".

11-17-2014 01:16 PM

I'm not sure this addresses load balancing, does it?

11-14-2014 03:23 PM

You can do this without a script by creating an alarm based on a particular agent's connection status, then a second alarm based on the output values of the first alarm that triggers when that value is not 1. That way, when the connection status gets trimmed, you will still have an active alarm.

 

The second alarm can be a metric grouping with all the alarms for the individual agents you want to keep track of. The individual agent alarms should be set to accept connection information from any collector in your cluster and to require all of the connection information to be non-1 in order to alarm (that way you won't be alerted if an agent shifts to another collector).
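The "all collectors must be non-1" rule can be illustrated with a small sketch. This is only a model of the logic, not Workstation configuration: the function name and the dictionary of per-collector values are hypothetical, and it assumes a ConnectionStatus of 1 means the agent is connected.

```python
def should_alarm(status_by_collector):
    """Return True only when no collector reports ConnectionStatus == 1.

    status_by_collector maps a collector name to the latest value it
    reported for the agent (None if it reported nothing). An empty
    mapping means no collector sees the agent, so it counts as down.
    """
    return all(status != 1 for status in status_by_collector.values())


# Agent moved from collector01 to collector02: still connected, no alarm.
moved = {"collector01": 3, "collector02": 1}
# Agent is not connected on any collector: alarm.
gone = {"collector01": 3, "collector02": None}
```

Here should_alarm(moved) is False and should_alarm(gone) is True, which is the behaviour that avoids false positives when an agent is load balanced to another collector.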

 

Step by step instructions:
1. Create a metric grouping based on an agent connection value, e.g.:
(.*)\|Custom Metric Process \(Virtual\)\|Custom Metric Agent \(Virtual\) \(COLLECTORPREFIX.@PORTPREFIX.\)\|Agents\|SERVERNAME\|WTGManager\|WTGAgent:ConnectionStatus

2. Create a simple alarm that alerts when ALL items in the above metric grouping have a value other than 1

3. Create a metric grouping to pull in all the agent alarms:
(.*)\|Custom Metric Process \(Virtual\)\|Custom Metric Agent \(Virtual\)\|Alerts\|Bob:AgentConnect_(.*)

4. Create a simple alarm to alert based on ANY not having a value of 1 in the above metric grouping; set actions as desired.
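Before entering the expression from step 1 in the Workstation, it can help to check it against a sample metric path. The sketch below does that in Python under some assumptions: the helper name is hypothetical, the collector prefix, port prefix, server name, and sample path are made-up values, and Python's regex engine is assumed to treat this particular expression the same way the Workstation does.

```python
import re


def connection_status_pattern(collector_prefix, port_prefix, server_name):
    """Fill the COLLECTORPREFIX, PORTPREFIX and SERVERNAME placeholders
    of the step 1 expression and compile the result."""
    parts = [
        r"(.*)\|Custom Metric Process \(Virtual\)",
        r"\|Custom Metric Agent \(Virtual\) ",
        r"\(" + re.escape(collector_prefix) + ".@" + re.escape(port_prefix) + r".\)",
        r"\|Agents\|" + re.escape(server_name),
        r"\|WTGManager\|WTGAgent:ConnectionStatus",
    ]
    return re.compile("".join(parts))


# Hypothetical values for illustration only.
pattern = connection_status_pattern("collector0", "500", "appserver01")
sample_path = (
    "Custom Metric Host (Virtual)|Custom Metric Process (Virtual)"
    "|Custom Metric Agent (Virtual) (collector01@5001)"
    "|Agents|appserver01|WTGManager|WTGAgent:ConnectionStatus"
)
matches = pattern.fullmatch(sample_path) is not None
```

The single "." after each prefix matches exactly one character, so with these values collector01 through collector09 on ports 5000 through 5009 would all match, which is what lets one grouping cover every collector in the cluster.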

 

While the DroppedAgents.py script could obviously be more efficient for tracking a large number of agents, I have a personal preference for doing things within the Workstation so that it's more visible and accessible.

11-11-2014 07:48 PM

Nice one, guys!
