DX Application Performance Management

Disconnected Historical Agent Limit

1. Disconnected Historical Agent Limit

2 Recommend
Billy Cole
Posted Aug 16, 2018 10:31 AM

APM Version : 10.5.2.92
Total Agents: 1580
Number of Collectors: 9

On the APM Status Console, we have an active clamp on one of our collectors:
introscope.enterprisemanager.disconnected.historical.agent.limit collector009@5001 400 400 5:36:27 07/22/18
The number of agents on each of the nine collectors varies between 118 and 250. Typically once a month, APM is restarted to pick up OS patches. During startup the agents jump between collectors, then level out after a few hours of load balancing. The problem is that during this process a collector may have more than 400 agents connect and disconnect before the agents are load balanced.
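To illustrate how a collector can exceed the limit even though no collector holds more than about 250 agents at once, here is a minimal Python sketch. It assumes agents move at random between collectors for a few rounds, which is only a stand-in for the MOM's real load balancing. The function `simulate_rebalance` and the parameters `rounds` and `move_fraction` are hypothetical, not APM settings.

```python
import random


def simulate_rebalance(num_agents=1580, num_collectors=9, rounds=5,
                       move_fraction=0.3, seed=0):
    """Return (connected, disconnected) agent counts per collector.

    Random moves are an assumption standing in for real load balancing.
    """
    rng = random.Random(seed)
    # Initial placement: each agent lands on a random collector at startup.
    current = [rng.randrange(num_collectors) for _ in range(num_agents)]
    # seen[c] holds every agent that has ever connected to collector c.
    seen = [set() for _ in range(num_collectors)]
    for agent, collector in enumerate(current):
        seen[collector].add(agent)

    for _ in range(rounds):
        for agent in range(num_agents):
            if rng.random() < move_fraction:
                current[agent] = rng.randrange(num_collectors)
                seen[current[agent]].add(agent)

    connected = [0] * num_collectors
    for collector in current:
        connected[collector] += 1
    # Agents a collector has seen but that are now connected elsewhere.
    disconnected = [len(seen[c]) - connected[c] for c in range(num_collectors)]
    return connected, disconnected


connected, disconnected = simulate_rebalance()
for c, (live, gone) in enumerate(zip(connected, disconnected), start=1):
    print(f"collector {c}: connected={live} disconnected-historical={gone}")
```

Comparing each `disconnected` entry with the limit of 400 shows whether churn alone could trip the clamp under these assumed parameters.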

Question time:

1. What is the behavior of a collector that has a historical.agent.limit clamp (400)?

2. Depending on the resulting behavior of a collector, is this really an error level clamp/limit?

3. Again, depending on the resulting behavior, is there a way to address this issue without resorting to assigning agents to a specific collector or collectors, which would artificially divide the agents into groups of fewer than 400 per collector?

4. What is the APM cluster impact when one or more of the collectors are reporting the disconnected.historical.agent.limit? In a nutshell, when do I as the APM admin need to take preventative action?
2. Re: Disconnected Historical Agent Limit
Best Answer

3 Recommend
Broadcom Employee

Thomas Wyszomierski
Posted Aug 16, 2018 12:42 PM

The following reference might be helpful. The section "introscope.enterprisemanager.disconnected.historical.agent.limit" describes why this clamp occurs and how the EM responds, and it has some suggestions on how to address it.
apm-events-thresholds-config.xml - CA Application Performance Management - 10.5 - CA Technologies Documentation

Just some suggestions to check:

"If there is no historical agent that the Enterprise Manager can automatically unmount, this means that CA APM users mounted manually all the disconnected historical agents. The Enterprise Manager never tries to unmount a disconnected historical agent that a CA APM user mounted manually."

Would you have mounted any agents manually?

"Workstation displays an error message instructing the user to unmount some historical agents to make room to mount new historical agents."
Would you see such a message?

My suggestion would be to try to use introscope.enterprisemanager.loadbalancing.staywithhistoricalcollector=always to prevent the clamp issue during the startup.

I have covered the most common issues and recommendations regarding clustering in this KB
Introscope Enterprise Manager Troubleshooting and - CA Knowledge

See point # 15 that covers loadbalancing
3. Re: Disconnected Historical Agent Limit

0 Recommend
Billy Cole
Posted Aug 16, 2018 03:13 PM

I read through the doc and decided to look for these disconnected but still mounted agents. I would typically go to the custom metric host to locate any agent that is grayed out.

*SuperDomain*|Custom Metric Host (Virtual)|Custom Metric Process (Virtual)|Custom Metric Agent (Virtual) (collector009.aessuccess.org@5001)|Agents

To my surprise, no agents are grayed out. I then clicked on the Agents folder and searched for "ConnectionStatus"; all agents have a value of 1.

I would expect to see a hundred or so mounted but disconnected (grayed out) agents under the Agents folder, but I don't. The collector has around 177 total agents, but the APM Status Console is still reporting the active clamp:

introscope.enterprisemanager.disconnected.historical.agent.limit collector@5001 400 400 5:36:27 07/22/18

I couldn't find anywhere on the custom metric host where there was a metric that I could use to gauge against the APM console.

Looking at the clamp line, the clamp occurred on July 22. Our agents unmount after 24 hours of being disconnected.

So it looks like a ghost message that is stuck in the APM Status Console, since I cannot see any sign that the collector associated with the active clamp actually has that condition.
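The reasoning above (clamp raised on July 22, agents unmounted 24 hours after disconnecting) can be sketched as a small check. This is only an illustration: the meaning of each column in the clamp line is an assumption (clamp name, EM, threshold, value, time, date as month/day/year), and `parse_clamp_line` and `looks_stale` are hypothetical helpers, not part of APM.

```python
from datetime import datetime, timedelta


def parse_clamp_line(line):
    """Split a Status Console clamp row into fields (column meanings assumed)."""
    name, em, threshold, value, time_part, date_part = line.split()
    raised = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%y %H:%M:%S")
    return {
        "clamp": name,
        "em": em,
        "threshold": int(threshold),  # assumed: configured limit
        "value": int(value),          # assumed: count when the clamp fired
        "raised": raised,
    }


def looks_stale(clamp, now, unmount_delay_minutes=1440):
    """True if the clamp is older than the unmount delay (24 hours here)."""
    return now - clamp["raised"] > timedelta(minutes=unmount_delay_minutes)


line = ("introscope.enterprisemanager.disconnected.historical.agent.limit "
        "collector009@5001 400 400 5:36:27 07/22/18")
clamp = parse_clamp_line(line)
print(clamp["em"], clamp["raised"], looks_stale(clamp, datetime(2018, 8, 16, 15, 13)))
```

A clamp that is weeks older than the unmount delay, on a collector with no disconnected mounted agents, is a candidate for a display problem rather than a live condition.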

Does anyone know how to kick the APM Status Console so it will check again and clear the message?
4. Re: Disconnected Historical Agent Limit

2 Recommend
Legacy User
Posted Aug 16, 2018 04:39 PM

Just some suggestions to check:

"If there is no historical agent that the Enterprise Manager can automatically unmount, this means that CA APM users mounted manually all the disconnected historical agents. The Enterprise Manager never tries to unmount a disconnected historical agent that a CA APM user mounted manually."

Would you have mounted any agents manually?

"Workstation displays an error message instructing the user to unmount some historical agents to make room to mount new historical agents."
Would you see such a message?

Francis
5. Re: Disconnected Historical Agent Limit

0 Recommend
Billy Cole
Posted Aug 17, 2018 06:44 AM

Thank you Francis.

Only two people have administration rights, which include mounting and unmounting agents. Neither of us has mounted or unmounted any agents.

We haven't seen any messages on unmounting historic agents to make room.

Thank you again,

Billy
6. Re: Disconnected Historical Agent Limit

3 Recommend
Broadcom Employee

Sergio Morales Correa
Posted Aug 17, 2018 05:23 AM

Hi Billy,
My suggestion would be to try to use introscope.enterprisemanager.loadbalancing.staywithhistoricalcollector=always to prevent the clamp issue during the startup.

I have covered the most common issues and recommendations regarding clustering in this KB
Introscope Enterprise Manager Troubleshooting and - CA Knowledge

See point # 15 that covers loadbalancing

I hope this helps,
Regards,
Sergio
7. Re: Disconnected Historical Agent Limit

0 Recommend
Billy Cole
Posted Aug 17, 2018 07:02 AM

Thank you Sergio.

I will be going through the KB line for line against our new 10.5.2 cluster.

On the load balancing, I can see where setting staywithhistoricalcollector to always would help, but currently the collector that has the warning about the historical agent limit does not appear to have any disconnected mounted agents. So the APM Status Console appears to think there is a clamp on historical agents when there is none.

How does setting it to always affect failures? If one of the collectors were to fail or be unable to accept agents, would the agents move to a different collector until the failed collector returned to service?

The message is from July 22, and I can reason that it was due to a cluster restart. I would expect that after 24 hours the disconnected agents would unmount because of introscope.enterprisemanager.autoUnmountDelayInMinutes=1440, and I'm guessing that the historical.agent.limit shouldn't count unmounted agents.

On a cluster restart, I would expect to see quite a few APM Status Console messages, which should clear once the cluster has become balanced, plus 24 hours to unmount the disconnected agents.

Thank you,

Billy
8. Re: Disconnected Historical Agent Limit

1 Recommend
Broadcom Employee

Lynn Williams
Posted Aug 19, 2018 06:55 PM

Hi Billy,
Sergio also just updated this older KB covering that property: Tip for loadbalancing configuration when upgrading - CA Knowledge
Sergio can confirm but I believe introscope.enterprisemanager.loadbalancing.staywithhistoricalcollector=always means that:
- as long as the Collector is up, the agent will wait for a connection to it even if it is overloaded.
- if the Collector is down the agent will be redirected to another Collector
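The two cases above can be written as a small decision function. This is a sketch of the behavior as described in this post, not the Enterprise Manager's actual implementation. The function `choose_collector` and its inputs are hypothetical, and picking the least-loaded collector as the fallback is an assumption, since the post only says the agent is redirected to another Collector.

```python
def choose_collector(historical, status, load):
    """Pick a collector for an agent under the behavior described above.

    historical: name of the agent's historical collector
    status:     dict of collector name -> True if the collector is up
    load:       dict of collector name -> current agent count
    """
    # Case 1: the historical collector is up, so the agent stays with it,
    # even if it is overloaded.
    if status.get(historical, False):
        return historical
    # Case 2: the historical collector is down, so redirect the agent.
    # Choosing the least-loaded running collector is an assumption.
    candidates = [name for name, up in status.items() if up]
    if not candidates:
        return None  # nothing is up; the agent has nowhere to go yet
    return min(candidates, key=lambda name: load.get(name, 0))


status = {"collector001": True, "collector002": True, "collector009": False}
load = {"collector001": 250, "collector002": 118, "collector009": 0}
print(choose_collector("collector001", status, load))
print(choose_collector("collector009", status, load))
```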

Hope that helps

Regards,

Lynn
9. Re: Disconnected Historical Agent Limit

0 Recommend
Billy Cole
Posted Aug 20, 2018 01:41 PM

Thank you Lynn.

Now I'm a bit confused about how this setting might help in my case, since at least once a month all collectors are stopped, which would trigger the second clause. Then, while the collectors are starting, each agent will more than likely connect to the collector it is trying from its original collector list, from before the collectors/MOM were shut down (MOM first, so that the collector list is not updated).

Could it be that this specific APM Status Console message/alert has the alert trigger set to "Whenever Severity Increases" and not to "Whenever Severity Changes", and so is not clearing the active clamp? I do not see any metrics showing that the collector has more than a few hundred active agents, and I don't see any metric on the custom metric host that might be a historical agent count. But then again, I could be missing the metric driving the active clamp alert.

Regards,

Billy
10. Re: Disconnected Historical Agent Limit

0 Recommend
Broadcom Employee

Lynn Williams
Posted Aug 20, 2018 09:07 PM

Hi Billy,
Tagging Sergio Morales for any more input he may have.
I am thinking that even though agents might initially get to their preferred Collector during the startup process, the suggested setting could still help to avoid agents subsequently being moved around Collectors as the MOM tries to load balance the metric load across the Collectors during the dynamic startup period.
Regarding the alert not clearing from the APM Status Console, perhaps you can create a support case so we can research in more detail why the alert is not clearing.

Thanks

Lynn
11. Re: Disconnected Historical Agent Limit

1 Recommend
Billy Cole
Posted Aug 21, 2018 07:57 AM

Thank you Lynn.

I have opened support case 01171999
12. Re: Disconnected Historical Agent Limit

1 Recommend
Broadcom Employee

Hallett German
Posted Aug 17, 2018 08:33 AM

Dear Billy:
It is always great to see a good conversation taking place. I combined Tom's, Sergio's, and Francis's responses into one response.

Thanks
Hal
13. Re: Disconnected Historical Agent Limit

3 Recommend
Billy Cole
Posted Sep 18, 2018 10:49 AM

Support Case: 01171999
APM Status Console - reporting "disconnected historic agent limit" - active clamp

We found that there was a reporting/refresh issue with the APM Status Console: when agents stopped reporting and there were more than the out-of-the-box limit of 400, the active clamp alert would appear and would not clear even after the agents had unmounted.

1. We deployed a revised "/product/enterprisemanager/plugins/com.wily.introscope.em_10.5.2.jar".

2. We set the "introscope.enterprisemanager.disconnected.historical.agent.limit" threshold to value="1" in apm-events-thresholds-config.xml.

3. We shut down an EPAgent.

4. We manually unmounted the agent.

5. After a short while, the APM Status Console cleared the active clamp notice.

Hope this helps,

Billy
14. Re: Disconnected Historical Agent Limit

0 Recommend
Broadcom Employee

Lynn Williams
Posted Sep 18, 2018 06:18 PM

Thank you, Billy Cole, for letting the Community know.
