DX NetOps

 DX NetOps and Heap Memory (#2)

MARUBUN SUPPORT posted Sep 11, 2024 04:34 AM

Hi Team,

I have a question regarding "DX NetOps and Heap Memory" ( https://community.broadcom.com/question/dx-netops-and-heap-memory ), which you answered previously.

I would be grateful if you could answer the following question.


[Product]
DX NetOps PM 22.2.2
Linux RHEL7

[Questions]
As the heap space usage rate is approaching 90%, the end users checked the performance-spool capacity and found that it was over 10MB.
When they checked the performance-spool capacity again a few minutes later, it was below 10MB, so they believe that the capacity tends to fluctuate.

Could you please tell me how much free capacity the performance-spool needs in order to be in a safe zone, where the service will not be affected even if the DA service is restarted?


Best Regards,
Marubun Support

Broadcom Employee Jeffrey Pinard  Best Answer

1. About performance-spool file size

From your answer, the customer thinks that it is not possible to judge the possibility of data loss from the free space in the performance-spool.
Is this idea correct?

Correct. If performance-spool usage keeps going up, you more likely have a data loading issue than a missed poll issue, or the DA/DC are catching up after being disconnected. The DC self-monitoring metrics (metrics calc per sec, poll item count, etc.) are probably more helpful for determining whether overall polling is down.

2. "performance-spool should be at least 2-3G per MF"

(1) Based on this advice, does it mean that 400MB of performance-spool capacity is insufficient, and do they need to ensure that the performance-spool capacity is around 2 to 3 GB?

Correct, 400MB is nowhere near enough, especially if that space is shared with the rest of the IMDataAggregator directory. Logs and other files also take up space there. Plus, depending on scale or on how much catching up is needed, a single file being loaded into the DB could use 400MB by itself.

(2) If they need to increase the spool capacity to 2-3 GB, please tell me how to increase the spool capacity.

I would suggest 2G per MF polled as a minimum if performance-spool is its own directory, but more like 20-25G free if we're talking about all of IMDataAggregator, to handle logs and other files that can change over time.

(3) By the way, could you tell me what MF means?

Metric family. Interface, CPU, Memory, etc. are all metric families we collect metrics for.
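
To make the sizing guidance in answer (2) concrete, here is a minimal Python sketch of a free-space check. It assumes "2G" means 2 GiB and that you pass in the path of your own spool directory; the function names are hypothetical and are not part of DX NetOps.

```python
import shutil

GIB = 1024 ** 3  # assumption: "2G" in the guidance is read as 2 GiB


def recommended_spool_bytes(metric_family_count, gib_per_mf=2):
    """Suggested minimum spool size: about 2 GiB per polled metric family."""
    return metric_family_count * gib_per_mf * GIB


def has_enough_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has enough free bytes."""
    return shutil.disk_usage(path).free >= required_bytes
```

For example, with 12 polled metric families, `recommended_spool_bytes(12)` gives the 24 GiB target, and `has_enough_free_space(spool_dir, recommended_spool_bytes(12))` tells you whether the filesystem behind `spool_dir` currently meets it.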

Broadcom Employee Jeffrey Pinard

Normally, we allow up to 400MB per performance-spool file before we cut it off and load it into the DB.

If the metric family doesn't use 400MB before the poll period is done, we cut off the file and load it into the DB.

So a file can be anywhere from 0-400MB, depending on how fast data comes in and when the 5-minute poll period hits.

So 10MB is nothing when it comes to disk usage.
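
The cut-off rule described above can be sketched roughly as follows. This only illustrates the two conditions (size limit reached, or poll period over) and is not the actual DA implementation; the constant and function names are hypothetical, and 400MB is taken as 400 MiB.

```python
MAX_SPOOL_FILE_BYTES = 400 * 1024 ** 2  # "up to 400MB" per spool file (assumed MiB)
POLL_PERIOD_SECONDS = 5 * 60            # the 5-minute poll period


def should_cut_off(file_size_bytes, seconds_into_poll_period):
    """A file is closed and queued for DB load when either limit is hit."""
    size_limit_hit = file_size_bytes >= MAX_SPOOL_FILE_BYTES
    period_over = seconds_into_poll_period >= POLL_PERIOD_SECONDS
    return size_limit_hit or period_over
```

Under this reading, a 10MB file two minutes into the period is simply still being written, which is why small and fluctuating sizes are expected.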

As for the 90% heap, you really want to look at the application pause/throughput graphs on the DA health dashboard. If the pause time is under 1-3 secs every 5 mins, you are fine. If it is more like 5-6+ seconds, you may want to either lower your poll item count or add more memory to the DA process.
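
As a rough illustration of that rule of thumb, the pause time read off the dashboard could be classified like this. The exact cut-offs (3 and 5 seconds) and the middle "watch" band are my interpretation of the ranges above, not official thresholds, and the function name is hypothetical.

```python
def assess_pause(pause_seconds_per_5min):
    """Classify application pause time observed per 5-minute window."""
    if pause_seconds_per_5min <= 3:
        return "ok"     # within the 1-3 second range: fine
    if pause_seconds_per_5min >= 5:
        return "act"    # 5-6+ seconds: lower poll item count or add DA memory
    return "watch"      # in between: assumption, keep an eye on the trend
```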

MARUBUN SUPPORT

Hi Jeffrey-san,

Thank you for your detailed answer. My customer understood it.
They have asked me a follow-up question related to your answer.
I apologize for bothering you during your busy schedule, but I would appreciate an answer or advice.

[Question]
The system they operate collects traffic from approximately 4,000 machines.
During system operation, they sometimes restart the dadaemon service.
They are concerned that when the service is restarted, the following problems may occur.
   - Polling data may be lost.
   - The dadaemon service may go down and become unable to start.

For this reason, they check the free space in "performance-spool" before restarting the dadaemon service.
Please let us know if there are any conditions for checking "performance-spool", such as a condition that the free space must be 300MB or more.
Also, please let us know if there are any other points that we should check.


Best Regards,
Marubun Corporation

Broadcom Employee Jeffrey Pinard

When the DA is being restarted, as long as the DCs are up, they continue to poll the 4000 devices. When the DA is fully up and the DCs reconnect, each DC sends the oldest data it has in its cache first and works forward until it is current. It'll take some time to catch up with all poll data. Check the DCs page to see whether a DC is in the Catching Up or Connected (all caught up) state.

The DC cache is based on 1/2 of the memory allocated to the DC process. How much data the DC can cache without poll gaps depends on that amount and on how long the DA is down. The cache can be increased via a config file if you need to cache for longer.
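
A back-of-the-envelope estimate of how long a DC could buffer data might look like the sketch below. It assumes the cache is exactly half of the DC process memory and that each poll cycle produces a known, constant amount of cached data; that per-cycle figure is a hypothetical input you would have to measure in your own environment.

```python
def cache_coverage_minutes(dc_memory_bytes, bytes_per_poll_cycle,
                           poll_interval_seconds=300):
    """Estimate how many minutes of polling fit in the DC cache."""
    if bytes_per_poll_cycle <= 0:
        raise ValueError("bytes_per_poll_cycle must be positive")
    cache_bytes = dc_memory_bytes / 2            # cache is 1/2 of DC memory
    cycles = cache_bytes / bytes_per_poll_cycle  # poll cycles that fit
    return cycles * poll_interval_seconds / 60
```

If the DA is expected to be down for longer than this estimate, that is the case where poll gaps become possible and increasing the cache is worth considering.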

Yes, depending on how fast the DCs can send the cached data, the .dto files can reach 400 MB quickly and then be marked for loading into the DB.
Once loaded into the DB, the .dto file is deleted.

performance-spool should be at least 2-3G per MF being polled, IMO, so it can handle these catch-up periods, depending on how long the DA/DC are not connected.

MARUBUN SUPPORT

Hi Jeffrey-san,

Thank you very much for your detailed answer.
I would like to confirm a few points regarding your answer.
I apologize for bothering you during your busy schedule, but I would be very grateful if you could answer my question.

1. About performance-spool file size

From your answer, the customer thinks that it is not possible to judge the possibility of data loss from the free space in the performance-spool.
Is this idea correct?

If there is a guideline for determining the performance-spool capacity, please let me know.


2. "performance-spool should be at least 2-3G per MF"

At the end of your answer, I received the following advice.
   > performance-spool should be at least 2-3G per MF being polled, IMO.  
   > SO it can handle these catch up periods depending how long DA/DC are not connected.

(1) Based on this advice, does it mean that 400MB of performance-spool capacity is insufficient, and do they need to ensure that the performance-spool capacity is around 2 to 3 GB?

(2) If they need to increase the spool capacity to 2-3 GB, please tell me how to increase the spool capacity.

(3) By the way, could you tell me what MF means?


Best Regards,
Toshiyuki Hayakawa
Marubun Corporation