DX Operational Intelligence

  • 1.  DX OI On-Premise: Why Alarms aren't propagating to Service

    Posted Sep 02, 2021 10:59 AM
    Hi All,

     I configured a Service on DX OI On-Premise, but the alarms are not propagating to the Service.

     Here is an example:

    Service Overview: the Risk is Severe, but the Alarms(+ subs) column shows "0" alarms:


    Topology for the Service, showing the components that impact it:

    The alarms on the component:


    In the Alarm List there are no alarms:


    So why aren't the alarms propagating to the Service? Do I need to configure something else?

    ------------------------------
    Regards,
    Alessandro
    ------------------------------


  • 2.  RE: DX OI On-Premise: Why Alarms aren't propagating to Service

    Broadcom Employee
    Posted Sep 06, 2021 01:52 AM
    Hi Alessandro

    I would check whether all microservices/pods are running properly on your install. I would also check the logs of the service correlation pods: analyticsjobs or soacorrelationengine.
    Check these steps too:
    https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/digital-operational-intelligence/20-2/troubleshooting-consolidated/troubleshoot-service-analytics.html#concept.dita_429def2b2674edc32e557a6d7ed46e7fd2b35017_IWantToIncreasetheMemorySizeofServiceAnalyticComponents

    I would also recommend opening a Broadcom support ticket for further troubleshooting if the steps above do not give a clear indication of the issue.

    Cheers
    Nestor
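
The two checks suggested above (pod health, then correlation pod logs) could be sketched roughly as follows. This is only an illustrative helper set, not an official Broadcom procedure: the function names are made up, the namespace dxi is taken from the commands shown later in this thread, and the actual pod names on a given install may differ from analyticsjobs and soacorrelationengine.

```shell
# Namespace is an assumption (it is the one used later in this thread).
NS="${NS:-dxi}"

# Print name and status of every pod that is neither Running nor Completed.
unhealthy_pods() {
  kubectl get pods -n "$NS" --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Print the names of pods whose name contains the given text (case-insensitive).
find_pods() {
  kubectl get pods -n "$NS" --no-headers | awk '{ print $1 }' | grep -i -- "$1"
}

# Print the last 200 log lines of every pod whose name matches the given text.
tail_logs() {
  for pod in $(find_pods "$1"); do
    echo "== $pod =="
    kubectl logs -n "$NS" "$pod" --all-containers=true --tail=200
  done
}
```

For example, running unhealthy_pods first and then tail_logs soacorrelation would show any pods in a bad state and the recent log lines of matching pods. If find_pods prints nothing, no pod with that name fragment exists in the namespace.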



  • 3.  RE: DX OI On-Premise: Why Alarms aren't propagating to Service

    Posted Sep 08, 2021 09:47 AM

    Hi Nestor,


    Well, I think this will be a long thread; I hope you have time. :)


    "I would check if all microservices/pods are running properly on your install."


     When I looked at Cluster Management > Services, some services look like this. I find it strange because no status is shown:


    In Kubernetes, the pods are running:


    But the state shown inside the pod is Terminated:



    ------------------------------
    Regards,
    Alessandro
    ------------------------------



  • 4.  RE: DX OI On-Premise: Why Alarms aren't propagating to Service

    Broadcom Employee
    Posted Sep 08, 2021 09:59 AM
    Hi Alessandro

    The Terminated status inside the pod is normal. As you can see, it means that one of the Init containers (required only during start-up) terminated with exit code 0 -- all good.
    Since all microservices appear to be running, I would instead check the logs of the pods I mentioned.
    And, in parallel, open a support issue so we can track the investigation.

    Thanks
    Nestor
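
To confirm from the command line that a Terminated state belongs to an init container that finished cleanly, something like the sketch below may help. The function names are hypothetical and the default namespace dxi is an assumption taken from this thread; the field paths are the standard Kubernetes pod status fields.

```shell
# Print one line per init container: name, termination reason, exit code
# (tab-separated). Usage: init_container_status <pod-name> [namespace]
init_container_status() {
  kubectl get pod "$1" -n "${2:-dxi}" \
    -o jsonpath='{range .status.initContainerStatuses[*]}{.name}{"\t"}{.state.terminated.reason}{"\t"}{.state.terminated.exitCode}{"\n"}{end}'
}

# Read that output on stdin and print only init containers that
# terminated with a non-zero exit code.
failed_init_containers() {
  awk -F '\t' '$3 != "" && $3 != "0" { print $1 " (exit " $3 ")" }'
}
```

Piping init_container_status for a pod into failed_init_containers prints nothing when every init container exited with code 0, which is the normal case described above.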


  • 5.  RE: DX OI On-Premise: Why Alarms aren't propagating to Service

    Posted Sep 10, 2021 04:21 PM
    Hi Nestor,

     Thank you for the reply.

     Regarding "I would also check the logs from the service correlation pods: analyticsjobs, or soacorrelationengine":

     I could not find those pods:

    [root@master-node ~]# kubectl get pods -ndxi | grep analytics
    [root@master-node ~]# kubectl get pods -ndxi | grep soacorrelation

    Are they named analyticsjobs and soacorrelationengine?

    In parallel, I also opened a case with the same title as this thread.


    ------------------------------
    Regards,
    Alessandro
    ------------------------------



  • 6.  RE: DX OI On-Premise: Why Alarms aren't propagating to Service

    Broadcom Employee
    Posted Sep 13, 2021 03:53 AM
    Hi Alessandro

    If you cannot see those pods, it could mean they did not start for some reason. Can you check the logs of the associated deployments to see why they were not started (e.g. lack of memory, OSE node issues...)?
    Those pods are responsible for alarm correlation/propagation, so their absence would explain the issue you are seeing.
    In any case, with the ticket open, I am sure you will solve this quickly.
    Cheers
    Nestor
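
One possible way to look at the deployments behind the missing pods is sketched below. It is an assumption-laden helper, not a documented DX OI procedure: the function names are invented, the namespace dxi comes from the commands earlier in this thread, and the deployment names on a given install have to be read from the output rather than guessed.

```shell
# Namespace is an assumption (it is the one used earlier in this thread).
NS="${NS:-dxi}"

# List deployments that have fewer ready replicas than desired, or that are
# scaled to zero (either case means no usable pods for that deployment).
unready_deployments() {
  kubectl get deployments -n "$NS" --no-headers |
    awk '{ split($2, r, "/"); if (r[2] + 0 == 0 || r[1] + 0 < r[2] + 0) print $1, $2 }'
}

# Show the deployment's conditions and any recent events that mention it,
# which is where scheduling or memory problems usually surface.
explain_deployment() {
  kubectl describe deployment "$1" -n "$NS"
  kubectl get events -n "$NS" --sort-by=.lastTimestamp | grep -i -- "$1"
}
```

Running unready_deployments and then explain_deployment on each name it prints should show whether the correlation deployments exist at all and, if they do, why their pods were not created.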