CA Service Management


SDM 17.1.0.1 webengine crash config sdm1 and sdm2

  • 1.  SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Posted 09-10-2018 03:59 PM

    Hello community!
    Is anyone else experiencing this problem with SDM 17.1.0.1?
    We have a primary and secondary server configuration on Windows Server 2016.
    The secondary runs 3 web engines, accessed via SSO.

    Intermittently and at random, one of the web engines crashes.

     

    stdlog on the primary:

    sdm-app1-prod  pdm_d_mgr     11416 ERROR   daemon_obj.c    1990 Daemon _web_eng_sdm_app2_prod2 died: restarting
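
    To see which web engine dies and how often, the stdlog can be scanned for this message. The following is a minimal sketch: the pattern is inferred only from the single line quoted above, and the function name and the second sample line are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the one stdlog line quoted in this post. Real stdlog
# lines may carry extra leading fields, so we search instead of matching
# from the start of the line.
DIED_RE = re.compile(
    r"\bERROR\s+daemon_obj\.c\s+\d+\s+Daemon (\S+) died: restarting"
)

def count_daemon_restarts(lines):
    """Return a Counter mapping daemon name -> number of 'died: restarting' entries."""
    counts = Counter()
    for line in lines:
        match = DIED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = [
    "sdm-app1-prod  pdm_d_mgr     11416 ERROR   daemon_obj.c    1990 "
    "Daemon _web_eng_sdm_app2_prod2 died: restarting",
    # Made-up filler line, only here to show that non-matching lines are ignored.
    "sdm-app1-prod  pdm_d_mgr     11416 SIGNIFICANT unrelated message",
]
restarts = count_daemon_restarts(sample)
for daemon, count in restarts.items():
    print(daemon, count)
```

    Any iterable of lines works, so an open stdlog file object can be passed in place of the sample list.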

    At the same time, the secondary logs an Application Error (event ID 1000):
    Faulting application name: webengine.exe, version: 17.1.0.178, time stamp: 0x5af99814
    Faulting module name: webengine.exe, version: 17.1.0.178, time stamp: 0x5af99814
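
    The two "Faulting" lines can be pulled apart to confirm that the faulting module is webengine.exe itself rather than a third-party DLL. This is a rough sketch: the function name is made up, and it assumes the "time stamp" value is the executable's link time in seconds since the Unix epoch, which is the usual convention for Windows binaries but is not confirmed for this build.

```python
import re
from datetime import datetime, timezone

FAULT_RE = re.compile(
    r"Faulting (application|module) name: (\S+), "
    r"version: ([\d.]+), time stamp: (0x[0-9a-fA-F]+)"
)

def parse_fault_lines(text):
    """Return {'application': {...}, 'module': {...}} parsed from event 1000 text."""
    result = {}
    for kind, name, version, stamp in FAULT_RE.findall(text):
        seconds = int(stamp, 16)
        result[kind] = {
            "name": name,
            "version": version,
            "timestamp": seconds,
            # Assumption: the stamp is seconds since the Unix epoch (link time).
            "built_utc": datetime.fromtimestamp(seconds, tz=timezone.utc),
        }
    return result

event_text = """\
Faulting application name: webengine.exe, version: 17.1.0.178, time stamp: 0x5af99814
Faulting module name: webengine.exe, version: 17.1.0.178, time stamp: 0x5af99814
"""
fault = parse_fault_lines(event_text)
# True when the crash happened inside the application's own binary.
same_binary = fault["application"]["name"] == fault["module"]["name"]
print(same_binary, fault["module"]["built_utc"].date())
```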

     

    Not much more info in stdlog....



  • 2.  Re: SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Broadcom Employee
    Posted 09-11-2018 12:11 AM

    GuillaumeM 

    It sounds like opening a CA Support case would be the best path forward, so that additional tracing and crash dump collection can be enabled on the WebEngine process.



  • 3.  Re: SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Broadcom Employee
    Posted 09-12-2018 04:17 AM

    To diagnose a crash of a CA SDM process, the engineering team needs certain information collected from the environment. The following knowledge articles describe what to collect:

     

    1. Using the Microsoft Debug Diag Tool to generate dump files for a crashing or hanging process (KB000019092)

    2. Working with CA Support To Troubleshoot a "Crashing" or "Hanging" CA Service Desk Manager Process (KB000020303)



  • 4.  Re: SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Posted 01-23-2019 10:27 AM

    Thanks to Raghu Rudraraju on this one... very deep.

    <<Engineering has finally identified the root cause behind this one. CAPKI5 (which we have been using since 17.0) has a restriction where the library libcaopenssl_crypto.dll must be loaded at a specific address to be FIPS compliant. This restriction comes from OpenSSL. On re-analysis of the dumps, they identified that the library was not being loaded at the specific address, and hence etpki_lib_init was failing. If a process obtains more than 250 MB of heap memory before calling etpki_lib_init, the call will fail. They were able to simulate the problem in-house by allocating 250 MB of memory during startup of these processes. We have also identified the solution for this problem: explicitly load the CAPKI DLLs when the process starts, so that the OpenSSL DLL always gets loaded at the required address before any memory is dynamically allocated. So they prepared a debug fix, T5U3504, which can be applied on 17.1. The fix was prepared on the latest code base but should not create any issues when applied on 17.1 too.>>
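
    The ordering issue described in the quote can be pictured with a toy model. This is only an illustration of "load first, allocate later": it is not how the Windows loader or CAPKI actually works, and the class, the method names, the base address, and the sizes are all made up (the base is set to 250 MB purely to mirror the figure in the quote).

```python
MB = 1024 * 1024

class ToyProcess:
    """Toy address space: the heap grows upward from address 0, and a library
    can be mapped only if its required address range is still unused."""

    def __init__(self):
        self.heap_top = 0   # highest address handed out to the heap so far
        self.loaded = {}    # library name -> (base address, length)

    def allocate(self, size):
        # Place the block at the heap top, skipping past any mapped library
        # that the block would otherwise overlap.
        start = self.heap_top
        for base, length in sorted(self.loaded.values()):
            if start < base + length and start + size > base:
                start = base + length
        self.heap_top = start + size
        return start

    def load_at(self, name, base, length):
        # Fails if the heap has already grown into the required range.
        if self.heap_top > base:
            return False
        self.loaded[name] = (base, length)
        return True

REQUIRED_BASE = 250 * MB   # hypothetical fixed address for the crypto library
LIB_SIZE = 4 * MB          # hypothetical library size

# Failing order: a large allocation happens before the library is loaded.
late = ToyProcess()
late.allocate(251 * MB)
late_ok = late.load_at("crypto", REQUIRED_BASE, LIB_SIZE)

# Fixed order: load the library at startup, then allocate freely.
early = ToyProcess()
early_ok = early.load_at("crypto", REQUIRED_BASE, LIB_SIZE)
block_start = early.allocate(251 * MB)

print(late_ok, early_ok)
```

    In the toy model the late load fails because the heap already covers the required address, while the early load succeeds and later allocations are simply placed around the library.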

     

    Not installed yet, but it sounds very hopeful...

     



  • 5.  Re: SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Posted 09-11-2018 09:06 AM

    Thanks

    I already have an open case for that.

    No answer yet.

    But as we say: Together we all know..



  • 6.  Re: SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Broadcom Employee
    Posted 09-12-2018 11:08 AM

    I am curious about what caused the webengine to crash.

    Maybe a memory leak that makes webengine.exe grow too large, or maybe some incompatible DLLs.

    Would you mind sharing the case number? Thanks _Chi



  • 7.  Re: SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Posted 09-12-2018 02:38 PM

    01187587 and 01097264



  • 8.  Re: SDM 17.1.0.1 webengine crash config sdm1 and sdm2

    Broadcom Employee
    Posted 09-12-2018 03:02 PM

    Thanks Pier-Olivier.

    The error seems to point to "ERROR encrypt.cpp 520 etpki_lib_init return -1".

    Let's see what comes out of the case. I don't fully understand why this CAPKI error pops up all of a sudden in the middle of webengine operation. It would be easier to understand if it showed up during an upgrade or configure process.