Symantec Access Management

  • 1.  Error on the logs

    Posted Sep 15, 2022 07:36 AM
    Hi Experts,

    Users have been reporting a white screen issue, and we noticed the error below in their Apache logs:

    [Wed Sep 07 18:16:01.526686 2022] [mpm_worker:error] [pid 57663:tid 140594197165888] AH00286: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting

    As I understand it, high user traffic may have exceeded what the server can handle, which caused the error above to be logged. Should I increase the MaxRequestWorkers/MaxClients value to prevent this from happening again, or could something else be causing this error? Below are the mpm_worker settings on the user's end.

    <IfModule mpm_worker_module>

    ServerLimit 64
    StartServers 4
    MaxClients 1024
    MinSpareThreads 48
    MaxSpareThreads 160
    ThreadsPerChild 16
    MaxRequestsPerChild 10000

    </IfModule>

    Besides that, the errors below were logged in the agent log as well:

    [60895/2296358656][Wed Sep 07 2022 18:03:53][CSmLowLevelAgent.cpp:2892][ERROR][sm-AgentFramework-00520] LLA: SiteMinder Agent Api function failed - 'Sm_AgentApi_AuthorizeEx' returned '-1'.
    [60895/2296358656][Wed Sep 07 2022 18:03:53][CSmAuthorizationManager.cpp:166][ERROR][sm-AgentFramework-00420] HLA: Component reported fatal error: 'Low Level Agent'.
    [60895/2296358656][Wed Sep 07 2022 18:03:53][CSmHighLevelAgent.cpp:831][ERROR][sm-AgentFramework-00420] HLA: Component reported fatal error: 'Authorization Manager'.

    I would appreciate any advice on this.

    Thank you,
    Atifah


  • 2.  RE: Error on the logs

    Posted Sep 16, 2022 03:19 AM
    Hi,

    Does anyone have any advice on this?

    Thank you,
    Atifah


  • 3.  RE: Error on the logs

    Broadcom Employee
    Posted Sep 19, 2022 03:14 PM
    Edited by Brian Dyson Sep 19, 2022 03:16 PM

    Atifah,

    While it is certainly possible to raise the MaxRequestWorkers value, the error message in the Apache HTTPD error_log is normally only a symptom of a deeper problem. Increasing MaxRequestWorkers may not address the root cause.
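
    For context on the sizing itself: under the worker MPM, the number of requests Apache can serve simultaneously is capped at ServerLimit × ThreadsPerChild, and MaxRequestWorkers (named MaxClients before Apache 2.4) cannot exceed that product. A minimal sketch of the arithmetic using the values posted above; the function name is hypothetical and only illustrates the calculation:

```python
def worker_mpm_ceiling(server_limit: int, threads_per_child: int) -> int:
    """Upper bound on MaxRequestWorkers for the worker MPM."""
    return server_limit * threads_per_child

# Values from the posted <IfModule mpm_worker_module> block.
server_limit = 64
threads_per_child = 16
max_clients = 1024  # MaxClients is the pre-2.4 name for MaxRequestWorkers

ceiling = worker_mpm_ceiling(server_limit, threads_per_child)
headroom = ceiling - max_clients
print(f"ceiling={ceiling}, configured={max_clients}, headroom={headroom}")
```

    With these values the configured limit already equals the ceiling, so raising MaxRequestWorkers on its own would have no effect. ServerLimit or ThreadsPerChild would have to be raised along with it, at the cost of additional memory per thread.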

    It is more likely that the problem is high latency happening on other systems. These other systems are usually external LDAP directories like Active Directory or external databases that the SiteMinder Policy Server relies on. When SiteMinder components like the Access Gateway, Web Agents, and the Policy Servers experience high latencies, these are often due to other backend systems and not due to the SiteMinder components themselves.

    It is best to follow a "facts and evidence" approach to the problem and follow the observed latencies across the systems by analyzing the trace logs.

    For example, if using the SiteMinder Access Gateway to forward/proxy requests after protecting them with SiteMinder, then analyze the Access Gateway agent trace log (configure one in the Agent Configuration Object (ACO)). Check the transactions to see where most of the time is spent - communicating between the Access Gateway and the SiteMinder Policy Server (normally ~1-10ms) or between the Access Gateway and the proxy/forward endpoint.

    If it seems that the communications between the Access Gateway/Web Agent and the Policy Server are taking a long time, then compare and correlate the transaction with the Policy Server trace log. Identify the portions of the Policy Server trace log for the transaction where a long time is spent waiting for responses from external resources like databases and/or LDAP directories. Then focus on those end systems to understand why they are performing poorly.

    Following the evidence from this analysis should point to the system where the root cause lies, which is often one of these other systems.

    For the errors observed in the agent log, correlate the timestamps to the SiteMinder Policy Server trace log to try and understand the root cause in more detail.

    In my experience, the problems are usually on some external backend system that performs poorly and the SiteMinder components are victims too, experiencing the same high-latency symptoms as the users and their browsers.
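
    One way to see why backend latency alone can exhaust the worker pool is Little's law: the average number of requests in flight equals the arrival rate multiplied by the average time each request takes. A rough sketch with made-up numbers (the request rate and the latencies below are assumptions for illustration, not measurements from this environment):

```python
def busy_workers(requests_per_second: int, latency_ms: int) -> float:
    """Little's law: average in-flight requests = arrival rate x time in system."""
    return requests_per_second * latency_ms / 1000

MAX_REQUEST_WORKERS = 1024  # from the posted configuration

# Hypothetical load: the same traffic at different end-to-end latencies.
for latency_ms in (50, 1000, 6000):
    needed = busy_workers(200, latency_ms)
    status = "exhausted" if needed > MAX_REQUEST_WORKERS else "ok"
    print(f"{latency_ms:>5} ms -> {needed:.0f} busy workers ({status})")
```

    Under these assumed numbers the traffic never changes, yet the pool runs out once each request takes several seconds, which is consistent with the AH00286 message being a symptom rather than the cause.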

    Some specific suggestions:

    • Periodic Statistics Logging: Enable the policy server to periodically log statistics information to the smps.log file.
      This can be configured to occur every 60 seconds for relatively fine detail on server statistics. Monitor the queue depths (normal-priority queue for request handling, high-priority queue for agent-specific communication), busy threads, and total number of agent connections.
    • Review SMPS Log: Review the smps.log file for warnings, errors, and "Execution time exceeded" informational messages.
    • Log HTTP Duration in Access Log: Modify the HTTPD access log to record the duration to handle the request. For Apache HTTPD this is the "%D" LogFormat option to add the request duration in microseconds to the request log. Do this on front-end systems, including the Access Gateway Apache HTTPD component, and any backend systems that the Access Gateway proxies to.
    • Transaction ID in Backend Access Log: If using SiteMinder Access Gateway, add the SM_TRANSACTIONID header variable to the access log to support correlating SiteMinder Access Gateway, SiteMinder Policy Server, and backend application logs.
    • Bisect the Problem: Apply a bisection technique to split the latency into two halves, then focus on whichever half takes longer. Continue applying the technique until the root cause is fully understood. Apply a "Scientific Method" approach as well: form a hypothesis/conjecture of what the problem might be, examine experimental results from past trace logs or from measurable experiments executed in a controlled way, and identify evidence that either confirms or refutes the conjecture.
    • Monitor Hardware and OS: Use tools like the sysstat set of utilities to monitor the hardware and OS, including processor utilization and its components (user, system, io wait, idle, etc.), disk metrics like read/write latency, and network metrics like throughput. Also consider virtualization effects when running on virtual machines that are over-provisioned on the hypervisor and compete for limited resources.
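
    As a rough illustration of the access-log suggestion, the sketch below assumes the common log format with "%D" appended as the last field (for example, LogFormat "%h %l %u %t \"%r\" %>s %b %D") and lists the slowest requests. The sample lines, the threshold, and the function name are hypothetical; the parsing would need to be adjusted to whatever LogFormat is actually in use.

```python
import re
from typing import Iterable, List, Tuple

# Matches: host ident user [time] "request" status bytes duration_us
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ (?P<duration_us>\d+)$'
)

def slow_requests(lines: Iterable[str], threshold_ms: int) -> List[Tuple[int, str, str]]:
    """Return (duration_ms, time, request) for requests slower than threshold_ms, slowest first."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        duration_ms = int(match["duration_us"]) // 1000  # %D is in microseconds
        if duration_ms > threshold_ms:
            hits.append((duration_ms, match["time"], match["request"]))
    return sorted(hits, reverse=True)

# Made-up sample lines for illustration only.
sample = [
    '10.0.0.5 - - [07/Sep/2022:18:15:58 +0000] "GET /app/home HTTP/1.1" 200 5120 48211',
    '10.0.0.6 - - [07/Sep/2022:18:15:59 +0000] "GET /app/report HTTP/1.1" 200 9034 6230114',
    '10.0.0.7 - - [07/Sep/2022:18:16:00 +0000] "POST /app/login HTTP/1.1" 302 - 2150330',
]

for duration_ms, when, request in slow_requests(sample, threshold_ms=1000):
    print(f"{duration_ms} ms  {when}  {request}")
```

    Running the same kind of report against the front-end and the backend access logs for the same time window gives a first bisection: if the backend is fast while the front end is slow, the time is being spent in between.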

    Hope this helps.

    Brian Dyson
    IAM Solution Engineering




  • 4.  RE: Error on the logs

    Posted Sep 19, 2022 10:17 PM
    Hi Brian,

    Thank you for your advice and suggestions. I really appreciate it.

    Regards,
    Atifah