Symantec Access Management

Apache Servers Being Marked Down by F5 VIP

  • 1.  Apache Servers Being Marked Down by F5 VIP

    Posted 03-30-2018 08:37 AM

    Team,

     

    We have three Apache servers behind an F5 VIP (load balancer), connecting to two Policy Servers. The F5 poll interval is set to 17 seconds, and the PS Poll Interval in the ACO is set to 30 seconds. The F5 health check allows three attempts, so 52 seconds elapse before a member is marked down in the pool. As part of the monthly security patching of our Linux servers, we reboot the Policy Servers once a month, one at a time, 30 minutes apart.

     

    The F5 uses a protected heartbeat and waits for an HTTP 200 response before marking a member as up. During the Policy Server reboot, the F5 marks one of the Apache servers down because it receives an HTTP 500 (Internal Server Error), even though the other Policy Server is up and running.

     

    Can you please let us know if there is anything we can do in SiteMinder to avoid this?

     

    Thanks,

    Avi
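
    The health-check timing described in this post can be sketched as a small simulation. This is only a simplified model, not the F5's actual implementation: it assumes the timeout is attempts × interval + 1 second (which matches the 17-second interval and 52-second figure above), that the member was healthy at t = 0, and that only an HTTP 200 counts as a good response. The function names are hypothetical.

```python
from typing import Optional, Sequence


def monitor_timeout(interval_s: int, attempts: int = 3) -> int:
    """Assumed rule: timeout = attempts * interval + 1 second."""
    return attempts * interval_s + 1


def marked_down_at(
    statuses: Sequence[int], interval_s: int = 17, timeout_s: int = 52
) -> Optional[int]:
    """Return the second at which the member would be marked down, or None.

    statuses[i] is the HTTP status seen by the probe sent at (i + 1) * interval_s.
    The member is assumed healthy at t = 0 (simplification).
    """
    last_ok = 0
    for i, status in enumerate(statuses, start=1):
        t = i * interval_s
        if t - last_ok > timeout_s:
            # The timeout expired before this probe was sent.
            return last_ok + timeout_s
        if status == 200:
            last_ok = t
    return None


print(monitor_timeout(17))                        # 52
print(marked_down_at([200, 500, 500, 500, 200]))  # 69
print(marked_down_at([200, 500, 500, 200, 200]))  # None
```

    Under these assumptions, a member is marked down only if every probe for about 52 seconds returns something other than 200, so the agent would have to be answering with 500 for roughly that long during the Policy Server reboot.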



  • 2.  Re: Apache Servers Being Marked Down by F5 VIP

    Posted 03-30-2018 11:36 AM

    Avi, you can start with the Apache web server error log: examine it, then map the time of the error to the Apache agent log/trace and the Policy Server log/trace to see whether any more specific clues can be found. A 500 error is very general.

    Thanks & Rgds, - Vijay



  • 3.  Re: Apache Servers Being Marked Down by F5 VIP

    Posted 03-30-2018 01:02 PM

    Team,

     

    The logs have rolled over, and when we did look at them they did not tell us much. One thing to note: the LLAWP process ID did not change, and there was no indication in the logs of LLAWP restarting.

     

    Thanks,

    Avi



  • 4.  Re: Apache Servers Being Marked Down by F5 VIP

    Posted 03-30-2018 12:52 PM

    Avi,

     

    What is the version of the Web Agent?

     

     

    Are the Policy Servers in LB or failover mode (HCO setting)?

     

    Is this happening only when restarting the second PS (as listed in the HCO)?



  • 5.  Re: Apache Servers Being Marked Down by F5 VIP

    Posted 03-30-2018 01:00 PM

    Team,

     

    Please find the answers below:

     

    What is the version of the Web Agent? - R 12.5 CR 4

     

     

    Are the Policy Servers in LB or failover mode (HCO setting)? - They are in failover mode.

     

    Is this happening only when restarting the second PS (as listed in the HCO)? - This also happens when we restart the first server.

     

    Thanks,

    Avi



  • 6.  Re: Apache Servers Being Marked Down by F5 VIP

    Posted 03-30-2018 01:23 PM

    It sounds to me as if your PS failover is not happening. Was this failover setup tested earlier in this environment?

     

    Are you using the right HCO in the SMHost.conf file, one that lists both Policy Servers?

     

    By any chance, did you recently add the second PS to the HCO without restarting the Web Agents after the change?

     

    If you have verified all the basic configuration, everything looks good, and you still face the issue, I would suggest opening a support ticket, as there could be multiple reasons for this failure. I found the defect below, which may be related to your issue, and wanted to highlight it here.

     

    Defects Fixed in 12.52 SP1 CR06 - CA Single Sign-On - 12.52 SP1 - CA Technologies Documentation 

     

    00216581  DE143166

    Web Agent is not failing back to the first Policy Server and requests are not processed successfully when starting the first Policy Server.



  • 7.  Re: Apache Servers Being Marked Down by F5 VIP

    Posted 03-30-2018 01:38 PM

    Thanks a lot, team. I have also opened a support ticket. This is the failover setup we have:

    The Failover Threshold Percent is set to 0%, and it is the same in the lower environment. Should I go ahead and change this? I think this has been happening in the past as well, but we did not have monitoring in place at that time. Last month we set up monitoring for this alert, which I think is why it is being noticed now.

     

    Cluster: Policy Server 01,44443;Policy Server 02,44443
    Failover Threshold: 1

     

    I will also go over the defect list. Thanks a lot.

     

    Thanks,

    Avi
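
    One basic check on the setup in this post is to confirm that every Policy Server in the Cluster value is reachable from each Apache host. The sketch below assumes the value is a semicolon-separated list of "host,port[,port...]" entries, as displayed above. The helper names parse_cluster and port_open are hypothetical and not part of any SiteMinder tooling, and "Policy Server 01" is a display name that would need to be replaced with a real hostname or IP before testing connectivity.

```python
import socket


def parse_cluster(value: str) -> list[tuple[str, list[int]]]:
    """Split 'host,port[,port...];host,port...' into (host, [ports]) pairs."""
    servers = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        host, *ports = [part.strip() for part in entry.split(",")]
        servers.append((host, [int(p) for p in ports]))
    return servers


def port_open(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


cluster = parse_cluster("Policy Server 01,44443;Policy Server 02,44443")
print(cluster)
```

    Running port_open(host, port) for each parsed entry from an Apache host, both before and during a Policy Server reboot, would show whether the surviving server is reachable at the moment the 500 errors appear. A successful TCP connection only shows the port is open; it does not prove the agent has failed over.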