I am working with a client using the SSO WebAgent R12, and log rotation is interrupting the Oracle HTTP Server (OHS): OHS does not serve responses until the WebAgent has finished rotating its logs.
This was happening frequently but seems to have settled to once per hour, even though the log size limit has not been exceeded. It applies to both the standard logs and the trace logs, which are enabled.
My question is whether it is normal for WebAgent log rotation to interrupt OHS completely for the 10-15s it takes to restart the WebAgent process, and whether anyone else has encountered this.
WebLogic Version: 22.214.171.124.0
OHS Version: 10.3.6.0.160419
SSO WebAgent: R12.0QMR3cr12_912c-33975
If anyone has experience of anything similar, or any tips or pointers, it'd be very much appreciated.
It's not clear whether you are using a script to rotate the logs or the standard web agent parameters (either local or ACO).
I have found that using web agent parameters works best . . .
Within either the Agent Configuration Object (Policy Server) or LocalAgent.conf (Web Server)
LogFileSize=xx < sets the maximum log file size before rolling (e.g. 50 = 50 MB)
LogFilesToKeep=xx < maximum number of log files to keep before truncating (e.g. 10 = 10 + 1 current)
TraceFileSize=xx < sets the maximum trace file size before rolling (e.g. 50 = 50 MB)
TraceFilesToKeep=xx < maximum number of trace files to keep before truncating (e.g. 10 = 10 + 1 current)
NOTE: With the example values above (50 MB files, 10 kept + 1 current), each log type will consume just under 550 MB of disk at most.
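The disk usage arithmetic behind the note can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and it assumes the agent keeps the configured number of rolled files plus the one currently being written, as the "10 + 1" remark above suggests.

```python
def max_log_disk_mb(file_size_mb: int, files_to_keep: int) -> int:
    """Upper bound in MB: rolled files kept plus the current file.

    Hypothetical helper; assumes "files to keep" excludes the active file.
    """
    return file_size_mb * (files_to_keep + 1)


# Example values from the parameters above: 50 MB files, 10 kept
agent_logs_mb = max_log_disk_mb(50, 10)  # LogFileSize / LogFilesToKeep
trace_logs_mb = max_log_disk_mb(50, 10)  # TraceFileSize / TraceFilesToKeep
print(agent_logs_mb, trace_logs_mb, agent_logs_mb + trace_logs_mb)
# prints: 550 550 1100
```

So with both agent logs and trace logs enabled at these values, it is worth budgeting for roughly 1.1 GB in total rather than 550 MB.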
It may be worth noting that Agent Log files are normally not that large, so a 10 MB file size may be adequate. Trace files, however, contain useful troubleshooting data, and keeping them at 50 MB makes diagnosing their content easier. You may need to set the number of trace files to keep at 20-30 (1G-1.5G disk usage); I find 25 trace files at 50 MB can cover 5-7 days' worth of data. You may need to check your own log volume to determine what works best in your case. Also, you can always use a script to copy the oldest (2-4) logs (name-date-time.log) to an archive folder.
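As a rough idea of what such an archive script could look like, here is a minimal Python sketch. The function name, the "*.log" pattern and the choice to skip the newest file (on the assumption that it is the one the agent is still writing to) are all my own assumptions, so adjust them to your naming scheme.

```python
import shutil
from pathlib import Path


def archive_oldest_logs(log_dir, archive_dir, count=2, pattern="*.log"):
    """Copy the `count` oldest rolled logs from log_dir into archive_dir.

    Hypothetical helper: the newest matching file is treated as the
    active log and is never copied.
    """
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)

    # Sort matching files by modification time, oldest first
    candidates = sorted(
        (p for p in Path(log_dir).glob(pattern) if p.is_file()),
        key=lambda p: p.stat().st_mtime,
    )
    rolled = candidates[:-1]  # drop the newest (assumed active) file

    copied = []
    for src in rolled[:count]:
        dest = archive / src.name
        if not dest.exists():        # do not overwrite an earlier copy
            shutil.copy2(src, dest)  # copy2 preserves timestamps
        copied.append(dest)
    return copied
```

You would call it with your own paths, for example archive_oldest_logs("/path/to/agent/logs", "/path/to/archive", count=3), from cron or a scheduled task. It copies rather than moves, so the agent's own LogFilesToKeep / TraceFilesToKeep handling still removes the originals.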
More details can be found here <List of Agent Configuration Parameters - CA Single Sign-On - 12.52 SP1 - CA Technologies Documentation>
NOTE: You may want to consider upgrading to a newer version of SiteMinder < not meant to be a sales pitch.
Kirk (Leslie Kuykendall)
First of all, it's NOT normal for the web agent to take 10-15 seconds to rotate a log. That's huge.
Have you tried investigating why there is so much delay in rotating the logs?
This will probably need some OS-level tracing to identify where the delay is, such as an strace log on Unix.
Best to open a support case.
Hi Ujwol & Leslie,
Thank you both for the suggestions. I had a support ticket open in parallel but thought I'd try the community too.
I do have an update to share and would like to write up some notes in case anyone else runs into the issue. The short version for now is that the OHS processes were being monitored by OPMN, which was restarting OHS because its health check pings were not being answered.
This was a side effect of the client's customized registration script picking up the localhost entry for the health pings in httpd.conf and refusing to register. We had disabled the entry without properly taking into account the ramifications for OPMN.
SiteMinder was dutifully responding to the shutdown requests and rotating its logs on shutdown, so the rotation was a symptom rather than the cause and the agent wasn't responsible. We did encounter occasional locking with LLAWP processes, but we think that came from turning the logging levels up for debugging.
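For anyone wanting to check the same thing on their own box, a quick way to see whether OHS is answering requests on localhost is a small probe like the one below. This is only a sketch in Python: it is not how OPMN itself pings OHS, the function name is made up, and the URL you pass in (host, port, path) is whatever your own httpd.conf actually listens on.

```python
import http.client
import time
import urllib.error
import urllib.request

# Bypass any proxy settings from the environment; the probe is meant
# to hit the local web server directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=5.0):
    """Return (responded, seconds_taken) for a single GET against url.

    Any HTTP status counts as "responded"; refused connections and
    timeouts count as "not responded".
    """
    start = time.monotonic()
    try:
        with _opener.open(url, timeout=timeout):
            responded = True
    except urllib.error.HTTPError:
        responded = True   # server answered, just not with a 2xx status
    except (OSError, http.client.HTTPException):
        responded = False  # refused, timed out, or broken response
    return responded, time.monotonic() - start
```

Running probe() in a loop against the localhost listener while watching the agent logs would show whether the gaps in responses line up with the rotations, or whether something else (in our case OPMN) is restarting the server first.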