Over the last two or three years, the number of entries in our Production OPSLOG has increased to the point that on some days I'm unable to capture a full 24 hours for archival. I would like to know if there are any published solutions that assist with moving OPSLOG to a larger-capacity DIV data set.
After briefly reviewing the last tickets we worked together, I believe you are currently running CA OPS/MVS release 12.2.
If that is correct, visit OPSVIEW option 4.13 and let us know the BrowseMax value set for the live OPSLOG DIV you are currently using to record events.
Next, follow this link:
DASD Requirements for OPSLOG Messages - CA OPS/MVS® Event Management and Automation - 12.2 - CA Technologies Documentati…
Check whether the BrowseMax value that was set for your OPSLOG DIV file when it was first used corresponds with the space allocation documented in the link above. If it does, please let us know what those two values are.
We will evaluate this information first in preparation for your OPSLOG DIV size increase.
You are correct, we are currently running r12.2 of OPS/MVS. The BrowseMax value is 800000. Total Cylinders allocated for OPSLOG is 568.
As you can see in the chart, you can go higher than our installation defaults.
Consider your current event traffic and what your target should be. Then determine the BrowseMax value you should move to and the 3390 cylinder allocation you will need to request from your DASD admins. Then allocate a new OPSLOG DIV using the sample REXX program in the CCLXECNTL(DEFDIV) member.
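If you go the dynamic route, the define step can also be scripted from OPS/REXX via Address Opsctl. A minimal sketch only — the data set name is hypothetical and the OPSLOG DEFINE operand spelling should be verified against the 12.2 documentation before use:

```rexx
/* Sketch only: define an additional, larger OPSLOG DIV to the live */
/* system. The DSN is hypothetical, and the operand spelling should */
/* be checked against the OPSLOG DEFINE syntax for your release.    */
ADDRESS OPSCTL
"OPSLOG DEFINE OPSLOGB DSN(YOUR.HLQ.OPSLOG.BIGGER)"
if rc <> 0 then say 'OPSLOG DEFINE failed, rc='rc
```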
Once it is allocated, let me know whether you wish to switch to the new OPSLOG DIV file dynamically or want to take some downtime to perform the maintenance. Also confirm whether your OPSSPAxx member allocates your live OPSLOG DIV file using Address Opsctl OPSLOG define statements, or whether you have an OPSLOG DD name in the OPSMAIN task instead.
Currently we allocate the OPSLOG through the OPSSPAxx member, not the OPSMAIN started task. I have tested the process of allocating another OPSLOG and then switching to it using commands and the OPSVIEW 4.13 panel. If I allocate an additional OPSLOG and then switch to it, messages in the previous OPSLOG won't be moved. Is there a way to move them to the new OPSLOG?
This all sounds like progress to me.
To consolidate OPSLOG data we normally refer clients first to the Merge OPSLOG utility, introduced in release 12.2.
Check out how you can do this from OPSLOG Archive files via a batch job:
How to Merge OPSLOG Archive Data Sets - CA OPS/MVS® Event Management and Automation - 12.2 - CA Technologies Documentati…
Also, check out how to do it for live OPSLOG data from multiple systems:
How to Merge Live OPSLOG Data from Multiple Systems - CA OPS/MVS® Event Management and Automation - 12.2 - CA Technologi…
We generate 2.5M+ messages a day on our production system, so I have sized our OPSLOGs accordingly. That said, there are days when our message production balloons to over 4M messages, usually because a trace was turned on for a task or something ran wild. I have implemented the OPSLOG archive procedure and keep 90 days' worth of logs in the archive GDG. This way we can use them to research events on our system going back at least 3 months. I swap logs each day at midnight and produce an archive each day, so each day has a fresh log. The days with 4M+ records then use the Archive Trigger parameter to trigger a swap and archive.
I think it would be good if you shared with Steve the BrowseMax and cylinder allocation you use for the event traffic at your site.
That might give Steve a ballpark estimate of how large he should plan on allocating his new OPSLOG DIV file.
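The ballpark arithmetic can be sketched from the figures Steve already posted (BrowseMax 800000 over 568 cylinders). The target BrowseMax below is hypothetical, and real sizing should still use the documentation chart, since DASD per event varies by what you record:

```rexx
/* Ballpark only: messages per 3390 cylinder implied by the       */
/* current log, scaled to a hypothetical target BrowseMax.        */
rate = 800000 / 568                 /* ~1408 messages per cylinder */
target = 4000000                    /* hypothetical target BrowseMax */
say format(target / rate, , 0) 'cylinders, roughly'
```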
Thanks Cesar. I am interested in how Travi archives the OPSLOG and then swaps it.
Here ya go Steve. I set this up following the archive procedure for 12.1. I find that this method is still my preferred one for 12.2 as well, since the auto-archive process forces you to choose either a number of messages or a time to archive. Since I still want to use the archive trigger to archive on those days when our cup runneth over, I set the auto-archive up to handle that based on message number, and left the 12.1 swap-archive process in place. Here are my allocations and the process I use:
BROWSEMAX 4000000 MESSAGES
BROWSEMAXINUSE 4000000 MESSAGES
BROWSEARCHIVEDSN GDG Base with 90 Generations
These may be a little overkill but better to be safe than sorry
OPSLOG1 3200 Cylinders
OPSLOG2 3200 Cylinders
OPSLOGA 3200 Cylinders - Backup log just in case something gets screwed up with 1 or 2.
Approx 400 Cylinders each
1. Created a GDG base containing 90 generations.
2. Set BROWSEARCHIVEDSN to GDG
3. Created TOD rule to swap the log.
(I had to hard-code a primary and secondary log here because determining which log was currently recording seemed to be problematic. I also coded it over a year ago, so it could probably use a rework now that I have more knowledge, but we only use two logs and I haven't had to change their DDs at all.)
4. The swap generates message OPS4626O which then triggers a rule (ARCHMSG3 provided by CA).
5. This rule goes on to run ARCHSUB which then performs the archive.
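For flavor, steps 3-5 above might look something like the OPS/REXX time-of-day rule below. This is a hedged outline, not Travi's actual rule: the )TOD specification, the SWAP operand spelling, and the even/odd-day trick for alternating the two hard-coded logs are all assumptions to verify against your release.

```rexx
)TOD 00:00
)PROC
/* Midnight swap between the two hard-coded logs. Alternating on  */
/* even/odd day-of-year sidesteps detecting which log is live.    */
if date('D') // 2 = 0 then target = 'OPSLOG1'
else target = 'OPSLOG2'
ADDRESS OPSCTL "OPSLOG SWAP TO("target")"
/* The swap issues OPS4626O, which triggers ARCHMSG3 -> ARCHSUB.  */
return
```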
Have you ever gone through the process to swap the OPSLOG when it reached a specific amount of entries?
The short answer is yes, and the long answer is yes but with a caveat: the only time I have done this is when my OPSLOG reached its max. This is of course controlled by the ARCHIVETRIGGER parameter. However, I believe each time this occurred my ARCHIVETRIGGER parameter was set too high, which resulted in some lost OPSLOG data. I have since reduced the value to just under half of the BROWSEMAX, which should allow sufficient time for the log to archive.
As I wrote the above statements, something occurred to me. I never actually swap my log when OPS4403O appears. All I do is grab the current OPSLOG and send it to the archive job (default CA rule). This may be the real reason things were failing. It also means that the archive job ends up running against the live OPSLOG, and if memory serves, that doesn't work so well. So no matter what I set ARCHIVETRIGGER to, the archive job will never successfully complete, and the log never actually gets "stripped" and reset; it just continues to accumulate messages. What I think I am going to do is change the rule to perform a swap when the log reaches the ARCHIVETRIGGER and issues message OPS4403O. That will then trigger the archive procedure because of message OPS4626O.
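That swap-first change might be sketched as an OPS/REXX message rule like the one below. The )MSG rule skeleton and SWAP operand are assumptions to verify, and the sketch assumes the LOGNAME is the last word of the OPS4403O message text:

```rexx
)MSG OPS4403O
)PROC
/* ARCHIVETRIGGER reached: swap away from the live log FIRST, so   */
/* the archive job never runs against the live OPSLOG. The swap    */
/* then issues OPS4626O, which drives the existing archive rule.   */
fulllog = word(msg.text, words(msg.text)) /* assume LOGNAME is last word */
if fulllog = 'OPSLOG1' then target = 'OPSLOG2'
else target = 'OPSLOG1'
ADDRESS OPSCTL "OPSLOG SWAP TO("target")"
return
```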
Disclaimer: The above methods are 12.1 based but should work in 12.2 as well. I haven't had much experience with 12.2 archiving as I have only just installed it on our maintenance LPAR a couple weeks ago.
I must have something set up incorrectly. I’m not seeing the message to trigger the archive. I have the ARCHIVETRIGGER parm set to 1000 for test purposes and no message is triggered when that number is reached.
See if the message OPS4403O has occurred in the past and whether an OPSLOG Archive job has been run since then.
I believe that if you run a manual OPSLOG Archive, the OPS4403O message will be displayed again the next time you reach the specified criteria.
I just set mine to 1000 as well and disabled the rule that processes message OPS4403O so I don't inadvertently initiate an archive. After changing it, no OPS4403O message immediately appeared. I wonder if it counts a thousand messages from the time it is changed. If you have very low activity or are not recording many events in your OPSLOG, it may take a long time to accumulate 1000 messages. I have almost every browse message turned on; 1000 messages took me 30 minutes to accumulate, and I still did not get the message.
Another thought: make sure you are modifying the proper parameter. I see both an ARCHIVTRIG and an ARCHIVETRIGGER. The shorter one is for use with the started-task version of the archive process (12.2), but I am not sure it is actually valid anymore. I do know that OPS4403O should show up when the ARCHIVETRIGGER value is reached, regardless of any other settings.
My final thought is that ARCHIVETRIGGER may need a log swap before it is updated. That is, the live log may still see your previous ARCHIVETRIGGER value, and only a fresh log will pick up the new one. This kind of defeats the purpose of being able to change the parameter "anytime" as the documentation states, but stranger things have happened. It is the end of my work day here and I need to get going, so I won't get a chance to swap the logs to test this, but if you do, let me know. I am skeptical it will have an effect, but for as much information as CA jams into an OPSLOG definition, it wouldn't surprise me if ARCHIVETRIGGER is embedded in there somewhere as well. I am going to leave my trigger set to 1000 overnight since my logs will swap at midnight, and I will see if the message appears in tomorrow's log, so I guess that will be my test.
UPDATE: I checked my log from yesterday and the one from today and I still do not see the OPS4403O message despite having set my ARCHIVETRIGGER variable to 1000. My logs did swap last night so an archive job did run.
Let me clear up a couple of items in the previous posts: the OPS4403O message contains the LOGNAME of the OPSLOG DIV that needs to be archived, so if you manually issue that message, ensure the intended LOGNAME is the last word. If you switch OPSLOGs and need to archive the switched-from OPSLOG, message OPS4626O is issued, again with the last word indicating the LOGNAME to archive. We supply rule ARCHMSG2 to handle the archive of switched OPSLOGs.
And if you are running OPS/MVS 12.2 or higher, you will need to restart the new OPSLOG archive subtask for any dynamic ARCHIVETRIGGER changes to take effect: F OPSS,RESTART(ARCH).
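So a dynamic change on 12.2+ is a two-step affair. A hedged sketch: the OPSPRM calls follow the usual OPS/REXX parameter interface, the example value is illustrative, and the ADDRESS OPER form for the modify is an assumption — it can equally be issued from a console:

```rexx
/* Set ARCHIVETRIGGER dynamically, then restart the ARCH subtask   */
/* so the archive code picks up the new value.                     */
say 'old value:' OPSPRM('SHOW','ARCHIVETRIGGER')
call OPSPRM 'SET','ARCHIVETRIGGER','1900000'  /* e.g. just under half of BROWSEMAX */
ADDRESS OPER "F OPSS,RESTART(ARCH)"           /* syntax assumed; console works too */
```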
OPS/MVS also supplies rules and a REXX exec to handle the condition where the finite number of OPSLOG events is reaching its maximum, indicating that a new OPSLOG DIV is needed. When the OPSLOG event count reaches 80% of capacity, message OPS3445O is issued (there is a sample rule with the same member name), and a rule can be fired to invoke a REXX exec to perform the switch.
And the answer to the original question about moving OPSLOG to a larger DIV is to create/define a new DIV data set with larger capacity, following the guidelines in the Administrator Guide for the number of cylinders needed to accommodate the number of events you need. Then you can either recycle OPS/MVS to pick up the new DIV data set, handling data set name changes if needed, OR add the new OPSLOG as a multiple OPSLOG via OPSVIEW 4.13 and then perform a switch. Then archive the switched-from OPSLOG.
Hope this helps.
Question: The documentation says, "The OPS4403O message is issued regardless of the value for the INITARCH parameter, but the OPS4403O message is not issued if the value of the ARCHIVETRIGGER is a time-of-day." This implies to me that the archive subtask has no bearing on the message being issued, so why would restarting it be needed to pick up the changes? As I said earlier, doesn't that defeat the purpose of being able to change the parameter at "ANYTIME", especially since the documentation says nothing about restarting the subtask after changing this parameter? I don't doubt what you say is true; I am just questioning why it would be true and why CA didn't document it that way.
REF: CA-OPS 12.2 Wiki - ARCHIVETRIGGER PARAMETER
I currently do not start the archive subtask at all because, as I understood the process, the subtask didn't really give me the functionality I needed, so INITARCH is set to OFF|NO. All it does is make the process more convoluted. I really shouldn't need to run the subtask just to get the OPS4403O message displayed. That is, I should still be able to change ARCHIVETRIGGER on demand and have it take effect immediately.
The OPS4403O message is intended to be used with the "old" archiving method, which required the Application Parm Manager (OPSVIEW 2.A). The new archiving method starts a proc to perform the archive, eliminating the need for a rule to react, invoke a REXX to copy data, and then submit a batch job. The new process also eliminates the setup of the Parm Manager and global variables.
The problem with dynamic parameter values is that in some cases initialization code needs to execute to recognize the change. Yes, there are inconsistencies in the product where this occurs, but having the ability to restart a product component in lieu of the entire product is much more desirable and is the direction OPS/MVS has been moving in.
The Administrators Guide does note that the ARCH subtask can be restarted to pick up new parameter values:
Note: The value of the ARCHIVEHLQ parameter is expected to be the name of a GDG. You cannot change the HLQ while CA OPS/MVS is running and must restart CA OPS/MVS to change the HLQ. However, you can restart the subtask to accommodate other new parameter values by entering the following command: F OPSS,RESTART(ARCH).
The preferred method to initiate an archive is by event number, not Time of Day, as TOD based leaves the potential for loss of data.
And while my previous post listed the Administrators Guide as the location of the DASD calculation chart, this chart is in the Install Guide.
Please call me if you have any further concerns or questions on OPSLOG archiving.
Thanks for the explanation. It is all well and good that the command is referenced in the Administrator Guide, but that note only covers ARCHIVEHLQ, not the ARCHIVETRIGGER parameter. The documentation should probably mention ARCHIVETRIGGER as well if the archive task needs to be restarted to pick up that parameter change. This should also be noted in the parameter reference, because not all of us look at the Administrator Guide all the time. 90% of the time I am looking at the Reference Guides; the Administrator Guide I look at when I need to set something up.
I actually set up a TOD rule with 12.1 which executes a swap of the log, which in turn initiates the archive of the previous log. If you do the swap first, there is never any worry about losing data. This is the way the MVS System Log worked before log streams. What I would like to see is the ability to archive logs both ways, by message number and on a regular schedule (TOD). That way the log can be kept for historical purposes and breaks at a set time of day, meaning each archive runs from the same time to the same time each day. The message number can then be a safety net in case something generates a bunch of messages one day and you need to reset the log to prevent overrun. If you set up a job which, when triggered, would first swap the log and then initiate the archive, that would be ideal. You could set a few parameters for triggers and DSNs, and the job would do the rest. This could easily be triggered off a rule or some monitor. The fact that I cannot use both the message number and TOD at the same time as criteria for an archive is why I have not run the subtask to do archives.
Speaking of log streams, something similar might be of interest to CA-OPS/MVS as this could eliminate the overrun issue of the current OPSLOG if implemented properly.