CA Mainframe Application Tuner

Tuning Your COBOL Program with CA MAT

    Broadcom Employee
    Posted 01-10-2020 09:09 AM

    For a modern mainframe application developer, understanding the performance characteristics and resource consumption of COBOL programs is a crucial foundation for building successful mainframe applications in today's highly competitive environment.

    To fine-tune your COBOL application so that it uses mainframe resources efficiently, you need an in-depth analysis of your program as well as insight into the surrounding infrastructure. For this kind of tuning, a tool that analyzes the program quickly and points to concrete improvements can be vital for you as a COBOL developer.

    CA Mainframe Application Tuner (CA MAT) is an answer to the challenges faced by modern COBOL programmers. It provides professional COBOL measuring and tuning facilities based on practical developers' needs.

    As part of the default installation, CA MAT provides a demo COBOL program called TUNCOB01 that can help you with your first steps in analyzing a COBOL application and pinpointing performance issues. You can find the source in hlq.CEESSAMP(tuncob01). At Step 15 of the CA MAT customization, you can compile and link-edit TUNCOB01 and create the TUNIVP1 JCL that executes it.

    Now let us examine a couple of use cases of COBOL tuning based on TUNCOB01.

     

    Understanding TUNCOB01

    TUNCOB01 is an out-of-the-box COBOL demo program delivered with CA MAT. You can use it to try out the COBOL analysis capabilities of CA Mainframe Application Tuner.

    The TUNCOB01 program is shipped with the following details:

    Datasets:

    • BIGBLOCK: a PS data set with 200-byte records, blocked at 40 records per block (BLKSIZE=8000)
    • UNBLOCK: a PS data set with 200-byte records, blocked at 1 record per block (BLKSIZE=200)
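
    The relationship between record length, block size, and the number of physical blocks can be sketched in a few lines of Python. This is only an illustrative calculation: the record count of 100,000 is a made-up figure, not a value taken from TUNCOB01, and the helper names are hypothetical.

```python
import math


def blocking_factor(blksize: int, lrecl: int) -> int:
    """Number of fixed-length records that fit in one block."""
    return blksize // lrecl


def blocks_needed(records: int, blksize: int, lrecl: int) -> int:
    """Physical blocks required to hold the given number of records."""
    return math.ceil(records / blocking_factor(blksize, lrecl))


LRECL = 200
RECORDS = 100_000  # hypothetical record count, for illustration only

for name, blksize in (("BIGBLOCK", 8000), ("UNBLOCK", 200)):
    factor = blocking_factor(blksize, LRECL)
    blocks = blocks_needed(RECORDS, blksize, LRECL)
    print(f"{name}: {factor} record(s) per block, {blocks} blocks")
```

    For the same number of records, the unblocked data set needs 40 times as many physical blocks as the blocked one, and every block has to be transferred separately.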

     

    An example of a DCB definition in the JCL: 

    DCB definition for CA MAT IVP job

    The source example:

    FD block from the TUNCOB01 COBOL source

    In this blog post, we are going to run TUNCOB01 with these parameters and discover optimization opportunities using the features of CA Mainframe Application Tuner.

     

    Initial Analysis

    Let us run the COBOL program and do an initial measurement. To make our analysis more precise and comprehensive, we shall register a program code listing of the TUNCOB01 COBOL program with CA MAT.

    To measure your program, you just need to create a monitor definition for the TUNCOB01 job, run the program, and invoke the monitor while TUNCOB01 is running. CA MAT collects all the necessary data for you.

    Once the measurement is completed, you may start the analysis.

    Fig. 1. The Interactive Analysis Menu in CA MAT


    The analysis starts with the OverView panel where CA MAT displays the general statistics on the measurement:

    Fig. 2. The analysis OverView in CA MAT


    The most important parameters for us here are the TCB time (01.90 s), which is the CPU time consumed, and the Wait time (04.44 s), which is the time the program spent delayed.

     

    Various views of the same measurement

    CA MAT allows you to examine the measurement results from a number of angles. When analyzing a COBOL program, we may be particularly interested in the following aspects:

    1. Delays caused by program code inefficiencies.
    2. Issues determined by data set properties.

    To examine each of the mentioned analysis aspects, you can make use of the specific views that CA MAT provides.

     

    Analyzing the Code

    Let's go through the program code inefficiencies first. On the CodeView screen, you can see the overview of the code analysis details:

    Fig. 3. The CodeView panel in CA MAT


    Here we can easily spot the two main CPU consumers:

    • IGZCPAC, which is an IBM load module for COBOL
    • TUNCOB01, which is our COBOL application

    These two modules are the highest resource consumers and the most appropriate candidates for optimization.

     

    Examining Data Sets

    Disk access during program execution is another potential performance bottleneck, and this is the next aspect for us to examine. To analyze the measurement results from the data set perspective, we shall use the DataView that CA MAT provides.

    Fig. 4.  The DataView panel – data set analysis in CA MAT


    Here we see a delay of more than 9% for the unblocked data set (UNBLOCK). Quite a difference compared to the blocked one (BIGBLOCK)!

    The unblocked dataset is our target for improvement here.

     

    Improving the COBOL program

    Now that we have identified the key sources of the most significant delays, let us find out the causes, and improve our application accordingly.

     

    Optimizing the COBOL Program Code

    As you remember, the results of our initial analysis uncovered 2 modules that caused the highest delays: the COBOL system load module IGZCPAC, and our TUNCOB01 program. To find a solution for optimization, we need to dig deeper for the root cause of performance inefficiencies here.

    Let's analyze the TUNCOB01 program first.

    For deeper analysis, we shall examine the Histogram view for the TUNCOB01 program:

                                                  Fig. 5. The Histogram view for a COBOL program in CA MAT

    The hotspot here is the ADD verb at statement number 141. To find out why, we need to examine the program listing that we registered with CA MAT for this measurement. We navigate to ADD statement number 141 (the L command) and look at the listing details:

                                                  Fig. 6. The COBOL program listing analysis

    What we are witnessing here is the inefficient use of Subscripts:

    ADD RECORD-00 (SUB) TO RECORDB-00 (SUB)

    We need to go further for a deeper analysis and view the assembler code for this statement:

                                           Fig. 7. The assembler code view for a statement in CA MAT

     

    We can see PACK, CVB, and UNPACK instructions being used for the ADD statement. Can we optimize the code? Sure!

    SUB is defined as SUB PICTURE S9(5). With no USAGE clause, this is a DISPLAY item stored as EBCDIC characters, so it has to be converted to binary each time it is used as a subscript.

                                          Fig. 8. Analyzing definitions in CA MAT

    The performance suggestion here would be to modify the code as follows:

          SUB    PICTURE S9(5) comp.
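
    To illustrate why a DISPLAY subscript is more expensive than a binary one, the following Python sketch mimics what the PACK and CVB instructions do to a zoned-decimal S9(5) field before it can be used as a subscript. It is a conceptual model only, not the code the compiler generates; the function names are hypothetical, and it assumes the usual sign encoding in the zone of the last byte (C or F for positive, D for negative).

```python
def pack(zoned: bytes) -> bytes:
    """Mimic PACK: keep each digit nibble, move the sign behind the last digit."""
    digits = [b & 0x0F for b in zoned[:-1]]
    last = zoned[-1]
    nibbles = digits + [last & 0x0F, last >> 4]  # last digit, then sign nibble
    if len(nibbles) % 2:  # pad on the left to a whole number of bytes
        nibbles.insert(0, 0)
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def cvb(packed: bytes) -> int:
    """Mimic CVB: convert packed decimal to a binary integer."""
    nibbles = [n for b in packed for n in (b >> 4, b & 0x0F)]
    sign = nibbles.pop()  # the rightmost nibble holds the sign
    value = 0
    for digit in nibbles:
        value = value * 10 + digit
    return -value if sign in (0x0B, 0x0D) else value


# +00012 stored as S9(5) DISPLAY: EBCDIC digits F0 F0 F0 F1, sign in the last byte (C2)
sub_display = bytes([0xF0, 0xF0, 0xF0, 0xF1, 0xC2])
packed = pack(sub_display)
print(packed.hex(), cvb(packed))
```

    With a binary (COMP) definition, the value is already in the form the addressing arithmetic needs, so neither conversion step is required.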

     

    Now let's see what we can do with the IGZCPAC module.

    For deeper analysis, we shall examine the Histogram view for IGZCPAC module, and from here identify the Caller ID for this system module. That will allow us to see the parts of the application (Callers) that actually call the module during the measurement:

                                             Fig. 9. The Caller ID panel displays the program parts that call the module

    Since we registered the program listing before the analysis, we can see here that IGZCPAC was called from 2 places, the more significant being offset 000A64, which is mapped to statement number 161. If we examine the program listing for this statement, we find that it is the INSPECT statement. Again, let us examine the assembler code for this statement:

                                            Fig. 10. The assembler code for the INSPECT statement displayed in CA MAT

    The Histogram view of the TUNCOB01 program (see Fig. 5) also shows that the INSPECT statement caused some inefficiency.

    So, why does the use of the INSPECT statement cause application delays?

    When a COBOL application program performs a dynamic call to a second program, control first passes from the caller to the CEE module IGZCPAC (or IGZXLPKA in later releases of COBOL). These modules pass control to the called program only after they complete their own processing. The INSPECT module here is IGZCIN1, so IGZCPAC is called first, and only then is control passed to IGZCIN1. This extra step adds overhead, which is why the use of the INSPECT statement here may cause delays.

    Therefore, for this particular case the most appropriate performance suggestion would be to avoid using the INSPECT statement.

     

    Adjusting the Data Set Parameters

    Our initial analysis revealed a more than 9% delay for the unblocked dataset, which was about 10 times higher than for the blocked one. What is the reason for such a gap? To answer this question, we need to perform a detailed analysis of the data sets.

    From the DataView panel, go to the Data set details view (the S command). Here CA MAT presents granular statistics for the chosen data set:

                                                         Fig. 11. Data set analysis details in CA MAT

    Two parameters on this view should catch our attention: the Total EXCPs and the active rate (the average number of I/Os per second performed against the data set during the monitored period). The measured values for both are high. Put simply, our COBOL application spends a lot of time waiting for the data set to respond, which contributes significantly to the overall application delay, and we want to reduce it.

    How can we do that? As we can see, the Block Size parameter for this unblocked data set is 200. For the other data set (BIGBLOCK), which causes far fewer delays, the Block Size is set to 8000:

                                                                                Fig. 12. Partially Optimized data set details

    As we remember from the source, we did not set any Block Size limitation for the UNBLOCK data set, whereas for the BIGBLOCK data set, the block contains 40 records. The IBM documentation for Enterprise COBOL for z/OS states that in a COBOL program you can establish the size of a physical record with the BLOCK CONTAINS clause. If you do not use this clause, the compiler assumes that the records are NOT blocked. Blocking QSAM files on disk can enhance processing speed and minimize storage requirements. The differences in performance are quite significant:

     

                                        SPACE (cyls)     EXCPS (number of I/Os)   # Records      CLOCK TIME (minutes)
    Unblocked (LRECL=43)                2588             4142000                  1107944        149.2
    Blocked at half-track (standard)    183              1061000                  1250200        73.3
    Savings                             93% reduction    74% reduction            13% increase   51% reduction
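
    The Savings figures follow directly from the two measured rows. A short Python check of the arithmetic is shown below; the dictionary keys and the function name are only illustrative labels.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from the unblocked to the blocked value, in percent."""
    return (after - before) / before * 100


unblocked = {"space_cyls": 2588, "excps": 4142000, "records": 1107944, "clock_min": 149.2}
blocked = {"space_cyls": 183, "excps": 1061000, "records": 1250200, "clock_min": 73.3}

for key in unblocked:
    change = percent_change(unblocked[key], blocked[key])
    print(f"{key}: {change:+.0f}%")
```

    A negative value is a reduction and a positive value is an increase. Note that the blocked run processed about 13% more records and still needed roughly half the clock time.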

     

    Based on the above, the performance suggestion is to add the BLOCK CONTAINS 0 RECORDS clause to the UNBLOCK data set definition. This should considerably reduce the response time, because each I/O then brings many records into memory instead of going to disk for every single record.
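
    As a rough idea of what half-track blocking means for 200-byte records, the sketch below computes the largest block that is a multiple of the record length and still fits the half-track limit. The limit of 27,998 bytes assumes 3390 disk geometry, which the measurement above does not state, so the value actually chosen by the system on your installation may differ.

```python
HALF_TRACK_3390 = 27998  # assumption: largest block that fits twice on a 3390 track


def half_track_blksize(lrecl: int, limit: int = HALF_TRACK_3390) -> int:
    """Largest multiple of LRECL that does not exceed the half-track limit."""
    return (limit // lrecl) * lrecl


blksize = half_track_blksize(200)
print(blksize, blksize // 200)
```

    Under this assumption, a block holds 139 records, compared with a single record per block for the original UNBLOCK definition.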

     

    Enjoy the Results

    Let us summarize what we have done to optimize our COBOL program and eliminate the initially discovered inefficiencies:

    1. Data set configuration: the BLOCK CONTAINS 0 RECORDS clause was added to the UNBLOCK data set definition, so that records are processed in blocks in memory.
    2. COBOL program optimization: the definition for SUB was updated (SUB PICTURE S9(5) comp.) for more efficient processing by the ADD verb.
    3. INSPECT statement: we found that the INSPECT statement called the IGZCPAC load module inefficiently and thus contributed to the overall delay, so we avoided using it.

    Now that we have implemented all the performance suggestions, let us re-measure the optimized COBOL program, and watch the performance gain.

    The overall results show that both the TCB and Wait times have significantly improved:

                                   Fig. 13. The analysis OverView after fine-tuning with CA MAT

    The TCB time has dropped to 00.52 s (vs. the initial 01.90 s), and the Wait time has dropped to 01.36 s (vs. the initial 04.44 s).
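
    The improvement factors can be worked out from the two OverView measurements. The Python snippet below is just that arithmetic; adding TCB and Wait time together is a simplification used here as a rough overall indicator, not a figure reported by CA MAT.

```python
before = {"tcb": 1.90, "wait": 4.44}  # seconds, initial measurement
after = {"tcb": 0.52, "wait": 1.36}   # seconds, after tuning


def speedup(old: float, new: float) -> float:
    """How many times smaller the new value is than the old one."""
    return old / new


for key in before:
    print(f"{key}: {speedup(before[key], after[key]):.2f}x")

combined = speedup(sum(before.values()), sum(after.values()))
print(f"tcb + wait: {combined:.2f}x")
```

    Both the TCB time and the Wait time come out more than three times lower than in the initial measurement.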

    The other key measurements have improved as well. The Histogram view for the TUNCOB01 program shows better performance, and we can see that the ADD verb at statement number 141 is no longer the highest resource consumer:

                                      Fig. 14. Performance improvements of the COBOL program

    The same can be said about the data sets. The DataView shows that after reconfiguring the UNBLOCK data set, its contribution to the overall delay has become as low as 0%:

                                                     Fig. 15. Data set performance improvements

    If we check the details of this data set analysis, we can see that the block size has been dynamically determined by the system as Half-track (standard), which produced a considerable positive impact on the overall performance:

                                                     Fig. 16. Performance details of the optimized data set

    These easy and logical optimization steps have allowed us to tune our COBOL program and improve the overall performance more than threefold. Further performance suggestions might include using an Index instead of a Subscript in the COBOL program code, as an Index generally gives better performance for COBOL statements.

    For more information on COBOL performance tips, refer to numerous external resources, for example, here or here.

    These best practices may become a solid starting point for you as a COBOL developer in optimizing your own applications with the help of CA MAT. You can find more information on how CA Mainframe Application Tuner works in the CA MAT documentation.

    Stay tuned!