Db2 Tools

To PDASTATS or to RUNSTATS - That Is The Question

  • 1.  To PDASTATS or to RUNSTATS - That Is The Question

    Posted 12-20-2019 09:11 AM
    Edited by Himanshu Gupta 12-27-2019 04:10 AM

    Article written by Steen Rasmussen – Steen.Rasmussen@broadcom.com

    Recently we have seen several questions related to PDASTATS and updating the Db2 Catalog, so hopefully this article will remove any confusion.

    Historical background


    Let's take a step back and look at the history in order to get all the pieces in the puzzle together.
    Appropriate and accurate catalog statistics are vital in order for the Db2 Optimizer to select the optimal access path. Runstats is the IBM utility to collect these statistics and update the catalog, but there were serious issues. The only way to identify which objects had a bad clusterratio, leafdist, relocated rows etc. was to execute Runstats. The issue was - executing Runstats prior to a reorg in order to identify reorg candidates updated the catalog, meaning subsequent BIND, REBIND or Dynamic SQL executions could choose a different (worse) access path. Later versions of Runstats offered the ability to execute Runstats using REPORT ONLY so the catalog didn't get updated with non-optimal statistics for the Optimizer. Starting with Db2 V7, RTS became an option to identify reorg candidates and is now recommended best practice as the "trigger" to identify reorg candidates.  

     

    This was one of the reasons why PDASTATS was invented - having PDA doing very detailed analysis of the pagesets and inserting these statistics into the associated PDA-tables via Extract Procedures so the associated Action Procedures could generate the needed reorg's etc. 

     

    Runstats was also quite expensive to execute in the early days of Db2, so many customers also used PDA to collect new statistics after the reorg and used PDA to update the catalog. This both saved CPU and eliminated the Runstats issue of getting "bad statistics" into the catalog which potentially could impact the Optimizer's access path selection in a negative fashion. 


    Today's reality

    For years CA recommended completely replacing IBM Runstats with PDSTATS, but this really changed when Db2 10 became available, so let's cover the details:

    1. Starting from Db2 10, the Db2 Optimizer introduced new metrics (such as DataRepeatFactor and Histogram statistics). IBM has not published the algorithms for these. Even if we decided to reverse engineer them, IBM could potentially change them at any time via a PTF and the resulting catalog update would cause the Optimizer to choose a less effective access path. All of us know that the customer would blame the CA/Broadcom solution. Furthermore, IBM is periodically changing the algorithm for ClusterRatio with no warning - making it even more dangerous for PDA to attempt to "replace" Runstats.
    2. In Db2 11 RUNSTATS PROFILES were introduced along with SYSSTATFEEDBACK - and this has been refined in Db2 12, so using RUNSTATS could even be considered to be autonomic computing - completely eliminating the need to spend time figuring out how to create the optimal Statistics required for an object.
    3. Looking at PDASTATS specifically, there are two very distinct features. One is the ability to COLLECT very detailed statistics by reading the VSAM pagesets. There are many reasons for this exercise: The PDA collected metrics are more detailed than those Runstats provides and many customers still generate reports based on these enhanced PDA collected metrics. The other feature is the ability to have PDA update the catalog (as opposed to Runstats). This is where the recommendation from CA changed starting with Db2 10 - because we do NOT want to cause the Optimizer choosing a less optimal access path.
    4. We already mentioned one reason why customers used to let PDA update the catalog was due to CPU savings. This argument is pretty much gone since Runstats is now close to being 100% zIIP eligible.
    5. In order not to have the customers jeopardize performance regression, a few years ago we provided a PDA PARMLIB parameter (DISABLE_DB2_CAT_UPDATE) to prevent PDA from updating the catalog by default. This parameter can be customized but it isn't recommended to do so except for very specific scenarios - one being a recent BMC conversion where the customer was on Db2 9, and decided to retain the BMCSTATS solution using PDA instead updating the catalog until they upgraded to Db2 10. 

    DISABLE_DB2_CAT_UPDATE (PDA,RC4)    /* Stop updating the DB2 cat-  */

                                        /* alog as that feature has    */

                                        /* been deprecated over time.  */

                                        /* IBM RUNSTATS generates      */

                                        /* additional statistics that  */

                                        /* the optimizer can utilize.  */

                                        /* Allows for site level con-  */

                                        /* trol to disable and opt-    */

                                        /* ionally set a step RC.      */

                                        /* Can be used to identify     */

                                        /* job steps that need to be   */

                                        /* reviewed and/or updated.    */

                                        /* Keyword values:             */

                                        /*  (NO) = do not disable      */

                                        /* or format (<funct>,<action>)*/

                                        /*  (fct,MSG) = display message*/

                                        /*  (fct,RC4) = step RC 4      */

                                        /*  (fct,RC10)= step RC 10     */

                                        /*   fct can be any of these 3:*/

                                        /*    PDA means PDA Extracts,  */

                                        /*        PS Utility Code,     */

                                        /*        UC Utility Code, or  */

                                        /*        PDASTATS utility     */

                                        /*    CAU means CA Utility     */

                                        /*        (reorg or load)      */

                                        /*    ALL means PDA or CAU     */

                                        /*        listed functions     */

                                        /*  (PDA,RC4) is the DEFAULT   */

      

    Conclusion


    The PDASTATS component is not the biggest selling point of Database Analyzer. PDA's strength using either PDASTATS or RTS to automate REORG generation or any other Db2 housekeeping processes is the real power. The models are highly customizable with extensive IF-ELSE capabilities. Also, once you get a good grip of the capabilities within Action Procedures, you can automate generating almost anything, but this is a topic for another DIFU where I will cover some scenarios like generating a RC/Migrator strategy for every database in a Db2 subsystem among other use cases.

     

    Contacts for additional information on this Update

    Steen Rasmussen (Services Architect) : Steen.Rasmussen@broadcom.com

    Jan Marek (Product Manager) : Jan.Marek@broadcom.com

    Himanshu Gupta (Lead Developer) : himanshu-hg.gupta@broadcom.com