Call to Action – Retiring non-URI Data Areas in Datacom
By Kevin Shuma May 2024
As part of our Datacom “Beyond Code” outreach in 2023 and 2024, I have been participating in a number of Environment Reviews with our Broadcom Datacom customer base. In the environment review process, we look at both the physical implementation based on the Datacom Directory (CXX) as well as the database processing in the Multi-User Facility (MUF).
During these physical reviews, I have come across a number of sites that still have data areas that are still using non-URI record addressing for some of their data areas. Non-URI addressing uses physical location addressing to identify records (rows) within the Datacom data area. While this was sufficient in the early releases of Datacom, it has certain limitations that affect the ability to store data rows and in certain circumstances can cause data integrity issues during restart. In fact, I have recently been involved with a site that has experienced a data recovery failure due to non-URI datasets.
Non-URI Data Areas have been a documented Restart/Recovery risk for over 20 years. Sites still using non-URI data areas risk possible data loss and data integrity issues.
Non-URI data areas are also prevented for participating in core Datacom capabilities such as Online Reorganization, Online Area Move and VSAM Linear Datasets (LDS).
Sites still employing non-URI data areas should begin the process of replacing these data areas with the standard URI data areas as soon as possible.
This document provides information on the use of non-URI data areas and the migration path to URI data areas. If you have questions or would like guidance from the Datacom team, please reach out to Datacom Technical Support or the Datacom Product Management team.
We are here to help!
How can I tell if my site has non-URI data area(s)?
For sites that have Dynamic Systems Tables (DSTs) and SQL activated, a simple SQL query can be executed to locate any non-URI data areas.
SELECT
DBID, AREA_NAME, URI
FROM SYSADM.DIR_AREA
WHERE URI = 'N' ;
Sites with SYSVIEW and DSTs can use the SYSVIEW DCLIST command to list all MUFs on a given LPAR. Select a specific MUF and issue the DCAREA command, the URI setting is to the right of the display. Any data area with blanks under the column header URI is a non-URI data area.
Sites without DSTs or SQL mode enabled, can get the same information by running a full CXX report and scanning for the string ‘UNIQUE ROW IDENTIFIERS (URI) ‘. Data areas with URI format will have a numeric value following the string, non-URI data areas will have the word “NONE” printed.
If your site does not have any non-URI data areas, you can skip the rest of this article.
What is the difference between URI and non-URI data areas, and why is it important?
There are two choices for physical record addressing that is selected during the DBUTLTY LOAD of the data area:
· URI=YES (default and highly recommended) or
· URI=NO (deprecated addressing method – may be needed to support specialized legacy application use)
Physical Record Addressing (RECID) is an integral part of the Datacom index
When the logical index or key is created by the user, they select one or more data columns as a pathway to access data rows. The combination of the data columns is called a key value.
For example, the EMPLOYEE table could have an index built over employee number, a different index built over last name and another index built over the employee address.
For each data row, there would be an entry created in the Datacom index using the key value and the physical address of the data row. This physical record address is called the RECID. In Datacom, unique indexes will have a 1-to-1 relationship between the key-value and the RECID. In non-unique indexes, we have a 1-to-many relationship where we store the key-value once and then attach the RECID for each of the rows with that key-value.
Before URI
Prior to Release 8.0 (1990), RECIDs were based on the physical location of the data row within the data block within the data area (dataset). The RECID was constructed from the database ID, table ID, relative block number and record number within the block. Maximum records per block was 255.
Release 8.0 implements URI
With the implementation of SQL in release 8.0, it became necessary to expand the record addressing capability within Datacom to ensure that every data row had a Unique Record ID or URI. The implementation of URI expanded the length of the physical record address by 4 bytes establishing the ability to uniquely identify every row in the data area. The number of allowable URIs for a given data area is 4 billion.
URI provides additional capabilities for Release 8.0 and above
In addition to providing the unique record identification required by SQL, it also resolved two additional issues that were present in the database.
The first was that non-URI addressing limited the number of data rows per block to 255. For some shops with small row sizes or small compressed row sizes, this limit reduced the value of moving to a data area with a larger block size for processing efficiency. With URI addressing the 255 limit is removed.
The second was that it removed a known issue with data base recovery when a data row was deleted and a new row was added at the same location within the data block. If the database region failed, the ensuing restart of the database region might have encountered a recovery error because log records were encountered for two different data rows with the same RECID. For most shops at that time, the volume and size of processing did not generate the conditions necessary for this recovery error to occur very often. However, as processing speeds increased and data row volumes grew, the chance of encountering a recovery error became more possible. With the introduction of the URI-based RECID, this recovery issue was solved.
Over the next few releases (8.0 to 10.0), all of the Datacom system databases were converted to URI data areas. Customers were urged to convert their user database data areas to URI format as soon as possible.
How is the URI setting determined for a given data area
While the URI setting is maintained in the CXX, it is not part of the Datadictionary metadata model.
The URI setting is determined during the DBUTLTY LOAD data area process. The default setting is URI=YES.
To continue to use non-URI data areas, the DBUTLTY LOAD for the data area load must be coded with URI=NO parameter. If a data area is loaded without the URI=NO parameter, it will be loaded as a URI=YES data area.
Due to the implementation of URI as the default for data area loads and a requirement for database level loads, most of the user database data areas have been converted to URI format. The databases that remain in non-URI format have either not been loaded in the last 20 years or have been purposely set to non-URI (URI=NO) by the user during data area load processing.
Pros and Cons of URI versus non-URI
The “PROs” of going to data areas with URI addressing are numerous; here are just a few of the key reasons:
· Full support of SQL
· Full support for recovery/restart
· Removes the limit of 255 rows for large data blocks
· Enhanced processing for data areas with compressed tables
· Set of DBUTLTY “database level” functions only available as URI
· A number of DBUTLTY sub-functions such as LOADID=NO and REMAP= are only supported as URI
· Full support for “moved records index” which enables many 24x7 enhancements
o Online Reorg
o Online Data Area Compression
o Online Area Move
o Online Block Size Alter
o Online Table Move
· VSAM Linear Dataset implementation (requires URI)
The “CONs” of going to URI for data areas are limited to three concerns:
· Minor concern over increased index sizes with the expanded record address stored in the IXX.
o The graphic bellow shows a minimum index growth using one of our large benchmark databases. There will be a minor increase in index space use with URI.
o A very small growth in data area blocks can occur if the data area has compressed tables. This is because with URI data areas have a few extra bytes stored in each block to speed processing of compressed blocks. The feature is not available for non-URI data areas. Non-compressed data area sizes will not change with the migration to URI.
o Sample table below:
· Minor concern that a database access outage must occur to back up the database/data area, initialize the data area(s) and then reload the data areas using DBUTLTY with URI=YES.
· Medium concern for shops that may have legacy applications that “store” the RECID for later use by two specialized RAAT commands (REDID and RDUID). These types of applications were typically written in the early 1980s when memory sizes were limited and (index) buffer pools were limited in size. In my 35+ years with Datacom I have only encountered a few sites that have user written application programs with REDID/RDUID.
Medium (application) concern when converting from non-URI to URI
As we discussed above, there is a minor space increase concern and a database access outage that is needed to backup, initialize and load the database/data areas into URI format (URI=YES). These are mainly operational concerns that just need to be scheduled.
The main concern with the conversion to URI is that the sites could have legacy application programs using the record-at-a-time (RAAT) commands REDID or RDUID. These two commands allow data rows to be accessed directly by their record address (RECID). These commands allow the programmer to load the physical record address (RECID) into the request area and “read” the data row directly without using the table’s index(es). Similarly, the RDUID command will “read for update” data rows directly without using the table’s index(es). The REDID/RDUID commands are very specialized and for most shops were probably never used.
However, a few shops may have still legacy application programs with these commands.
How are REDID/RDUID commands used?
Typically, a user has a table with a set of data rows that does not change very often. To speed up processing the programmer decides to pass through the data rows in the table. For each row returned, the program stores in its memory the column(s) that identify the row and the row’s physical record address (RECID).
For example, a stock broker application can read all rows in the stock table using one of the standard RAAT retrieval commands (like REDKG/REDNX). For each row returned the application stores in its memory an array with the row’s value for the column “ticker-symbol” and the row’s physical record address (RECID). The row’s RECID is found by saving a set of bytes from the Datacom request area that is updated when the request area is returned as part of the data row retrieval (see graphic below).
Once all rows are processed by the first sequential pass, the program now has an in-core array that allows the programmer to read a data row without using the index. Each time a new request comes in with a ticker-symbol, the program can look-up the matching ticker-symbol in the memory array and its RECID value. Then using the RECID values, a Datacom request area can be built and the data row retrieved using the REDID or RDUID commands.
The program would not function very well if the underlying Datacom table had row changes (adds, deletes, or updates) that moved rows during its execution since the in memory array only has the initial physical record addresses.
With the high-speed index processing available in today’s Datacom and the availability of large index buffer pools, this type of special direct read processing is no longer needed. Several years ago, I worked with a site where we replaced the REDID/RDUID processing with standard REDKx and RDUKx commands and saw no affect to performance of the application. An extra benefit was that the occasional error that occurred due to a stored RECID no longer being accurate went away.
What happens if my site is using REDID/RDUID commands against a table in a non-URI data area and we convert the data area to URI
If the program stored the command section the Datacom at request area (bytes 1-76), then the change to URI will not be a problem. This is because the command section (bytes 1-76) would have all the information needed to issue the REDID/RDUID request for the table for either non-URI or URI data area.
If the program stored in memory only the 9 bytes of the request area (bytes 17-25) that make up the non-URI RECID, it will work for a table stored in a non-URI data area. However, it will not have all the information that is needed to issue a REDID/RDUID command for a table stored in a URI data area. In this case, the REDID/RDUID calls will most likely fail with a Datacom return code of “94(086) – Invalid Datacom RECID”.
We can see by comparing the RECID formats in a non-URI and URI data area, that to successfully issue the REDID/RDUID command, the program must save and restore request area bytes 17-25 and bytes 61-64. Increasing the memory needed to store the physical record address (RECID) from 9 bytes per row to 13 bytes per row.
So in summary, if the REDID/RDUID program is storing in memory the command section of the request area (bytes 1-76), there would be no changes needed. If the program stores the 13 bytes needed for URI it would also work for non-URI and no change is needed.
Remediation of an application program that is issuing REDID/RDUID without saving all the request area bytes
If the program only stores the non-URI RECID bytes of the request area (bytes 17-25), then remediation of the application program is required before converting the data area to URI format.
The program remediation would include:
· Locating the area in memory (working storage) where the RECID are being stored. This would most likely be a memory array of 13-byte memory elements.
· Expanding the memory area to store the larger URI RECID of 17 bytes.
· Locate in the program where the RECID value (bytes 17-25 of the Datacom request area) is being stored in the memory element and change the code to store Datacom request area
o Bytes 17-25 to the first 9 bytes of the memory element and
o Bytes 61-64 to the last 4 bytes of the memory element.
· Locate in the program where the RECID value is being restored into the Datacom request area just prior to issuing the REDID/REDUID command(s). Change the code to restore the additional memory element bytes into the Datacom request area
o Load Bytes 17-25 of the request area with the first 9 bytes of the memory element and
o Load Bytes 61-64 of the request area with last 4 bytes of the memory element.
· Compile and test the program with a non-URI and URI formatted versions of the data area.
Note: The tested application program (with the remediation) can be promoted to the production system while the table’s data area is still in non-URI format. Adding the functionality to save and restore the extra 4 bytes needed for URI data areas will not affect the non-URI processing.
How do I determine if my site is using REDID/RDUID processing in our application programs
Source language scan
For most sites, a quick scan of the existing application source code libraries would be a good place to start.
A simple scan of the application programming source libraries for the strings “REDID” and “RDUID” should find most programs that use these Datacom RAAT commands. This should not turn up too many false positives as these character strings are typically only used in Datacom RAAT programming.
Using Datacom Accounting Facility to locate REDID and RDUID application programs
Unfortunately, REDID and RDUID RAAT commands do not require any specific URT attributes and there is no tracking of RAAT command use in normal Datacom statistics and reports (Dynamic Systems Tables, PXX reports, MUF EOJ reports, etc.).
The Datacom Accounting Facility can be used to trap the use of REDID/RDUID commands. When activated, the Accounting Facility has access to many attributes for each program executing against the Datacom Multi-User Facility (MUF). These attributes can be collected and stored in one of the accounting tables (DBID 005). One of the application program attributes that can be collected is the use of the REDID and RDUID commands.
The Accounting facility allows the user to design and implement up to 100 different accounting table definitions. Each definition can be tailored on what data to collect and when to collect it.
A special Datacom Accounting table definition can be created to trap which programs (batch or online) are using REDID/RDUID commands. Once the data is collected, it can be ordered by database and data area order. Only those programs that use REDID/RDUID commands against a non-URI data area will need to be remediated as described above.
The only drawback to using the accounting facility is that the collection process adds some overhead to MUF and needs to be active during the majority of MUF processing to make sure all programs using REDID/RDUID are captured. This could mean for some shops that the accounting facility needs to be active for several months to collect month-end and quarter-end processing. Since the main reason programmers used REDID/RDUID commands was to support very active systems like the stockbroker application described earlier, a few months of data should indicate if any REDID or RDUID processing in in use.
See Appendix A for a sample accounting definition that can be used to collect information on application programs that are issuing REDID or RDUID commands.
See Appendix B for a sample report generated from the data collected in the sample accounting table.
Process to implement URI for non-URI data areas
The user may choose to implement URI on a data area by data area basis. For ease of use, we would recommend sites convert to URI on a DBID level. Existing Datacom DBUTLTY functions provide a very quick way to convert all data areas in a database to URI at one time. The steps below would be utilized for each DBID that is being converted to URI.
1. Use source code scans and/or accounting facility to locate and remediate any REDID or RDUID programs using the information provided in the section above.
2. Determine if there is enough available space in the index and data areas for the minimal URI growth.
· IXX make sure you have at least 5% free space
· Compressed data areas make sure you have at least 3% free space
3. Schedule a time where the DBID can be taken offline.
3. Conversion to URI:
· Close the database to all users
· Use ACCESS=UTLTY to block any user access
· Backup the database using SEQUENCE=PHYSICAL (for emergency and restore purposes)
· Use DBUTLTY to SPILL the current LXX to the RXX
o Note the date/time and generation of the RXX spill tape (this is the last one in non-URI format)
· If needed, increase the dataset allocations and initialize the space
· Load the database backup using the URI=YES parameter
o All data areas will now be URI format
· Any forward recoveries should utilize this backup plus the RXX tapes since the backup was taken
Special note: If for some reason (such as a critical REDID application was not remediated) and the database must be restored back to the non-URI format, stopping the database and use the same process above to backup and restore the database to non-URI format. In this case, the individual data areas must be loaded separately from the backup tape using DBUTLTY LOAD AREA=aaa with the URI=NO parameter.
If the database backup taken during the conversion process is damaged and the database must be recovered using an earlier backup. The user will need to go through a multi-step recovery to complete the forward recovery.
· Restore to the earlier backup tape using the DBUTLTY LOAD AREA=aaa with the URI=NO parameter (matches the format of the earlier RXX tapes).
· Run forward recovery using the RXX tapes since the non-URI backup date/time up to the last RXX tape created at the time of the conversion to URI.
· Perform the conversion to URI process again.
· Run forward recovery using the RXX tapes created since the data was converted to URI. Remember that you may need to spill the current LXX to RXX to get all of the log records.
· At this point, the database is up to current time in URI format. A decision will need to be made as to whether it should be left in URI format or a backup and restore done to place the data areas back in non-URI format.
Appendix A. Sample accounting table definition to trap use of REDID and RDUID commands.
This definition adds a new A15 accounting area/table for collection of REDID/RDUID application executions. For sites with A15 already defined just change the “A15” to an available accounting table definition. A text file of the transactions is available. if needed please open a support case and request the A15 accounting definitions.
-BTG AREA,ACT-A15(T001)
1000 CONNECT,SAMP-ACT-DB
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
1002 'FIND PROGRAMS USING REDID OR RDUID '
1200 TEXT CLS STANDARD
1200 TEXT ADD
PROGRAMS THAT ARE SENSITIVE TO NON-URI - URI CONVERSION
1200 TEXT END
3001 A15
3002 Y 3390 04096 DATACOM 1 0000 N
3004 00000 ACT_A15 Y 00010
-END
-BTG TABLE,ACT-A15(T001)
1000 CONNECT,ACT-A15
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
1002 'FIND PROGRAMS USING REDID OR RDUID '
1200 TEXT CLS STANDARD
1200 TEXT ADD
FIND PROGRAMS USING REDID OR RDUID
/(COMND='REDID') OR (COMND='RDUID')/
1200 TEXT END
3150 A15 017 N N Y N Y DATACOM Y 000 000 N N
3151 00000 N 00 000
3152 Y SYSUSR ACT_A15
-END
-GRP START,TABLE,ACT-A15(T001)
-BTG FIELD,A15KY
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 START START
4012 C C L N 00000 00 00001 N C N N N N N N 00000
4016 N
4017 A N
4020 A15KY
-END
-BTG FIELD,JNAME
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 A15KY A15KY
4012 S C L N 00008 00 00001 N C N N N N N N 00008
4016 N
4017 A N
4020 JNAME
-END
-BTG FIELD,SNAME
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 JNAME A15KY
4012 S C L N 00008 00 00001 N C N N N N N N 00008
4016 N
4017 A N
4020 SNAME
-END
-BTG FIELD,TRANS
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 SNAME A15KY
4012 S C L N 00004 00 00001 N C N N N N N N 00004
4016 N
4017 A N
4020 TRANS
-END
-BTG FIELD,PNAME
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 TRANS A15KY
4012 S C L N 00008 00 00001 N C N N N N N N 00008
4016 N
4017 A N
4020 PNAME
-END
-BTG FIELD,UID01
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 PNAME A15KY
4012 S C L N 00032 00 00001 N C N N N N N N 00032
4016 N
4017 A N
4020 UID01
-END
-BTG FIELD,BASE
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 UID01 A15KY
4012 S B R N 00002 00 00001 N C N N N N N N 00002
4016 N
4017 A N
4018 SMALLINT
4020 BASE
-END
-BTG FIELD,TNAME
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 BASE A15KY
4012 S C L N 00003 00 00001 N C N N N N N N 00003
4016 N
4017 A N
4020 TNAME
-END
-BTG FIELD,COMND
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 TNAME A15KY
4012 S C L N 00005 00 00001 N C N N N N N N 00005
4016 N
4017 A N
4020 COMND
-END
-BTG FIELD,A15-DATA
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 A15KY START
4012 C C L N 00000 00 00001 N C N N N N N N 00000
4016 N
4017 A N
4020 A15_DATA
-END
-BTG FIELD,REQS
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
4010 A15-DATA A15-DATA
4012 S B R Y 00004 00 00001 N C N N N N N N 00004
4016 N
4017 A N
4018 INTEGER
4020 REQS
-END
-GRP END
-BTG KEY,ACT-A15.
- A15KY(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
1002 'KEY FIELDS '
5000 A15KY 017 Y Y 000 I IXX Y
5005 A
5010 ADD JNAME 00
5011 $FIRST
5012 A N
5010 ADD SNAME 00
5011 JNAME
5012 A N
5010 ADD TRANS 00
5011 SNAME
5012 A N
5010 ADD PNAME 00
5011 TRANS
5012 A N
5010 ADD UID01 00
5011 PNAME
5012 A N
5010 ADD BASE 00
5011 UID01
5012 A N
5010 ADD TNAME 00
5011 BASE
5012 A N
5010 ADD COMND 00
5011 TNAME
5012 A N
5100 N A15KY_A1500005 N
-END
-BTG ELEMENT,ACT-A15.
- BASE(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 BASE 00
6010 ADD BASE
-END
-BTG ELEMENT,ACT-A15.
- COMND(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 COMND 00
6010 ADD COMND
-END
-BTG ELEMENT,ACT-A15.
- JNAME(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 JNAME 00
6010 ADD JNAME
-END
-BTG ELEMENT,ACT-A15.
- PNAME(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 PNAME 00
6010 ADD PNAME
-END
-BTG ELEMENT,ACT-A15.
- REQS(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 REQS 00
6010 ADD REQS
-END
-BTG ELEMENT,ACT-A15.
- SNAME(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 SNAME 00
6010 ADD SNAME
-END
-BTG ELEMENT,ACT-A15.
- TNAME(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 TNAME 00
6010 ADD TNAME
-END
-BTG ELEMENT,ACT-A15.
- TRANS(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 TRANS 00
6010 ADD TRANS
-END
-BTG ELEMENT,ACT-A15.
- UID01(T001)
1001 'DATACOM-INSTALL ',
- 'DATACOM-INSTALL '
6000 UID01 00
6010 ADD UID01
-END
Appendix B. Sample DBSQLPR report of data in accounting table A15 definition used to trap use of REDID and RDUID commands.
INPUT STATEMENT:
SELECT
*
FROM SYSUSR.ACT_A15 ;
JNAME SNAME TRANS PNAME UID01 BASE TNAME COMND REQS
________ ________ _____ ________ ________________________________ _____ _____ _____ ___________
RDUIDMFX RDUIDMFO RDUIDMFO 1020 MFO RDUID 11120
RDUIDMF2 RDUIDMFO RDUIDMFO 1020 MFO RDUID 11120
RDUIDMF3 RDUIDMF3 RDUIDMFO 1020 MFO RDUID 11120
REDIDMFX REDIDMFA REDIDMFA 1020 MFA REDID 3431632
REDIDMFX REDIDMFX REDIDMFA 1020 MFA REDID 3431428
REDIDMF2 REDIDMFX REDIDMFA 1020 MFA REDID 3431632
___ 6 rows returned ___
JNAME - Batch Job Name
SNAME - Batch Step Name
TRANS - CICS Transaction ID if this was a CICS task
PNAME - CICS Program name (PPT) if this was a CICS task
UID01 - User ID Block from request area (most sites store batch program name here)
BASE - Database ID
TNAME - 3-Character Table Name
COMND - Datacom command (should be REDID or RDUID)
REQS - Number of times this command was issued (by all executions while table open)