Test Data Manager

 View Only
Expand all | Collapse all

Oracle HDL format data masking

  • 1.  Oracle HDL format data masking

    Posted Jul 08, 2020 11:08 AM
    I'm looking into the best way of masking data for testing of a migration from legacy on premise Oracle EBS apps to Oracle HCM in the cloud. One option is to mask the HDL format files which are produced by the Extract and Transform process, prior to loading into Oracle HCM. The files can contain multiple record types, prefaced by a metadata row. Obviously, I don't want to unintentionally end up masking the metadata row.

    I know how to use multiple FDM layout files, to define multiple record types within one file. But only to identify one field as the RECTYPE and in this case, it would have to be a composite of eg MERGE and WORKER, to avoid masking the metadata. I've also seen on separate thread on Conditional File Masking, using a WHEREASSUBSET, which could perhaps be used in combination with a RECTYPE of MERGE. Not sure if it's the most elegant and efficient way though:

    Example HDL file:

    METADATA|Worker|SourceSystemOwner|SourceSystemId|PersonNumber|EffectiveStartDate|EffectiveEndDate|BloodType|CorrespondenceLanguage|StartDate|DateOfBirth|DateOfDeath|CountryOfBirth|RegionOfBirth|TownOfBirth|ActionCode|PersonDuplicateCheck
    MERGE|Worker|EBS_UK|PER_PERSON_100|100|1993/07/03|4712/12/31|||1993/07/03|1959/05/03|||||HIRE|ORA_LN_FI_DOB_NID
    MERGE|Worker|EBS_UK|PER_PERSON_105|105|1996/01/08|4712/12/31|||1996/01/08|1943/10/29|||||HIRE|ORA_LN_FI_DOB_NID
    MERGE|Worker|EBS_UK|PER_PERSON_106|106|1995/03/06|4712/12/31|||1995/03/06|1960/07/13|||||HIRE|ORA_LN_FI_DOB_NID
    MERGE|Worker|EBS_UK|PER_PERSON_648|648|2001/01/16|4712/12/31|AB-||2001/01/16|1962/02/24|||||HIRE|ORA_LN_FI_DOB_NID
    METADATA|PersonLegislativeData|SourceSystemOwner|SourceSystemId|PersonId(SourceSystemId)|EffectiveStartDate|EffectiveEndDate|LegislationCode|MaritalStatus|MaritalStatusDate|Sex
    MERGE|PersonLegislativeData|EBS_UK|PER_LDG_100|PER_PERSON_100|1993/07/03|4712/12/31|GB|M||M
    MERGE|PersonLegislativeData|EBS_UK|PER_LDG_105|PER_PERSON_105|1996/01/08|4712/12/31|GB|S||F
    METADATA|PersonName|SourceSystemOwner|SourceSystemId|PersonId(SourceSystemId)|EffectiveStartDate|EffectiveEndDate|LegislationCode|NameType|FirstName|MiddleNames|LastName|Honors|KnownAs|PreNameAdjunct|PreviousLastName|Suffix|Title
    MERGE|PersonName|EBS_UK|PER_NAME_100|PER_PERSON_100|1983/07/03|4712/12/31|GB|GLOBAL|Fred|Jeremy|Bloggs||||||MR.
    MERGE|PersonName|EBS_UK|PER_NAME_105|PER_PERSON_105|1976/01/08|4712/12/31|GB|GLOBAL|Linda|May|Roberts||||||MISS
    MERGE|PersonName|EBS_UK|PER_NAME_106|PER_PERSON_106|1985/03/06|4712/12/31|GB|GLOBAL|David|Reece|Gray||||||MR.
    MERGE|PersonName|EBS_UK|PER_NAME_111|PER_PERSON_111|1967/06/02|4712/12/31|GB|GLOBAL|Jenny|Marie|Brown||||||MISS

    ------------------------------
    Gary Johnson
    ------------------------------


  • 2.  RE: Oracle HDL format data masking

    Broadcom Employee
    Posted Jul 10, 2020 08:35 AM
    Hi Gary, could you clarify what you wrote?   Are you strictly wishing to ignore the rows with METADATA as the first field and only mask the rows with MERGE as the first field, or are you also wanting to only process the rows with MERGE in the first field AND Worker in the 2nd field?



  • 3.  RE: Oracle HDL format data masking

    Posted Jul 10, 2020 09:14 AM
    Hi Scott. So METADATA rows must never be masked. Rows beginning MERGE are candidates for masking and the masking to be applied would be determined by the record type, which is defined in the second field of these records. So in the example I provided, these record types are Worker, PersonLegislativeData or PersonName.

    Hope this clarifies.


  • 4.  RE: Oracle HDL format data masking
    Best Answer

    Broadcom Employee
    Posted Jul 10, 2020 09:00 AM
    Edited by Christopher Hackett Jul 13, 2020 04:48 PM
    Ok, I think I get what you are looking for.   You can accomplish masking only the MERGE rows by setting up the 3 layout files with the RECTYPE keyed on the 2nd field.   Then, when you are specifying the masking rules options, use a WHERE condition to select only those records with MERGE in the first field.

    Where used within FDM


    Attachment(s)

    txt
    Worker_DM.txt   120 B 1 version
    txt
    PersonName_DM.txt   128 B 1 version


  • 5.  RE: Oracle HDL format data masking

    Posted Jul 17, 2020 12:39 PM
    Hi Scott,
    That's great, thanks. One of my colleagues seems to have got that to work. However, as well as masking this type of HDL file, we are now being asked to generate synthetic data in the same format. From the documentation, the range of flat file types which can be generated seems limited and doesn't appear to cover the HDL format we require. I'm wondering whether we might need to treat each 'section' of data, as defined by the METADATA row as a separate file in .csv format, generate the data, then parse into HDL format and append the different files into one HDL file. Any advice would be appreciated.


  • 6.  RE: Oracle HDL format data masking

    Broadcom Employee
    Posted Jul 19, 2020 10:04 PM
    Hi Gary, here's how you would accomplish this:
    (1) Copy each of the existing File Definition files you created for FDM to a <same type>_HDR.txt file
    (2) Use Datamaker to Register each (in the order you want them to generate) as a "Delimited File from G-T Text File" type.   Repeat for all 6
           e.g. WORKER_HDR, then WORKER, then PERSONLD_HDR, then PERSONLD, then PERSONNM_HDR, then PERSONNM. 
    (3) TDM may add an ID column in each table during registration;   if so, select Actions on Registered Objects, right-mouse on the table, select Maintain Tables, and delete the ID row displayed for each of the tables.
    (4) In TDM Portal, create a generator including all 6 tables.    In the HDR tables, hard-code the values as seen in the example.   In the "data" tables, vary the data per your requirements.
    (5) Publish in TDM Portal.   In the Table Count column, keep all the HDR tables as 1, and set the "data" tables' Table Counts to the number of records of that type you wish to generate.
    (6) Generate to File, and select "FD - Formatted Text Files" as the file type.
    (7)  Select the Publish button and you should have a Pipe-Delimited file that conforms to the data structure you've displayed.



  • 7.  RE: Oracle HDL format data masking

    Posted Jul 24, 2020 05:42 AM
    Hi Scott, once again, thank you for your suggestions. Based on this, we have managed to get to the position where we can generate synthetic data. One issue though is that now the METADATA rows (in the _HDR files you suggested) are in separate files from the detail rows they represent. Is there a way of correctly merging these back together please?


  • 8.  RE: Oracle HDL format data masking

    Broadcom Employee
    Posted Jul 24, 2020 06:20 AM
    Gary after the publish, how many output files did you end up with?
    Billy



  • 9.  RE: Oracle HDL format data masking

    Posted Jul 24, 2020 07:56 AM
    Hi Billy. My colleague is just experimenting with one of the record types currently, Worker. The publish is generating two files, one _HDR containing the Metadata row, the other containing the actual generated detail rows.


  • 10.  RE: Oracle HDL format data masking

    Broadcom Employee
    Posted Jul 24, 2020 09:11 AM
    Hi Gary, you would concatenate these files together after the publish.   If you wish to automate, then create a Post-Publish action calling a .bat file and passing the ~PUBJOBID~ variable.    In the .bat file, use the passed param to select the correct directory containing the generated files (C:\ProgramData\CA\CA Test Data Manager Portal\Jobs\Job_%1) and use standard windows command line utilities (copy or type) to perform the concatenation.