Test Data Manager

FDM 4.8.125 RANDLOV Performance

  • 1.  FDM 4.8.125 RANDLOV Performance

    Posted 12-03-2019 04:02 PM
    Has anyone noticed any performance issues with the RANDLOV function in FDM 4.8.125?

    I am finding that even the audit is running slowly.
    The target is a SQL Server 2016 DB.
    HASHLOV1 on the other hand is running quite rapidly for millions of records.


  • 2.  RE: FDM 4.8.125 RANDLOV Performance

    Posted 12-03-2019 04:22 PM
    It is taking approximately 2.5 hours for 100k records.
    In comparison, the HASHLOV1 function on the same dataset took 2 hours and 20 minutes to mask 4.6M records (~3 minutes per 100k records), which is about 50 times faster.
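
    The comparison above can be checked with a few lines of arithmetic. This is only a sketch that uses the figures quoted in this post; the helper name `minutes_per_100k` is made up for illustration.

```python
def minutes_per_100k(total_minutes: float, records: int) -> float:
    """Return the masking time in minutes per 100,000 records."""
    return total_minutes / (records / 100_000)

# Figures as reported in the post above.
randlov = minutes_per_100k(150, 100_000)      # 2.5 hours for 100k records
hashlov1 = minutes_per_100k(140, 4_600_000)   # 2 h 20 min for 4.6M records
speedup = randlov / hashlov1

print(f"RANDLOV:  {randlov:.1f} min per 100k records")
print(f"HASHLOV1: {hashlov1:.1f} min per 100k records")
print(f"HASHLOV1 is roughly {speedup:.0f}x faster")
```

    With HASHLOV1 rounded to 3 minutes per 100k records, the ratio comes out at the 50x quoted above.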


  • 3.  RE: FDM 4.8.125 RANDLOV Performance

    Posted 12-04-2019 07:48 AM
    Hi Navin, some more questions:

    What is the datatype of the column you are trying to mask?
    Is the only difference between the two runs that you changed from HASHLOV1 to RANDLOV?
    Did you select RANDLOV or RANDLOV1?

    Can you try the latest (4.8.149) FDM?

    Scott


  • 4.  RE: FDM 4.8.125 RANDLOV Performance

    Posted 12-04-2019 11:28 AM
    Hi Scott,

    The data types of the columns I am masking are all varchar.

    The only variable between the 2 runs is that I changed from HASHLOV1 to RANDLOV.
    Is RANDLOV1 a more efficient algorithm? I have always assumed that HASHLOV requires more computational effort than RANDLOV.
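
    To illustrate the assumption behind that question, here is a generic sketch of the two selection styles. It is not FDM's implementation, whose internals are not described in this thread; `SEED_LIST`, `hash_pick` and `random_pick` are hypothetical names used only for illustration.

```python
import hashlib
import random

SEED_LIST = ["Alice", "Bob", "Carol", "Dave"]  # hypothetical seed list

def hash_pick(value: str, seed_list: list[str]) -> str:
    """Deterministic pick: the same input always maps to the same entry."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(seed_list)
    return seed_list[index]

def random_pick(seed_list: list[str], rng: random.Random) -> str:
    """Random pick: the result does not depend on the original value."""
    return rng.choice(seed_list)

rng = random.Random()
print(hash_pick("John Smith", SEED_LIST))  # same result on every run
print(random_pick(SEED_LIST, rng))         # may differ between runs
```

    In a simple model like this, the hash-based pick does slightly more work per value than the random pick, which is the intuition behind the question.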

    I have also noticed that, in the version I am using, the RANDLOV function outputs a series of random numbers (I am assuming a seed-value identifier) instead of the "1000 rows updated" message, which is still present for HASHLOV/HASHLOV1.

    I am happy to try the latest release; hopefully that will resolve the issue.

    Cheers,
    Navin


  • 5.  RE: FDM 4.8.125 RANDLOV Performance

    Posted 12-04-2019 02:26 AM
    Hi Navin,

    Is the function used the only difference? Possible influencing factors:
    - Using the UniqueColumn option
    - Indexed keys
    - Size of the seed lists used
    And was the RANDLOV function faster in a previous version with the same attribute?

    Best regards,
    Klaas-Jan


  • 6.  RE: FDM 4.8.125 RANDLOV Performance

    Posted 12-04-2019 11:31 AM
    Hi Klaas,

    There is a unique column, and the same seed list is used for both masking runs.
    RANDLOV was definitely faster in the past, but I can't recall in which version this changed.

    Cheers,
    Navin


  • 7.  RE: FDM 4.8.125 RANDLOV Performance

    Posted 12-05-2019 04:03 AM
    Hi Navin,

    Have you marked the unique column in FDM as a Unique Column (available under Extra Options)?

    Best regards,
    Klaas-Jan