Test Data Manager

  • 1.  Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted Feb 08, 2024 06:12 AM

    Hi Team,

    We have a requirement to mask data in a CDP (Cloudera Data Platform) HIVE database. Has anyone done data masking on a CDP HIVE database using the CA TDM tool?

    From the available documentation (https://techdocs.broadcom.com/us/en/ca-enterprise-software/devops/test-data-management/4-10/installing/supported-data-sources.html), I can see that masking on HADOOP HIVE is certified. However, I do not see anything regarding CDP HIVE.

    Please let me know if anyone has come across such a requirement before and has done something similar.

    Regards

    Prem



  • 2.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted Feb 09, 2024 09:47 AM

    Hi Prem,

    We used the jar files provided by CA TDM to mask Cloudera data for a migration project. These jar files include HIVE UDFs that can be used as masking functions. There are a few limitations compared to RDBMS masking using FDM, but it does the job.

    Refer to the link below.

    https://techdocs.broadcom.com/us/en/ca-enterprise-software/devops/test-data-management/4-10/provisioning-test-data/mask-production-data-with-fast-data-masker/mask-stored-data/mask-data-stored-in-hadoop.html

    Regards,

    Vineeth




  • 3.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted Feb 14, 2024 08:07 AM

    Hi Vineeth,

    Thank you for the response. Yes, I did a POC on a sample table and it is working.

    I know there are limitations and that it is not the same as RDBMS masking. However, I have a few questions and am wondering whether you found a workaround for these as well.

    1. From the documentation, I do not see an option for RESTARTABILITY. Did you try to implement a restart option by managing it via SQL, or is there any other way to get restartability while masking in HIVE?

    2. The functions do not seem to have an option for IGNORING NULLS when the source column has a NULL value. Did you manage this somehow?

    Please help with this if possible.

    Regards

    Prem
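
    On question 1, the "managing via SQL" idea could be sketched as a checkpoint table that records which partitions have already been masked, so a re-run skips them. This is only an illustration and not a TDM feature: `run_masking`, `mask_partition`, and the `mask_checkpoint` table are hypothetical names, and SQLite stands in for HIVE so the example can run anywhere.

```python
import sqlite3


def run_masking(conn, partitions, mask_partition):
    """Mask each partition once, skipping partitions already checkpointed.

    mask_partition is a hypothetical callable that masks one partition
    (in a real setup it would submit the masking query for that partition).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mask_checkpoint (part TEXT PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT part FROM mask_checkpoint")}
    processed = []
    for part in partitions:
        if part in done:
            continue  # already masked in an earlier run
        mask_partition(part)  # may raise; no checkpoint is written then
        conn.execute("INSERT INTO mask_checkpoint (part) VALUES (?)", (part,))
        conn.commit()  # persist progress before moving on
        processed.append(part)
    return processed


# Demo: the second partition fails on the first run only.
conn = sqlite3.connect(":memory:")
partitions = ["2024-01", "2024-02", "2024-03"]
attempts = []
fail_once = {"2024-02"}


def mask_partition(part):
    attempts.append(part)
    if part in fail_once:
        fail_once.discard(part)
        raise RuntimeError("simulated failure while masking " + part)


try:
    run_masking(conn, partitions, mask_partition)
except RuntimeError:
    pass  # the job stops here; "2024-01" is already checkpointed

second_run = run_masking(conn, partitions, mask_partition)
print(second_run)
```

    A scheme like this only works if re-masking a failed partition is safe, for example by writing masked rows into a separate target table rather than updating the source in place.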




  • 4.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted Feb 15, 2024 09:14 AM

    Hi Prem,

    Please find my response below.

    1. I don't think I saw a Restartability option during our masking. We never used that functionality, so we didn't explore it much.

    2. For formatencrypt masking, NULL values were ignored. We did face some issues when using HASHLOV on NAME fields. The workaround was to add a CASE statement to handle the NULL and ' ' values.

    Hope this was helpful. 

    Regards,

    Vineeth
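
    The CASE workaround mentioned in point 2 might look roughly like the sketch below. It is an assumption-laden illustration: `fake_hashlov` is a hypothetical stand-in for a masking UDF (not the real TDM function or its signature), the `customers` table is made up, and SQLite is used in place of HIVE so the query can be run locally. The CASE expression itself is standard SQL.

```python
import sqlite3


def fake_hashlov(value):
    # Hypothetical stand-in for a masking UDF. Like a UDF that does not
    # handle NULLs, it fails if it is ever called with None.
    return "MASKED_" + str(len(value))


conn = sqlite3.connect(":memory:")
conn.create_function("fake_hashlov", 1, fake_hashlov)
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Alice"), (2, None), (3, " "), (4, "")],
)

# Pass NULL and blank names through unchanged; mask everything else.
MASK_SQL = """
SELECT id,
       CASE
           WHEN name IS NULL OR TRIM(name) = '' THEN name
           ELSE fake_hashlov(name)
       END AS masked_name
FROM customers
ORDER BY id
"""

rows = conn.execute(MASK_SQL).fetchall()
for row in rows:
    print(row)
```

    Because the masking function appears only in the ELSE branch, it is never called for NULL or blank values, which keeps those rows exactly as they were in the source.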




  • 5.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted Feb 19, 2024 07:00 AM
    Edited by Premlal Digarse Feb 19, 2024 07:00 AM

    Thank you for the response, Vineeth. We will explore this further based on our requirement.

    Regards

    Prem