Test Data Manager

  • 1.  Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted 14 days ago

    Hi Team,

    We have a requirement to mask data in a CDP (Cloudera Data Platform) HIVE database. Has anyone done data masking with the CA TDM tool on a CDP HIVE database?

    From the available documentation (https://techdocs.broadcom.com/us/en/ca-enterprise-software/devops/test-data-management/4-10/installing/supported-data-sources.html), I can see that masking on HADOOP HIVE is certified. However, I do not see anything regarding CDP HIVE.

    Please let me know if anyone has come across such a requirement before and has done something about it.

    Regards

    Prem



  • 2.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted 13 days ago

    Hi Prem,

    We used the jar files provided by CA TDM to mask Cloudera data for a migration project. These jar files include HIVE UDFs that can be used as masking functions. There are a few limitations compared to RDBMS masking using FDM, but it does the job.

    Refer to the link below.

    https://techdocs.broadcom.com/us/en/ca-enterprise-software/devops/test-data-management/4-10/provisioning-test-data/mask-production-data-with-fast-data-masker/mask-stored-data/mask-data-stored-in-hadoop.html

    Regards,

    Vineeth




  • 3.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted 8 days ago

    Hi Vineeth,

    Thank you for the response. Yes, I did a POC on a sample table and it is working.

    I know there are limitations and that it is not the same as RDBMS masking. However, I have a few questions and am wondering whether you had any workarounds for these as well.

    1. From the documentation, I do not see an option for RESTARTABILITY. Did you try to implement a restart by managing it via SQL, or is there any other way to get restartability while masking in HIVE?

    2. The functions do not seem to have an option for IGNORING NULLS when the source column has a NULL value. Did you manage this somehow?

    Please help with this if possible.

    Regards

    Prem




  • 4.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted 7 days ago

    Hi Prem,

    Please find my response below.

    1. I don't think I saw a Restartability option during our masking. We never used that functionality, so we didn't explore it much.

    2. For formatencrypt masking, NULL values were ignored. We did face some issues while doing HASHLOV for NAME fields. The workaround was to add a CASE statement to handle the NULL and ' ' values.
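
    To illustrate the CASE workaround, here is a minimal Python sketch that builds a null-safe HiveQL expression. It is not our actual script: the UDF name mask_hashlov, its arguments, the seed list name FIRSTNAMES, and the table and column names are all hypothetical placeholders, so substitute the function names from the TDM jar files you registered. The sketch treats NULL and blank (spaces-only) values the same way and passes them through unmasked.

    ```python
    def null_safe_mask(column: str, udf_call: str) -> str:
        """Wrap a masking UDF call in a CASE expression so that NULL and
        blank values are returned unchanged instead of being masked."""
        return (
            f"CASE WHEN {column} IS NULL OR TRIM({column}) = '' "
            f"THEN {column} ELSE {udf_call} END AS {column}"
        )


    # Hypothetical UDF name and arguments; replace with the registered TDM UDF.
    expr = null_safe_mask("first_name", "mask_hashlov(first_name, 'FIRSTNAMES')")

    # Hypothetical table and key column, used only to show the full statement.
    query = f"SELECT id, {expr} FROM customers"
    print(query)
    ```

    The generated SELECT can then be used in an INSERT OVERWRITE or CREATE TABLE AS statement, depending on how you write the masked data back.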

    Hope this was helpful. 

    Regards,

    Vineeth




  • 5.  RE: Masking using CA TDM on CDP (Cloudera Data Platform) HIVE

    Posted 3 days ago
    Edited by Premlal Digarse 3 days ago

    Thank you for the response, Vineeth. We will explore this based on our requirements.

    Regards

    Prem