In this lab, you will configure the Cloud DLP Data Profiler, which automatically scans all BigQuery tables and columns across the entire organization, individual folders, or projects. It then creates data profiles at the table, column, and project level.
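Once the profiler has run (later in this lab), the generated table-level profiles can also be listed programmatically. A minimal sketch, assuming the DLP (Sensitive Data Protection) API's tableDataProfiles.list method and the us-central1 location used later in this lab:

# Sketch: list table-level data profiles once the profiler has produced them
export PROJECT_ID=$(gcloud config get-value project)
curl -s \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://dlp.googleapis.com/v2/projects/${PROJECT_ID}/locations/us-central1/tableDataProfiles"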
Lab2-data-security successfully completed.
~15 mins
Templates for saving configuration information for inspection scan jobs, including which predefined or custom detectors to use.
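For reference, an inspection template can also be created outside the console; the marsbank_dlp_template used later in this lab is assumed to have been provisioned already by the lab's Terraform setup. A minimal, illustrative sketch of creating a template with the DLP REST API (the template ID and infoTypes below are placeholders, not the lab's template):

# Illustrative only: create an inspection template with two built-in infoType detectors
export PROJECT_ID=$(gcloud config get-value project)
curl -s -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://dlp.googleapis.com/v2/projects/${PROJECT_ID}/locations/global/inspectTemplates" \
  -d '{
        "templateId": "example_inspect_template",
        "inspectTemplate": {
          "displayName": "Example inspection template",
          "inspectConfig": {
            "infoTypes": [{"name": "EMAIL_ADDRESS"}, {"name": "CREDIT_CARD_NUMBER"}],
            "minLikelihood": "POSSIBLE"
          }
        }
      }'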
Cloud Data Loss Prevention uses information types—or infoTypes—to define what it scans for. An infoType is a type of sensitive data, such as a name, email address, telephone number, identification number, credit card number, and so on. An infoType detector is the corresponding detection mechanism that matches on an infoType's matching criteria.
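To see infoType detectors in action, here is a minimal sketch that inspects a sample string for a couple of built-in infoTypes via the DLP content:inspect method (the sample text and the chosen infoTypes are illustrative only):

# Illustrative only: inspect a sample string for built-in infoTypes
export PROJECT_ID=$(gcloud config get-value project)
curl -s -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://dlp.googleapis.com/v2/projects/${PROJECT_ID}/content:inspect" \
  -d '{
        "item": {"value": "Contact jane.doe@example.com or call 555-253-0000"},
        "inspectConfig": {
          "infoTypes": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
          "includeQuote": true
        }
      }'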
In this lab, we'll use a Terraform template to configure the DLP Data Profiler, carrying out steps 1, 2, and 3 of the architecture diagram below. Steps 4, 5, and 7 will be done in lab7-register-data-product.
In a future release, we will add a lab for sending Cloud DLP results to the Dataplex Catalog.
(Architecture diagram: Data Profiles, DLP Result Analysis, Send DLP Results to Catalog)
Follow the instructions below to set up the DLP auto profiler job.
Step1: Add IAM permissions for the DLP service account
Open Cloud Shell and run the following commands:
export PROJECT_ID=$(gcloud config get-value project)
export project_num=$(gcloud projects list --filter="${PROJECT_ID}" --format="value(PROJECT_NUMBER)")
gcloud projects add-iam-policy-binding ${PROJECT_ID} --member="serviceAccount:service-${project_num}@dlp-api.iam.gserviceaccount.com" --role="roles/dlp.admin"
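Optionally, you can confirm that the binding was applied. A minimal sketch (the filter expression is just one way to narrow the output):

# Optional check: list the members bound to roles/dlp.admin on the project
gcloud projects get-iam-policy ${PROJECT_ID} \
  --flatten="bindings[].members" \
  --filter="bindings.role=roles/dlp.admin" \
  --format="value(bindings.members)"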
Step2: Go to the "Data Loss Prevention" service under Security in the Google Cloud console
Step3: Click on the "SCAN CONFIGURATIONS" tab
Step4: Click on +CREATE CONFIGURATION
Step5: For "Select resource to scan", select "Scan the entire project"
Step6: Click Continue
Step7: Under "Manage schedules (Optional)", keep the default schedule and click Continue
Step8: Under "Select inspection template", choose "Select existing inspection template" and provide the following values (you can verify that the template exists with the sketch after this step):
Template name: projects/${PROJECT_ID}/inspectTemplates/marsbank_dlp_template (replace ${PROJECT_ID} with your project ID)
Location: global
Then click "Continue"
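A minimal sketch for that check, assuming the template lives in the global location as stated above:

# Optional check: confirm the marsbank_dlp_template inspection template exists
export PROJECT_ID=$(gcloud config get-value project)
curl -s \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://dlp.googleapis.com/v2/projects/${PROJECT_ID}/locations/global/inspectTemplates/marsbank_dlp_template"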
Step9: Under "Add Actions"
Step10: Under "Set location to store configuration", set Resource location to "Iowa (us-central1)" and click Continue
Step11: Leave "Review and Create" at the default settings and click Create
Step12: Make sure the configuration has been created successfully
Step13: After a few minutes, check that the data profiles are available in the "DATA PROFILE" tab (choose the Iowa region) and that the central_dlp_table dataset has been populated in BigQuery (see the sketch below for a quick check). Meanwhile, feel free to move on to the next lab.
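A minimal sketch for the BigQuery check, assuming the exported profiles land in a dataset named central_dlp_table in your project as mentioned above (adjust the dataset name if your setup differs):

# Optional check: list the tables that hold the exported data profiles
export PROJECT_ID=$(gcloud config get-value project)
bq ls ${PROJECT_ID}:central_dlp_table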
In a later lab, we will use these results to annotate the data products with data classification information.
This concludes the lab module. Either return to the main menu or proceed to the next module, where you will learn to implement data quality using Dataplex.