Email Connector: Send Email with Erasure Instructions [#1158] #1246

pattisdr · 2022-09-01T22:19:35Z

❗ Contains migration
❗ ~~Don't merge until #1240 is resolved~~

Purpose

This is the final PR for the EmailConnector. An EmailConnector sends an email to third parties at the end of privacy request execution, with instructions on how to mask data that we can't access automatically. This PR also introduces the concept of "checkpoints" for privacy request execution, which gives us more locations from which to resume a paused or failed privacy request.
If the email send fails, this lets us retry just the email send and everything downstream of it, and it makes it easier to add other checkpoints in the future.

Changes

Run a new method, email_connector_erasure_send, after the access and erasure steps of privacy request execution. It collects the data that the traversal cached for any collections on email-based datasets and sends a single email per dataset.
Add a new AuditLog type, "email_sent", and create an AuditLog when the email send succeeds.
Have EmailConnector.mask_data raise a specific Exception solely to override the ExecutionLog message for the collection, indicating that the email was prepared. The email itself is not sent until later. ExecutionLog creation is the responsibility of the GraphTask, not an individual email connector. We use similar exceptions to "pause" a privacy request.
Add the concept of "checkpoints" in PrivacyRequest execution to formalize restarting from certain points in the execution. We could already resume a privacy request from a Pre-Execution Webhook or from a particular collection in the access or erasure graph, but we'd also like to be able to retry just the email send portion. This makes the checkpoint logic more consistent.
Allow configuring a test_email on the EmailConnector secrets so the fidesops admin can send themselves a test email containing test data, to confirm their email config is set up properly.
Rename some of the methods and classes related to serializing different "checkpoints". Previously, you could generally only resume request execution from a specific collection in the traversal; a collection is now optional, so that the resume point can be outside of the traversal.

Checklist

Ticket

Fixes #1158

pattisdr · 2022-09-02T15:58:10Z

Hi @ethyca/docs-authors, can you lend your expertise in helping me with the email copy here?

In short, this is an Email Connector that sends an email to a third party telling them how to erase data that we can't erase automatically.

There may be multiple collections that they should delete data from, and multiple fields/values they should query on their collection to find the records that should be masked, with multiple fields that need to be deleted. It is possible that they need to look up data on one of their collections to find data in another one of their collections. It is possible that some of the collections have no data to be erased but they are included because they need to locate a record on that collection in order to find data on another collection.

However, the scenario where there are multiple collections is probably less likely. We imagine the fidesops user won't have good knowledge of the third party's data structure, and will create one collection with the fields they should mask.

In my email example below, the customer has two collections, children and daycare_customer. There are several fields they need to erase on the children collection, but they need to first look up the daycare_customer with an id of 1, and then use that to query matching parent_ids on children. There are no fields to erase on the daycare_customer table, but it's included as a locator for children.

pattisdr · 2022-09-02T18:28:29Z

@ethyca/docs-authors I've also added a first draft of docs for the email connector here -

pattisdr · 2022-09-02T21:38:26Z

src/fidesops/ops/models/privacy_request.py


-class CollectionActionRequired(BaseSchema):
-    """Describes actions needed on a given collection.
+class CheckpointActionRequired(BaseSchema):
+    """Describes actions needed on a particular checkpoint.

    Examples are a paused collection that needs manual input, a failed collection that
    needs to be restarted, or a collection where instructions need to be emailed to a third
    party to complete the request.
    """

    step: CurrentStep
-    collection: CollectionAddress
+    collection: Optional[CollectionAddress]


I'm reusing the existing structure that caches details about a request that paused or failed in the traversal, and extending it to describe a request that fails outside of the traversal. I'm removing the requirement that a collection exists, because a collection is only relevant inside the traversal.

I've renamed several related pieces.

pattisdr · 2022-09-02T21:40:52Z

src/fidesops/ops/api/v1/endpoints/privacy_request_endpoints.py

+    if not paused_collection:
+        raise HTTPException(
+            status_code=HTTP_422_UNPROCESSABLE_ENTITY,
+            detail="Cannot save manual data on paused collection. No paused collection saved'.",
+        )
+


I'm adding a new check here. I've generally removed the requirement that a collection is saved on a CheckpointActionRequired, but this is one place where it's still needed.

(To be clear, this shouldn't ever be hit though)
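A minimal sketch of the guard described above, for readers following along outside the codebase. It uses plain dataclasses and a `ValueError` as stand-ins for the real `BaseSchema` models and the `HTTPException`; `require_paused_collection` is a hypothetical helper name, and `step` is simplified to a string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CollectionAddress:
    """Stand-in for the real CollectionAddress (dataset plus collection name)."""

    dataset: str
    collection: str


@dataclass(frozen=True)
class CheckpointActionRequired:
    """Stand-in model: the collection is optional because a checkpoint
    can sit outside of the traversal (for example, the email send)."""

    step: str
    collection: Optional[CollectionAddress] = None


def require_paused_collection(
    checkpoint: CheckpointActionRequired,
) -> CollectionAddress:
    """Hypothetical helper: saving manual data only makes sense when the
    request paused on a specific collection, so fail loudly otherwise."""
    if checkpoint.collection is None:
        raise ValueError("Cannot save manual data: no paused collection saved.")
    return checkpoint.collection
```

A checkpoint built with only a `step` is valid for resuming outside the traversal, but it is rejected by this guard, which mirrors why the endpoint check is still needed.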

pattisdr · 2022-09-02T21:42:09Z

src/fidesops/ops/models/privacy_request.py

+def can_run_checkpoint(
+    request_checkpoint: CurrentStep, from_checkpoint: Optional[CurrentStep] = None
+) -> bool:
+    """Determine whether we should run a specific checkpoint in privacy request execution
+
+    If there's no from_checkpoint specified we should always run the current checkpoint.
+    """
+    if not from_checkpoint:
+        return True
+    return EXECUTION_CHECKPOINTS.index(
+        request_checkpoint
+    ) >= EXECUTION_CHECKPOINTS.index(from_checkpoint)


We already had the ability in privacy request execution to resume from a specific point in the case of a pause or failure, but this formalizes it a bit and makes it easier to extend now that we are starting to add multiple locations from which a privacy request can be resumed.

I like this abstraction. Makes it easier to understand as we scale the number of execution checkpoints 💯
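As a rough illustration of how this comparison behaves, the sketch below re-creates `can_run_checkpoint` against a stand-in `CurrentStep` enum and `EXECUTION_CHECKPOINTS` list. Only `erasure` and `erasure_email_post_send` appear in this PR's diffs; the other member names and the list order are assumptions, and `steps_to_run` is a hypothetical helper added for the demonstration.

```python
from enum import Enum
from typing import List, Optional


class CurrentStep(Enum):
    # Only erasure and erasure_email_post_send are confirmed by the diffs;
    # the remaining members are assumed for illustration.
    pre_webhooks = "pre_webhooks"
    access = "access"
    erasure = "erasure"
    erasure_email_post_send = "erasure_email_post_send"
    post_webhooks = "post_webhooks"


# Assumed ordering: the order in which the runner reaches each checkpoint.
EXECUTION_CHECKPOINTS = [
    CurrentStep.pre_webhooks,
    CurrentStep.access,
    CurrentStep.erasure,
    CurrentStep.erasure_email_post_send,
    CurrentStep.post_webhooks,
]


def can_run_checkpoint(
    request_checkpoint: CurrentStep, from_checkpoint: Optional[CurrentStep] = None
) -> bool:
    """Run a checkpoint if no resume point is set, or if the checkpoint
    is at or after the resume point in the ordered list."""
    if not from_checkpoint:
        return True
    return EXECUTION_CHECKPOINTS.index(
        request_checkpoint
    ) >= EXECUTION_CHECKPOINTS.index(from_checkpoint)


def steps_to_run(resume_step: Optional[CurrentStep] = None) -> List[str]:
    """Hypothetical helper: list the checkpoints that would execute."""
    return [
        step.value
        for step in EXECUTION_CHECKPOINTS
        if can_run_checkpoint(step, resume_step)
    ]
```

Under these assumptions, resuming from `erasure_email_post_send` skips the access and erasure steps and runs only the email send and whatever follows it.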

pattisdr · 2022-09-02T21:43:05Z

src/fidesops/ops/service/connectors/email_connector.py

+        config = EmailSchema(**self.configuration.secrets or {})
+        logger.info("Starting test connection to %s", self.configuration.key)
+
+        db = Session.object_session(self.configuration)
+
+        try:
+            dispatch_email(
+                db=db,
+                action_type=EmailActionType.EMAIL_ERASURE_REQUEST_FULFILLMENT,
+                to_email=config.test_email,
+                email_body_params={
+                    "test_collection": CheckpointActionRequired(
+                        step=CurrentStep.erasure,
+                        collection=CollectionAddress("test_dataset", "test_collection"),
+                        action_needed=[
+                            ManualAction(
+                                locators={"id": ["example_id"]},
+                                get=None,
+                                update={
+                                    "test_field": "null_rewrite",
+                                },
+                            )
+                        ],
+                    )
+                },
+            )
+        except EmailDispatchException as exc:
+            logger.info("Email connector test failed with exception %s", Pii(exc))
+            return ConnectionTestStatus.failed
+        return ConnectionTestStatus.succeeded


If a test_email is configured (you'd set up a test_email that you had access to), you can use this to send a test email to yourself. The data inside is just dummy data. It really just confirms your email config is working.

pattisdr · 2022-09-02T21:45:02Z

src/fidesops/ops/service/privacy_request/request_runner_service.py

+        # Send erasure requests via email to third parties where applicable
+        if can_run_checkpoint(
+            request_checkpoint=CurrentStep.erasure_email_post_send,
+            from_checkpoint=resume_step,
+        ):
+            try:
+                email_connector_erasure_send(
+                    db=session, privacy_request=privacy_request
+                )
+            except EmailDispatchException as exc:
+                privacy_request.cache_failed_checkpoint_details(
+                    step=CurrentStep.erasure_email_post_send, collection=None
+                )
+                privacy_request.error_processing(db=session)
+                await fideslog_graph_failure(
+                    failed_graph_analytics_event(privacy_request, exc)
+                )
+                # If dev mode, log traceback
+                _log_exception(exc, config.dev_mode)
+                return
+


This is the primary change of this PR: if applicable, we combine cached data from all the collections on an email-based dataset and send a single email for each dataset. There's no batching currently. If there is no data to mask / the connection was read-only, etc. no email is sent.
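The grouping rule can be sketched roughly as follows. This is not the code in this PR: the cache shape (a mapping from `"dataset:collection"` to the field updates, or `None` when nothing needs masking) and the names `group_by_dataset` and `datasets_to_email` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Optional

# Assumed cache shape: "dataset:collection" -> {field: masking strategy} or None.
CachedUpdates = Dict[str, Optional[Dict[str, str]]]


def group_by_dataset(
    cached: CachedUpdates,
) -> Dict[str, Dict[str, Optional[Dict[str, str]]]]:
    """Combine per-collection cache entries under their dataset."""
    grouped: Dict[str, Dict[str, Optional[Dict[str, str]]]] = defaultdict(dict)
    for address, updates in cached.items():
        dataset, collection = address.split(":", 1)
        grouped[dataset][collection] = updates
    return dict(grouped)


def datasets_to_email(
    cached: CachedUpdates, read_only: FrozenSet[str] = frozenset()
) -> List[str]:
    """Return the datasets that should each receive one email."""
    recipients = []
    for dataset, collections in group_by_dataset(cached).items():
        if dataset in read_only:
            continue  # read-only connection: never send
        if not any(collections.values()):
            continue  # no collection on this dataset has fields to mask
        recipients.append(dataset)
    return recipients
```

Note that a collection with no updates is still kept in its dataset's group, since it may be needed as a locator for another collection on the same dataset.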

pattisdr · 2022-09-02T21:47:23Z

src/fidesops/ops/task/graph_task.py

+                except PrivacyRequestErasureEmailSendRequired as exc:
+                    self.log_end(action_type, ex=None, success_override_msg=exc)
+                    self.resources.cache_erasure(
+                        f"{self.traversal_node.address.value}", 0
+                    )  # Cache that the erasure was performed in case we need to restart
+                    return 0


Following a pattern we use elsewhere - I raise an Exception in the EmailConnector.mask_data to update the execution log as it is a concern of the GraphTask and not the individual EmailConnectors. In this case, I want to add a more specific message to the erasure ExecutionLog as the action is not technically completed until we send the email at the end.
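A stripped-down sketch of this pattern, assuming a simplified `mask_data` signature and using a plain list in place of the ExecutionLog. `EmailConnectorSketch` and `run_erasure` are hypothetical names; only the exception name comes from the diff.

```python
from typing import Any, Dict, List


class PrivacyRequestErasureEmailSendRequired(Exception):
    """Signals that the erasure is only prepared; the email is sent later."""


class EmailConnectorSketch:
    """Hypothetical connector: it never masks rows itself."""

    def mask_data(self, rows: List[Dict[str, Any]]) -> int:
        raise PrivacyRequestErasureEmailSendRequired("email prepared to send")


def run_erasure(connector: Any, rows: List[Dict[str, Any]], log: List[str]) -> int:
    """The task, not the connector, owns the log entry. The exception only
    carries a more specific success message back up to the task."""
    try:
        count = connector.mask_data(rows)
    except PrivacyRequestErasureEmailSendRequired as exc:
        log.append(f"erasure complete: {exc}")
        return 0  # no rows were masked directly
    log.append(f"erasure complete: masked {count} rows")
    return count
```

The design choice here is that the connector stays unaware of logging: it raises, and the caller decides how to record the outcome.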

pattisdr · 2022-09-02T21:54:05Z

src/fidesops/ops/service/privacy_request/request_runner_service.py

@@ -102,6 +111,7 @@ def run_webhooks_and_report_status(
                webhook.key,
            )
            privacy_request.error_processing(db)
+            privacy_request.cache_failed_checkpoint_details(current_step)


With our new request execution checkpoints, this lets us resume from the pre- or post-execution webhooks step later if they fail. We used to be able to resume from a pause, but had no handling to resume from a failure.
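To make the cache-then-resume flow concrete, here is a small sketch under stated assumptions: a dict stands in for the privacy request cache (the real code calls `cache_failed_checkpoint_details`), the step names and their order are assumed, and `run_from_checkpoint` is a hypothetical function, not the runner in this PR.

```python
from typing import Callable, Dict, List, Optional

# Assumed checkpoint order, for illustration only.
STEPS = [
    "pre_webhooks",
    "access",
    "erasure",
    "erasure_email_post_send",
    "post_webhooks",
]


def run_from_checkpoint(
    handlers: Dict[str, Callable[[], None]],
    cache: Dict[str, str],
    resume_step: Optional[str] = None,
) -> List[str]:
    """Run each step from the resume point onward. On failure, record the
    failed step so a later retry can start from exactly that step."""
    start = STEPS.index(resume_step) if resume_step else 0
    completed = []
    for step in STEPS[start:]:
        try:
            handlers[step]()
        except RuntimeError:
            cache["failed_checkpoint"] = step
            return completed
        completed.append(step)
    cache.pop("failed_checkpoint", None)  # clean run: nothing left to resume
    return completed
```

A retry would read `cache["failed_checkpoint"]` and pass it back in as `resume_step`, so earlier steps are not repeated.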

conceptualshark · 2022-09-06T18:01:27Z

@pattisdr I ran through and combined the two email guides and updated them to match the formatting - let me know if this works for you?

pattisdr · 2022-09-06T18:35:41Z

docs/fidesops/docs/guides/email_communications.md

- Subject Identity Verification - for more information on identity verification in subject requests, see the [Privacy Requests](privacy_requests.md#subject-identity-verification) guide.
- Erasure Request Email Fulfillment - sends an email to configured third parties to process erasures for a given data subject.  See [Email Connectors](email_connectors.md) for more information.
+- Subject Identity Verification - sends a verification code to the user's email address prior to  for more information on identity verification in subject requests, see the [Privacy Requests](privacy_requests.md#subject-identity-verification) guide.
+- Erasure Request Email Fulfillment - sends an email to configured third parties to process erasures for a given data subject.  See [creating Email Connectors](#create-an-email-connector) for more information.


@conceptualshark I don't think this goes anywhere

pattisdr · 2022-09-06T18:41:28Z

docs/fidesops/docs/guides/email_communications.md


 Supported modes of use:

- Subject Identity Verification - for more information on identity verification in subject requests, see the [Privacy Requests](privacy_requests.md#subject-identity-verification) guide.
- Erasure Request Email Fulfillment - sends an email to configured third parties to process erasures for a given data subject.  See [Email Connectors](email_connectors.md) for more information.
+- Subject Identity Verification - sends a verification code to the user's email address prior to  for more information on identity verification in subject requests, see the [Privacy Requests](privacy_requests.md#subject-identity-verification) guide.


@conceptualshark Sentence doesn't seem complete... "prior to executing a privacy request" or similar?

pattisdr · 2022-09-06T18:45:50Z

docs/fidesops/docs/guides/email_communications.md

+| Field | Description |
+|----|----|
+| `key` | A unique key used to manage your email connector. This is auto-generated from `name` if left blank. Accepted values are alphanumeric, `_`, and `.`. |
+| `name` | A unique user-friendly name for your email connector. |
+| `connection_type` | Must be `email` to create a new email connector. |
+| `access` | Email connectors must be given `write` access in order to send an email. |
+


Nice formatting here, will try to remember this for future docs drafts @conceptualshark

Thanks for catching these - ran through a bit fast 😨

eastandwestwind

Great work @pattisdr ! I've manually tested that:

1. The test email was successfully sent upon adding secrets to the email connector:

2. The actual erasure email was successfully sent upon executing an erasure request:

Nothing blocking, just minor syntax issue in docs.

Note for future - this is another feature that relies on an email config being set up, and it's currently possible for "email config not found" exceptions to happen at run time during a privacy request. To prevent this, we can have the Admin UI check for a valid email config at any point in the setup flow.

eastandwestwind · 2022-09-06T18:37:09Z

src/fidesops/ops/service/connectors/email_connector.py

+        db = Session.object_session(self.configuration)
+
+        try:
+            dispatch_email(


just keeping an eye on #1173 for possible conflicts depending on which is merged first

👍 thanks for the callout

eastandwestwind · 2022-09-06T18:44:28Z

src/fidesops/ops/models/privacy_request.py

@@ -380,14 +389,14 @@ def cache_email_connector_template_contents(

    def get_email_connector_template_contents_by_dataset(
        self, step: CurrentStep, dataset: str
-    ) -> Dict[str, Optional[CollectionActionRequired]]:
+    ) -> Dict[str, Optional[CheckpointActionRequired]]:


can we create a new class for Dict[str, Optional[CheckpointActionRequired]]? I'd prefer to be explicit on what the str represents here

sure thing, added a new type EmailRequestFulfillmentBodyParams that is created from template data we pull out of the cache by dataset

eastandwestwind · 2022-09-06T18:50:45Z

src/fidesops/ops/models/privacy_request.py

+def can_run_checkpoint(
+    request_checkpoint: CurrentStep, from_checkpoint: Optional[CurrentStep] = None
+) -> bool:
+    """Determine whether we should run a specific checkpoint in privacy request execution
+
+    If there's no from_checkpoint specified we should always run the current checkpoint.
+    """
+    if not from_checkpoint:
+        return True
+    return EXECUTION_CHECKPOINTS.index(
+        request_checkpoint
+    ) >= EXECUTION_CHECKPOINTS.index(from_checkpoint)


I like this abstraction. Makes it easier to understand as we scale the number of execution checkpoints 💯

eastandwestwind · 2022-09-06T19:24:40Z

docs/fidesops/docs/guides/email_communications.md

+```json title="<code>PUT api/v1/connection/{email_connection_config_key}/secret</code>" 
+{
+    "test_email": "[email protected]",
+    "to_email": "[email protected]


missing quote "

pattisdr · 2022-09-06T22:18:02Z

Comments addressed; back to you @eastandwestwind!

eastandwestwind · 2022-09-07T14:29:59Z

thanks for addressing the comments, looks great @pattisdr !

conceptualshark · 2022-09-07T14:46:13Z

@pattisdr I think the basic collection/field sections are probably fine as-is - could we add something more comprehensive to explain its purpose, and could we add in the org name to the template?

This is an automated email sent by {ORGANIZATION NAME} using Fides. Customers of {ORGANIZATION NAME} would like to exercise their data privacy right to be deleted.

Please locate and erase personally identifiable information for all data subjects and records listed below:

And then the list you already have generating, I think.

pattisdr · 2022-09-07T14:48:59Z

great improvements, thank you @conceptualshark, I'll add this in a follow-up ticket

* Send an email for each email-based dataset at the end of privacy request execution.
  - Add a migration to create a new audit log type. Create an audit log for the email send.
  - Throw an exception for email-based connectors and catch to override the default execution log.
  - Add a draft of an email template
  - Connect sending a "test email" with dummy data. A fidesops admin could configure to check their email config was working.
* Add more "checkpoints" to privacy request execution - these are locations from which we can resume privacy request execution without having to run from the beginning.
  - Add more options to CurrentStep Enum
  - Cache the checkpoint if an email send fails, so we can retry from the same step.
* Don't send an email if the connection config is read only or there are no updates to be applied to any of the collections on the dataset.
* Don't assume there's a collection when building "resume" details. A failed privacy request can be resumed outside of the traversal.
* Add a first draft of docs for setting up an email connector.
* Moves the email connector send method to the email connector file.
* Update mock location.
* Bump downrev.
* update email connector guides
* correct link, broken sentence
* Create a new EmailRequestFulfillmentBodyParams type to be used once the cached email details are extracted by dataset.
* Fix missed test.

Co-authored-by: Cole <[email protected]>

pattisdr added 2 commits September 1, 2022 13:18

pattisdr changed the title ~~[DRAFT] Email Connector: Send Email [#1158]~~ [DRAFT] Email Connector: Send Email with Erasure Instructions [#1158] Sep 1, 2022

Don't send an email if the connection config is read only or there ar…

1601293

…e no updates to be applied to any of the collections on the dataset.

seanpreston assigned eastandwestwind Sep 2, 2022

Don't assume there's a collection when building "resume" details. A f…

f539a01

…ailed privacy request can be resumed outside of the traversal.

Add a first draft of docs for setting up an email connector.

e1dc1fb

pattisdr changed the title ~~[DRAFT] Email Connector: Send Email with Erasure Instructions [#1158]~~ Email Connector: Send Email with Erasure Instructions [#1158] Sep 2, 2022

pattisdr added 4 commits September 2, 2022 16:04

Moves the email connector send method to the email connector file.

c09b4e7

Update mock location.

c8afe55

Merge remote-tracking branch 'ethyca/main' into fidesops_1158_email_c…

b10f6de

…onnector_email_send # Conflicts: # CHANGELOG.md

Bump downrev.

d9e169a

pattisdr commented Sep 2, 2022

pattisdr marked this pull request as ready for review September 2, 2022 22:01

update email connector guides

10ba182

pattisdr commented Sep 6, 2022

correct link, broken sentence

1e855d0

eastandwestwind reviewed Sep 6, 2022

pattisdr added 3 commits September 6, 2022 16:17

Create a new EmailRequestFulfillmentBodyParams type to be used once t…

d70d71a

…he cached email details are extracted by dataset.

Fix missed test.

9a20d02

Merge remote-tracking branch 'ethyca/main' into fidesops_1158_email_c…

02eb668

…onnector_email_send # Conflicts: # tests/ops/integration_tests/test_integration_email.py

eastandwestwind approved these changes Sep 7, 2022

eastandwestwind merged commit c49b426 into main Sep 7, 2022

eastandwestwind deleted the fidesops_1158_email_connector_email_send branch September 7, 2022 14:30

eastandwestwind mentioned this pull request Sep 7, 2022

[#1088] Adds new Celery queue for async email dispatch #1173

Merged

10 tasks

This was referenced Sep 7, 2022

Email Connector: Update Email Copy #1265

Closed

Update the Erasure Request Email Fulfillment template [#1265] #1270

Merged

mfbrown mentioned this pull request Dec 7, 2022

Add email connector option to the DSR connections UI. ethyca/fides#1906

Closed
