Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LA 83 Fix Salesforce Erasure Data Flow #5452

Merged
merged 14 commits into from
Nov 6, 2024

Conversation

Vagoasdf
Copy link
Contributor

@Vagoasdf Vagoasdf commented Nov 4, 2024

Closes LA#83

Pairs With fidesplus#1708

Description Of Changes

Started as a removal of the Account endpoint since that was data from the Business, not a person. Ended up creating custom overrides requests so the validation rules for the other endpoints were passing.

currently the Requests Overrides do two things

  • Checks if the field is an email. If it is, changes it to masked@company.com
  • Checks if the field is over 40 characters after masking. If it is, truncating it

Code Changes

  • Request overrides for the Salesforce endpoints created

Steps to Confirm

  • Create a new Business Account on Salesforces from the Admin Template
  • Create two contacts for that Business Account
  • Set up the Salesforces Integration
  • Mask one of the Contacts by Email
  • The Other Contact should not be masked
  • The Account should not be masked

imagen

Pre-Merge Checklist

  • All CI Pipelines Succeeded
  • Issue Requirements are Met
  • Update CHANGELOG.md

@Vagoasdf Vagoasdf requested a review from a team as a code owner November 4, 2024 15:09
Copy link

vercel bot commented Nov 4, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
fides-plus-nightly ⬜️ Ignored (Inspect) Visit Preview Nov 6, 2024 5:57pm

@Vagoasdf Vagoasdf changed the title LA 82 Fix Salesforce Erasure Data Flow LA 83 Fix Salesforce Erasure Data Flow Nov 4, 2024
Copy link

cypress bot commented Nov 4, 2024

fides    Run #10844

Run Properties:  status check passed Passed #10844  •  git commit 30b8c335ed ℹ️: Merge f03cec6daaa9ecd221b5771380106cf5a9c36887 into 0234347960163e2b413a7f6ba5cf...
Project fides
Branch Review refs/pull/5452/merge
Run status status check passed Passed #10844
Run duration 00m 39s
Commit git commit 30b8c335ed ℹ️: Merge f03cec6daaa9ecd221b5771380106cf5a9c36887 into 0234347960163e2b413a7f6ba5cf...
Committer Bruno Gutierrez Rios
View all properties for this run ↗︎

Test results
Tests that failed  Failures 0
Tests that were flaky  Flaky 0
Tests that did not run due to a developer annotating a test with .skip  Pending 0
Tests that did not run due to a failure in a mocha hook  Skipped 0
Tests that passed  Passing 4
View all changes introduced in this branch ↗︎

Copy link

codecov bot commented Nov 4, 2024

Codecov Report

Attention: Patch coverage is 29.62963% with 38 lines in your changes missing coverage. Please review.

Project coverage is 85.30%. Comparing base (0234347) to head (f03cec6).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...de_implementations/salesforce_request_overrides.py 29.62% 38 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5452      +/-   ##
==========================================
- Coverage   85.42%   85.30%   -0.13%     
==========================================
  Files         385      386       +1     
  Lines       24152    24206      +54     
  Branches     2630     2636       +6     
==========================================
+ Hits        20633    20649      +16     
- Misses       2966     3004      +38     
  Partials      553      553              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

CHANGELOG.md Outdated
@@ -21,6 +21,9 @@ The types of changes are:
- Added DataHub integration config [#5401](https://github.com/ethyca/fides/pull/5401)
- Added keepalive settings to the Redshift integration [#5433](https://github.com/ethyca/fides/pull/5433)

### Fixed
- Updating Salesforce erasure request with overrides so it properly passes validation. Removing Account endpoint since it does not represents PII [#5452](https://github.com/ethyca/fides/pull/5452)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Updating Salesforce erasure request with overrides so it properly passes validation. Removing Account endpoint since it does not represents PII [#5452](https://github.com/ethyca/fides/pull/5452)
- Updating Salesforce erasure request with overrides so it properly passes validation. Removing Account endpoint since it does not contain user data [#5452](https://github.com/ethyca/fides/pull/5452)

# using the privacy request id to have an unique id
def maskEmail(masked_object_fields: Dict, email_field: str) -> Dict:
if email_field in masked_object_fields:
masked_object_fields[email_field] = "Masked@company.com"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
masked_object_fields[email_field] = "Masked@company.com"
masked_object_fields[email_field] = "masked@company.com"

Comment on lines 18 to 26
def truncateFieldsTo40Characters(masked_object_fields: Dict) -> Dict:
for key in masked_object_fields:
logger.info(key)
logger.info((masked_object_fields[key]))
if not isinstance(masked_object_fields[key], str):
continue
if len(masked_object_fields[key]) > 40:
masked_object_fields[key] = masked_object_fields[key][:40]
return masked_object_fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simplify this a bit and only log a message when we need to truncate. The comment you added is helpful but it would make sense to add it as a doc string. Finally, make sure to use snake case since that's the convention for Python.

Suggested change
def truncateFieldsTo40Characters(masked_object_fields: Dict) -> Dict:
for key in masked_object_fields:
logger.info(key)
logger.info((masked_object_fields[key]))
if not isinstance(masked_object_fields[key], str):
continue
if len(masked_object_fields[key]) > 40:
masked_object_fields[key] = masked_object_fields[key][:40]
return masked_object_fields
def truncate_fields(masked_object_fields: Dict) -> Dict:
"""
Check if the masked field is over 40 characters long, if so truncate it to 40 characters.
"""
for key in masked_object_fields:
value = masked_object_fields[key]
if isinstance(value, str) and len(value) > 40:
logger.info("Truncating {key} field to 40 characters")
masked_object_fields[key] = value[:40]
return masked_object_fields

Comment on lines 29 to 31
# Masking Email properly so it does not breaks validation rules
# using the privacy request id to have an unique id
def maskEmail(masked_object_fields: Dict, email_field: str) -> Dict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Suggested change
# Masking Email properly so it does not breaks validation rules
# using the privacy request id to have an unique id
def maskEmail(masked_object_fields: Dict, email_field: str) -> Dict:
def mask_email(masked_object_fields: Dict, email_field: str) -> Dict:
"""Masking email fields properly so they don't break validation rules."""

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we have these tests flagged as @pytest.mark.skip(reason="Currently unable to test OAuth2 connectors") but can you try the following:

  1. Run the tests by hard-coding the access_token in your local saas_config.toml?
  2. Use the erasure_policy_string_rewrite_name_and_email policy fixture so we can see the email being masked.
  3. Use the new ConnectorRunner pattern for the task tests
  4. Verify we have multiple users in Salesforce to verify we only return results for one user
  5. Verify that multiple entities can have the same email (the masked@company email we're using)

@Vagoasdf
Copy link
Contributor Author

Vagoasdf commented Nov 5, 2024

By the looks of the Campaings Member API docs, it looks like the Campaing Members represents a relationship between Leads and Contacts, and as such, their fields should be updated on their respective endpoints. We already have some fields on the campaign_members dataset collection as read_only, and those that are not, are throwing an 400 error as we are not able to update them.

removing the Update endpoint on campaign member. We still want to read it, even if it's a bit of duplicate data, because its relevant to see if its on a campaign, but it should not be updated at all. As we can see on the Screenshots below, updating the Contact and Lead would update the campaing member view correctly

Pasted image 20241106101019

Pasted image 20241106103615

Campaign Members are a relationship between a campaign and a lead or a contact. Updating  the latter two would update user data
@Vagoasdf Vagoasdf requested a review from galvana November 6, 2024 16:04


@pytest.mark.skip(reason="Currently unable to test OAuth2 connectors")
#@pytest.mark.skip(reason="Currently unable to test OAuth2 connectors")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#@pytest.mark.skip(reason="Currently unable to test OAuth2 connectors")
@pytest.mark.skip(reason="Currently unable to test OAuth2 connectors")

@Vagoasdf Vagoasdf merged commit 0e36182 into main Nov 6, 2024
16 of 18 checks passed
@Vagoasdf Vagoasdf deleted the LA-82-Remove-salesforce-account-endpint branch November 6, 2024 18:12
Copy link

cypress bot commented Nov 6, 2024

fides    Run #10846

Run Properties:  status check passed Passed #10846  •  git commit 0e36182fd8: LA 83 Fix Salesforce Erasure Data Flow (#5452)
Project fides
Branch Review main
Run status status check passed Passed #10846
Run duration 00m 38s
Commit git commit 0e36182fd8: LA 83 Fix Salesforce Erasure Data Flow (#5452)
Committer Bruno Gutierrez Rios
View all properties for this run ↗︎

Test results
Tests that failed  Failures 0
Tests that were flaky  Flaky 0
Tests that did not run due to a developer annotating a test with .skip  Pending 0
Tests that did not run due to a failure in a mocha hook  Skipped 0
Tests that passed  Passing 4
View all changes introduced in this branch ↗︎

@Vagoasdf Vagoasdf restored the LA-82-Remove-salesforce-account-endpint branch November 6, 2024 18:35
@Vagoasdf Vagoasdf mentioned this pull request Nov 6, 2024
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants