Bigquery Row-Level Deletes + Erase After on Database Collections #5293
Conversation
fides Run #10122

| Run property | Value |
| --- | --- |
| Project | fides |
| Branch Review | refs/pull/5293/merge |
| Run status | Passed #10122 |
| Run duration | 00m 39s |
| Commit | d8b0787ae5: Merge 992f9e9e989dc9d7a892b485c818a4b7c073b5c0 into 155e1fd8bcf545599af273b06a62... |
| Committer | Dawn Pattison |

| Test results | Count |
| --- | --- |
| Failures | 0 |
| Flaky | 0 |
| Pending | 0 |
| Skipped | 0 |
| Passing | 4 |
… bigquery

- Seed existing bigquery table with employee record
- Updated employee dataset reference direction
- Confirm employee record exists before running DSR and confirm it's been removed afterwards (a hedged sketch of this flow follows)
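For context on those commit steps, here is a hedged, self-contained sketch of the seed-then-verify flow, with an in-memory SQLite table standing in for BigQuery purely for illustration (SQLAlchemy 1.4+ assumed; the `employee` table and values are hypothetical):

```python
# Hedged sketch of the commit's test flow: seed a row, run a row-level
# DELETE keyed on the primary key, and verify the row is gone.
# SQLite stands in for BigQuery here purely for illustration.
from sqlalchemy import (
    Column, Integer, MetaData, String, Table, create_engine, select,
)

engine = create_engine("sqlite://")
metadata = MetaData()
employee = Table(
    "employee",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("email", String),
)
metadata.create_all(engine)

with engine.begin() as conn:
    # Seed the table with an employee record.
    conn.execute(employee.insert().values(id=1, email="employee@example.com"))
    # Confirm the record exists before the erasure request runs.
    assert conn.execute(select(employee).where(employee.c.id == 1)).first()
    # Row-level delete keyed on the primary key.
    conn.execute(employee.delete().where(employee.c.id == 1))
    # Confirm the record has been removed afterwards.
    assert conn.execute(select(employee).where(employee.c.id == 1)).first() is None
```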
```python
def generate_delete(self, row: Row, client: Engine) -> Optional[Delete]:
    """Returns a SQLAlchemy DELETE statement for BigQuery. Does not actually execute the delete statement.

    Used when a collection-level masking override is present and the masking strategy is DELETE.
    """
    non_empty_primary_keys: Dict[str, Field] = filter_nonempty_values(
        {
            fpath.string_path: fld.cast(row[fpath.string_path])
            for fpath, fld in self.primary_key_field_paths.items()
            if fpath.string_path in row
        }
    )

    valid = len(non_empty_primary_keys) > 0
    if not valid:
        logger.warning(
            "There is not enough data to generate a valid DELETE statement for {}",
            self.node.address,
        )
        return None

    table = Table(
        self.node.address.collection, MetaData(bind=client), autoload=True
    )
    pk_clauses: List[ColumnElement] = [
        getattr(table.c, k) == v for k, v in non_empty_primary_keys.items()
    ]
    return table.delete().where(*pk_clauses)
```
New method, for BigQuery only, that generates a DELETE statement for the relevant primary keys.
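To make this concrete, here is a minimal sketch (not code from the PR) of how a primary-key DELETE built this way renders as SQL with SQLAlchemy Core; it assumes SQLAlchemy 1.4+ and uses a hypothetical `employee` table and key value:

```python
# Minimal sketch (not from the PR) of how a primary-key DELETE like the one
# generate_delete builds renders as SQL with SQLAlchemy Core (1.4+ assumed).
from sqlalchemy import Column, Integer, MetaData, String, Table

metadata = MetaData()
employee = Table(
    "employee",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("email", String),
)

# Mirror the pk_clauses construction for a single non-empty primary key.
non_empty_primary_keys = {"id": 42}
pk_clauses = [
    getattr(employee.c, k) == v for k, v in non_empty_primary_keys.items()
]
stmt = employee.delete().where(*pk_clauses)

# Prints roughly: DELETE FROM employee WHERE employee.id = :id_1
print(stmt)
```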
Lots of failing tests. On first glance a lot aren't related, but let me go through and double-check them; there are increased failures in SaaS connector tests. EDIT: the SaaS connector failures were flaky tests.
Nice work! Tested manually and it looks good.
# Conflicts: # CHANGELOG.md
Merging; I believe the failing tests are also failing on main.
fides Run #10128

| Run property | Value |
| --- | --- |
| Project | fides |
| Branch Review | main |
| Run status | Passed #10128 |
| Run duration | 00m 38s |
| Commit | 0e2db7c28d: Bigquery Row-Level Deletes + Erase After on Database Collections (#5293) |
| Committer | Dawn Pattison |

| Test results | Count |
| --- | --- |
| Failures | 0 |
| Flaky | 0 |
| Pending | 0 |
| Skipped | 0 |
| Passing | 4 |
Closes #PROD-2744
Closes #PROD-2745
Needs fideslang ethyca/fideslang#17
Description Of Changes

Allows `erase_after` to work for database connectors, not just SaaS connectors.

Code Changes

- Allows `erase_after` functionality to be incorporated into the graph if specified for a dataset connector, not just a SaaS connector. This lets customers specify that a specific node should run after another if necessary, since there is no intelligent ordering for erasure deletes (a minimal ordering sketch follows below).
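Since `erase_after` only adds ordering edges to the graph, here is a minimal sketch (not fides code) of how such edges can be resolved into an erasure order with Python's standard-library topological sorter; the collection names are hypothetical:

```python
# Hedged sketch: resolving "erase_after" edges into an execution order.
# erase_after means "erase me only after these collections", so each
# collection depends on the collections it is erased after.
from graphlib import TopologicalSorter  # Python 3.9+

erase_after = {
    "bigquery.employee": {"bigquery.address"},  # employee waits for address
    "bigquery.address": set(),
}

order = list(TopologicalSorter(erase_after).static_order())
print(order)  # ['bigquery.address', 'bigquery.employee']
```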
Caution

- There is no intelligent ordering for erasure deletes beyond the `erase_after` functionality
- `masking_strategy_overrides` at the collection level are ignored for all but BigQuery integrations

Steps to Confirm
Pre-Merge Checklist

- CHANGELOG.md updated
- Migration is up to date with main
- downgrade() migration is correct and works