-
Notifications
You must be signed in to change notification settings - Fork 960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Adding ability to include conditions along with node and relation types in the neo4j staleness removal task #1464
Conversation
…es in the neo4j staleness removal task Signed-off-by: Kristen Armes <karmes@lyft.com>
Congratulations on your first Pull Request and welcome to Amundsen community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/amundsen-io/amundsen/blob/main/CONTRIBUTING.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. Pretty much all my comments are nits.
Signed-off-by: Kristen Armes <karmes@lyft.com>
@@ -64,6 +70,32 @@ class Neo4jStalenessRemovalTask(Task): | |||
|
|||
""" | |||
|
|||
delete_stale_nodes_statement = textwrap.dedent(""" | |||
MATCH (target:{{type}}) | |||
{staleness_condition}{{extra_condition}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an we make staleness_condition one of {{
as well so it has to be named when formatting. Just makes things more uniform IMO. Same points about the other statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed about the uniformity but the reason I put the single brace around staleness_condition
(and the reason it was single braces before it was named) is bc each time .format
is called on the statement it removes one set of curly braces, so since we have the _decorate_staleness
function that handles populating the staleness condition first, and then after that the target type
and extra_condition
are left with a single {}
and are populated in each of the other functions after. Do you think this is alright, or an alternative would be to bring all the formatting into the _decorate_staleness
function if that is preferable?
extra_condition=stale_extra_condition), | ||
param_dict={MARKER_VAR_NAME: self.marker}) | ||
|
||
total_record_value = next(iter(total_records), None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you just return the count from the statement instead of the tuple, i believe you won't have to do this weird thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _execute_cypher_query
function returns a Iterable[Dict[str, Any]]
type which is the reason for this, the same next iter thing is being done in _batch_delete
when only a count is returned (although I didn't realize it when I wrote this part)
Signed-off-by: Kristen Armes <karmes@lyft.com>
Signed-off-by: Kristen Armes <karmes@lyft.com>
Signed-off-by: Kristen Armes <karmes@lyft.com>
Awesome work, congrats on your first merged pull request! |
…on types in the neo4j staleness removal task (amundsen-io#1464) * Adding ability to include conditions along with node and relation types in the neo4j staleness removal task Signed-off-by: Kristen Armes <karmes@lyft.com> * Addressing PR comments Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing up query templates Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing type issue in test Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing formatting requirements Signed-off-by: Kristen Armes <karmes@lyft.com> Signed-off-by: Ozan Dogrultan <ozan.dogrultan@deliveryhero.com>
…on types in the neo4j staleness removal task (amundsen-io#1464) * Adding ability to include conditions along with node and relation types in the neo4j staleness removal task Signed-off-by: Kristen Armes <karmes@lyft.com> * Addressing PR comments Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing up query templates Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing type issue in test Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing formatting requirements Signed-off-by: Kristen Armes <karmes@lyft.com>
…on types in the neo4j staleness removal task (amundsen-io#1464) * Adding ability to include conditions along with node and relation types in the neo4j staleness removal task Signed-off-by: Kristen Armes <karmes@lyft.com> * Addressing PR comments Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing up query templates Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing type issue in test Signed-off-by: Kristen Armes <karmes@lyft.com> * Fixing formatting requirements Signed-off-by: Kristen Armes <karmes@lyft.com>
Signed-off-by: Kristen Armes karmes@lyft.com
Summary of Changes
Currently, the neo4j staleness removal task can only remove stale nodes and/or relations based on specified types. This new functionality allows users to specify a condition along with the type. To do this, you can specify the inputs to be a list of
TargetWithCondition
objects that each define a target type and a condition. Only stale nodes or relations of that type and that meet the condition will be removed when using this type of input.For example, you could add a condition that you only want to remove stale nodes of type Table that have some kind of relation to a node of type Column, rather than deleting all stale Table nodes.
This change is backwards compatible so users may continue to only specify the type without conditions for deleting stale nodes or relations.
Tests
The tests have been updated in test_neo4j_staleness_removal_task.py along with the addition of two new tests: test_validation_statement_with_target_condition and test_delete_statement_with_target_condition. These test the validation and deletion of nodes and relations when target conditions are included.
Documentation
A description of the feature has been added to databuilder/README.md to describe what the feature does and how to use it.
CheckList
Make sure you have checked all steps below to ensure a timely review.