Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add setting to allow turning of persistence for tasks globally #15881

Merged
merged 4 commits into from
Oct 31, 2024

Conversation

desertaxle
Copy link
Member

Adds a setting tasks.default_persist_result that controls the default value for persist_result for tasks only. This will allow users to globally turn off result persistence for tasks only. When this setting is enabled, it overrides the normal inheritance of result persistence from parent runs.

Closes #15401

Copy link

codspeed-hq bot commented Oct 31, 2024

CodSpeed Performance Report

Merging #15881 will not alter performance

Comparing default-for-flows-or-tasks (360f8c0) with main (525d842)

Summary

✅ 3 untouched benchmarks

Copy link
Member

@cicdw cicdw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions more for understanding than anything else

@@ -209,7 +219,7 @@ def should_persist_result() -> bool:
if flow_run_context is not None:
return flow_run_context.persist_result

return PREFECT_RESULTS_PERSIST_BY_DEFAULT.value()
return get_default_persist_setting()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this utility will return True if the flow run has persistence turned on, regardless of the task-specific setting; what am I missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will, but we check the new setting before calling this when setting up the task run context like this:

if self.task.persist_result is not None:
    persist_result = self.task.persist_result
elif settings.tasks.default_persist_result is not None:
    persist_result = settings.tasks.default_persist_result
else:
    persist_result = should_persist_result()

Another implementation I considered was to include that check in this util, but the utility would need to know if a task was the one calling the function via a kwarg. This question makes me think that might be a better implementation so that all the determination logic is collocated and easier to find.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that could help is maybe renaming the utility to something like fallback_default or something to suggest it isn't necessarily the source of truth. Minor but I think that's one piece of why I got confused (also the test, which is using this utility as if it is the source of truth for the setting which makes semantic sense given its name)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tricky part is that should_persist_result is the source of truth for all cases except when we're about to set up a TaskRunContext, which is where we need a small escape hatch. Let me see what I can do to clean this up.


persist_result = foo()

assert persist_result is False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the value of should_persist_result actually equivalent to whether the engine will persist the result? the conditional logic makes me think otherwise

Copy link
Member

@cicdw cicdw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can probably centralize / rename a few of these utilities, but in the meantime the test coverage allows us to take that up as a lower priority later 👍

@desertaxle desertaxle merged commit 0be914a into main Oct 31, 2024
36 checks passed
@desertaxle desertaxle deleted the default-for-flows-or-tasks branch October 31, 2024 18:44
frances720 pushed a commit to Promptless/prefect-test that referenced this pull request Nov 5, 2024
frances720 pushed a commit to Promptless/prefect-test that referenced this pull request Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs enhancement An improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow specifying default result persistence separately for tasks and flows
2 participants