This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Model freshness should be a first-class citizen, like source freshness #3862
Comments
@joellabes Thanks for the thorough writeup! I think there are two different issues at play here:
I think model "freshness" needs to be a concept synthesized out of two discrete things:
Put another way, the latency of
This feels conceptually similar to (if also the inverse of) the Some databases offer this metadata natively (e.g. Snowflake offers |
Describe the feature
We have a model which is built on top of ~50 source tables, each representing a single month of usage. Instead of adding each of those input tables as sources, we use the `dbt_utils.get_relations_by_pattern` and `dbt_utils.union_relations` macros. This means that the model springs out fully formed instead of having source lineage.
The best/only way to do freshness checks on a model is a `dbt_utils.recency` test, but that can be swallowed up in the background noise of daily errors and isn't treated as a freshness issue in metadata tiles. We just found that Fivetran hadn't been syncing the September table, a week in 😬. If it had been a freshness issue, we would have been all over it much earlier.
We could define `utcs_base_unioned` as a source and put freshness expectations on it, but it would lead to a funny looking DAG (to say nothing of the fact that each user has their own copy of that model, unlike sources which are a shared resource).
Instead, what I want to be able to do is define freshness rules for a model that behave in the same way as sources' freshness. I doubt they'd be used outside of dynamically generated models like this, but it's a use case that slips through the cracks right now.
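
To make "behave in the same way as sources' freshness" concrete, here is a minimal sketch of the pass/warn/error decision, assuming the model exposes a maximum loaded-at timestamp and takes `warn_after` / `error_after` rules shaped like the ones sources use (`count` plus `period`). The function names and the example dates are hypothetical; this is an illustration of the idea, not dbt's implementation.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Periods accepted in this sketch, mirroring the count/period shape of
# source freshness rules. The mapping itself is illustrative.
PERIODS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def threshold(rule: Optional[dict]) -> Optional[timedelta]:
    """Convert a rule like {"count": 12, "period": "hour"} into a timedelta."""
    if rule is None:
        return None
    return rule["count"] * PERIODS[rule["period"]]


def freshness_status(
    max_loaded_at: datetime,
    now: datetime,
    warn_after: Optional[dict] = None,
    error_after: Optional[dict] = None,
) -> str:
    """Return "pass", "warn", or "error" based on the age of the newest row."""
    age = now - max_loaded_at
    error_limit = threshold(error_after)
    warn_limit = threshold(warn_after)
    # Check the stricter outcome first so an error is never reported as a warning.
    if error_limit is not None and age > error_limit:
        return "error"
    if warn_limit is not None and age > warn_limit:
        return "warn"
    return "pass"


# Made-up dates: the newest row is a week old, as in the incident described above.
status = freshness_status(
    max_loaded_at=datetime(2021, 9, 1, tzinfo=timezone.utc),
    now=datetime(2021, 9, 8, tzinfo=timezone.utc),
    warn_after={"count": 1, "period": "day"},
    error_after={"count": 2, "period": "day"},
)
print(status)  # error
```

The point of the sketch is that a week-old unioned model would surface as a freshness error rather than as one more failing test among the daily noise.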
Describe alternatives you've considered
As above
Additional context
I know you just rationalised the name of the task in #3554. It'd be confusing to have non-sources handled under the source command, but idk what to do instead.
Who will this benefit?
`Data freshness passed` message when their data is actually a week stale
Are you interested in contributing this feature?
Maybe?