-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Metric Helper Functions #5607
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
This looks really good, and the names look fine to me. I assume that a test of these functions would be special metric SQL? |
Hmmm. Honestly a test of this functionality could probably be compiling a few metric and ensuring that the results returned from these functions matches the output we expect. Let me see if I can take a stab of this in a pytest |
@gshank after some help from @ChenyuLInx around parsing classes, we were able to write a test that ensures the helper functions return what we would expect to return. If you give me the green light I'll add the changelog information and ensure that it passes all tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! You'll have to run pre-commit on it to fix the whitespace and black formatting.
@gshank I've run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
I have one question but not blocking: with this getting in, I am assuming we are getting rid of some logic in Macros? should we do it also in this PR?
@ChenyuLInx once this merges I'll create a new version of dbt_metrics that builds off these helper functions. The benefit is that there isn't any functionality loss in this that would introduce downtime - I'd just be removing functionality from dbt_metrics and pulling the same logic from elsewhere. So its not time sensitive |
* Adding helper functions * adding bad test * use ResolvedMetricReference * adding pytest * adding changie updates * adding pre-commit changes Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* Adding helper functions * adding bad test * use ResolvedMetricReference * adding pytest * adding changie updates * adding pre-commit changes Co-authored-by: Chenyu Li <chenyu.li@dbtlabs.com>
* Add docstrings to `contracts/graph/metrics.py` functions to document what they do Used [#5607](#5607) for context on what the functions should do. * Add typing to `reverse_dag_parsing` and update function to work on 1.6+ metrics * Add typing to `parent_metrics` and `parent_metrics_names` * Add typing to `base_metric_dependency` and `derived_metric_dependency` and update functions to work on 1.6+ metrics * Simplify implementations of `basic_metric_dependency` and `derived_metric_dependnecy` * Add typing to `ResolvedMetricReference` initialization * Add typing to `derived_metric_dependency_graph` * Simplify conditional controls in `ResolvedMetricReference` functions The functions in `ResolvedMetricReference` use `manifest.metric.get(...)` which will only return either a `Metric` or `None`, never a different node type. Thus we don't need to check that the returned metric is a metric. * Don't recurse on over `depends_on` for non-derived metrics in `reverse_dag_parsing` The function `reverse_dag_parsing` only cares about derived metrics, that is metrics that depend on other metrics. Metrics only depend on other metrics if they are one of the `DERIVED_METRICS` types. Thus doing a recursive call to `reverse_dag_parsing` for non `DERIVED_METRICS` types is unnecessary. Previously we were iterating over a metric's `depends_on` property regardless of whether the metric was a `DERIVED_METRICS` type. Now we only do this work if the metric is of a `DERIVED_METRICS` type. * Simplify `parent_metrics_names` by having it call `parent_metrics` * Unskip `TestMetricHelperFunctions.test_derived_metric` and update fixture setup * Add changie doc for metric helper function updates * Get manifest in `test_derived_metric` from the parse dbt_run invocation * Remove `Relation` a intiatlization attribute for `ResolvedMetricReference` * Add return typing to class `__` functions of `ResolvedMetricReference` * Move from `manifest.metrics.get` to `manifest.expect` in metric helpers Previously with `manifest.metrics.get` we were just skipping when `None` was returned. Getting `None` back was expected in that `parent_unique_id`s that didn't belong to metrics should return `None` when calling `manifest.metrics.get`, and these are fine to skip. However, there's an edgecase where a `parent_unique_id` is supposed to be a metric, but isn't found, thus returning `None`. How likely this edge case could get hit, I'm not sure, but it's a possible edge case. Using `manifest.metrics.get` it we can't actually tell if we're in the edge case or not. By moving to `manifest.expect` we get the error handling built in, and the only trade off is that we need to change our conditional to skip returned nodes that aren't metrics.
Resolves #5567
Description
This PR adds helper functions to
ResolvedMetricReference
to remove DAG parsing fromdbt_metrics
that we implemented in Jinja 🤮 . It is slightly different from the functions named in the issue in that the naming convention was slightly changed to match what appeared to be the format of making helper functions sound like attributes of the node.The functions are:
parent_metric_names
: returns a list of the all metrics upstreamreverse_dag_parsing
: function used in other helper function that reverse iterates through the dag and returns a list of dictionaries with the expression metric name and their depth. If an expression metric is built on top of two others, the list would have 3 values returned, where the metric in question has value 1 then +1 for each upstream metric.full_metric_dependency
: returns a unique list of all metrics upstream by converting the result ofparent_metric_names
to a set and then back to a list.base_metric_dependency
: returns a unique list of all non-expression metrics upstreamderived_metric_dependency
: returns a unique list of all expression metrics upstreamderived_metric_dependency_depth
: returns a list of dictionaries by building off ofreverse_dag_parsing
and setting the base value at 1.Checklist
changie new
to create a changelog entryI have not added tests or run changie new as these are draft changes.