Jmcneill/expression is true tweak #507
I accidentally based this on main rather than next/minor. Happy to change this if needed. I guess that is the reason for a7f4f51 being included in the diff.
Adding notes for changes in comments
@@ -13,7 +13,7 @@ models:
tests:
- dbt_utils.at_least_one

- name: data_test_expression_is_true
- name: data_test_expression_is_true_1
Migrate old seed to new one
Could you name the seeds with a bit more clarity? Perhaps data_test_expression_is_true (i.e. staying as-is) and data_test_expression_is_true_window_functions or something?
Happy to. If this is something that you don't like, this pattern of seed (*_1.csv, *_2.csv etc...) is present in a few other test files FYI
{%- else %}
{{ column_name }} {{ expression }}
{%- endif %}
as _test_expression_passed
The one thing I'm a bit nervous about here is a field name collision.
A possible solution is to grab the column names from the given model and throw a compiler error if _test_expression_passed is one of the field names. Another is to not return *, but that means a lot of pain for people trying to fix tests that fail.
However, that seems a bit contrived to me. Another possibility is to just make the field name more constructed (i.e. maybe encode the expression in it, etc.). Would like to hear people's thoughts :)
I decided to catch this exception in: 68dc341.
Let me know if it's a bit expensive to use get columns in relation to catch this error, or if people should be left to fend for themselves when seeing "ambiguous column name" errors.
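To make the collision check discussed above concrete, here is a minimal sketch in Python with SQLite standing in for the warehouse. It is not the code from 68dc341. The helper names `column_names` and `assert_no_collision`, the two example tables, and the use of `pragma table_info` are all assumptions for illustration. In the macro itself the column lookup would presumably go through dbt's `adapter.get_columns_in_relation`.

```python
import sqlite3

# Column name the test appends to its output (taken from the diff above).
RESULT_COLUMN = "_test_expression_passed"


def column_names(conn, table):
    """Return the column names of `table` (hypothetical helper)."""
    # pragma table_info yields one row per column; the name is at index 1.
    return [row[1] for row in conn.execute(f"pragma table_info({table})")]


def assert_no_collision(conn, table, reserved=RESULT_COLUMN):
    """Raise if the model already has a column named like the result column."""
    existing = {name.lower() for name in column_names(conn, table)}
    if reserved.lower() in existing:
        raise ValueError(
            f"{table} already has a column named {reserved!r}; "
            "selecting * plus the test result would be ambiguous"
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table clean_model (id integer, amount integer);
    create table clashing_model (id integer, _test_expression_passed integer);
    """
)

assert_no_collision(conn, "clean_model")  # passes silently

try:
    assert_no_collision(conn, "clashing_model")
    collision_message = None
except ValueError as exc:
    collision_message = str(exc)
```

The trade-off raised in the thread still applies: the check costs one extra metadata query per test, in exchange for a clearer error than "ambiguous column name".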
It's a fair concern! My gut feeling is that this is a reasonably obscure column name. Perhaps you could include the package name:
as _test_expression_passed
as _dbt_utils_test_expression_passed
I don't think it's necessary to protect against the ambiguous column name, just selecting * should be OK.
Will do, and I'll remove the get columns in relation check.
Hey @joellabes - you may have missed this. Nudging as it's ready for review :)
I like where this is going 😍 ! A couple of nitpicks and comments below.
A side note: Since this will change the shape of the results table, which could be a breaking change for anyone who uses --store-failures, this will likely come out in 0.9.0 as opposed to a patch release.
I'm OK with that though - we could exclude the new column with star, but I think it's somewhere between not harmful and helpful to have the true/false result of the evaluation included.
…umn handler from PR comments
@joellabes thanks for the review! I've updated with your comments in 24ddca9
On this, I'm certainly on team "it's helpful". If I'm debugging a test, I'm always going to grab the compiled test code and run it, and I'll want to see the result of some adjustments to pin down what's happening. In that case I'd like to see the result of the test to visualise how different things impact the result in something like a query editor. In my experience the fastest way to do this is with the test result as a column.
Beautiful! Thanks @jpmmcneill - keep an eye out for this to come in utils 0.9.0 in the next wee while. If you need it before then, you can always install the package using git and pointing to the next/minor branch in the meantime.
Brill, thank you @joellabes! 🤜 🤛
* Update README.md
* add some flexibility to expression_is_true execution plan and add a few new tests
* catch duplicate field name exception when the expression_is_true test is invoked
* expression is true - rename seeds, format sql and get rid of dupe column handler from PR comments
Co-authored-by: Joel Labes <joel.labes@dbtlabs.com>
Resolves #490
This is a:
main
dev/ branch
dev/ branch
Description & motivation
Adjust the way that the expression_is_true test gets evaluated (basically move it outside of a where clause).
Issue context: #490
This allows for use of window functions at a column level, and maybe a bunch of other stuff.
Tests have been added for window functionality.
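As a rough illustration of why moving the expression out of the where clause matters, the sketch below runs both query shapes against an in-memory SQLite database from Python. The `payments` table, the running-total expression and both SQL strings are assumptions made up for this example, not the compiled output of the macro. Only the `_dbt_utils_test_expression_passed` column name comes from the review thread. It assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table payments (id integer, amount integer);
    insert into payments values (1, 10), (2, 20), (3, 30);
    """
)

# Hypothetical expression under test: the running total must stay <= 30.
expression = "sum(amount) over (order by id) <= 30"

# Old shape (assumed): the expression sits directly in the where clause,
# which SQL does not allow for window functions.
old_sql = f"select * from payments where not ({expression})"
try:
    conn.execute(old_sql).fetchall()
    old_shape_error = None
except sqlite3.OperationalError as exc:
    old_shape_error = str(exc)

# New shape (assumed): evaluate the expression as a column first,
# then filter on that column in an outer query.
new_sql = f"""
    with evaluated as (
        select
            *,
            {expression} as _dbt_utils_test_expression_passed
        from payments
    )
    select * from evaluated
    where not _dbt_utils_test_expression_passed
"""
failures = conn.execute(new_sql).fetchall()

# Running totals are 10, 30, 60, so only the row with id 3 fails.
print(failures)
```

The failing rows carry the evaluated flag alongside the original columns, which is the debugging benefit described in the conversation above.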
Checklist
star() source)
limit_zero() macro in place of the literal string: limit 0
dbt_utils.type_* macros instead of explicit datatypes (e.g. dbt_utils.type_timestamp() instead of TIMESTAMP