Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow expression is true to use window functions (and maybe more) #490

Closed
jpmmcneill opened this issue Feb 7, 2022 · 4 comments
Closed
Labels
enhancement New feature or request good first issue

Comments

@jpmmcneill
Copy link
Contributor

jpmmcneill commented Feb 7, 2022

Describe the feature

Currently, you can't have something like:

schema.yml

version:2
models:
  - name: my_model
    columns:
      - name: id
      - name: test_col_1
      - name: test_col_2
        tests:
        - dbt_utils.expression_is_true:
          - expression: "> sum(test_col_1) over (partition by id)"

due to how expression_is_true is implemented (namely, it evaluates the expression in the context of a where clause.

https://github.com/dbt-labs/dbt-utils/blob/51ed999a44fcc7f9f502be11e5f190f5bc84ba4b/macros/schema_tests/expression_is_true.sql#L16:L20

While I hate suggesting solutions in issues, this shouldn't be too hard to address. For example, a simple enough solution to this is to define a new column based on expression that would then get filtered out.

I haven't run into any other DB errors for this kind of problem, but this could certainly effect other functions that can't be evaluated inside where clauses.

Describe alternatives you've considered

A workaround is obviously to define a new field in the model being tested as a window function and then use that field in the expression is true. However, depending on the number of columns being tested, this can be very painful.

Who will this benefit?

People who don't want to have to write new fields in a table to use the expression is true test.

Are you interested in contributing this feature?

Definitely, this would be very quick in principle.

@jpmmcneill jpmmcneill added enhancement New feature or request triage labels Feb 7, 2022
@jpmmcneill jpmmcneill changed the title Allow expression is true to use window functions (and maybe more accidentally) Allow expression is true to use window functions (and maybe more) Feb 7, 2022
@jpmmcneill
Copy link
Contributor Author

jpmmcneill commented Feb 9, 2022

this obviously ignores the discussion over at #487

@joellabes
Copy link
Contributor

@jpmmcneill I like your proposed solution! I'd welcome a PR to that effect 👍

Bonus points if it also comes with a new integration test to check that it works correctly and doesn't break existing expressions.

The one thing that I've got in the back of my mind is that it needs to work as both a column-level and a model-level test. I think that would still be possible, but in your case you I think you might have to define your window function as a model-level test (not column-level) along these lines:

schema.yml

version:2
models:
  - name: my_model
    tests:
      - dbt_utils.expression_is_true:
        - expression: "col_2 > sum(test_col_1) over (partition by id)"
    columns:
      - name: id
      - name: test_col_1
      - name: test_col_2

I'll leave the specifics to you to dig into, but let me know if you need to noodle on anything.

@jpmmcneill
Copy link
Contributor Author

Hey @joellabes. I got around to changing this today, pr above ☝️. There is one thing I'm a bit apprehensive about - i've detailed that in a coment.

I'm unsure about what you meant by "doesn't break existing expressions". Do you want me to add a test that references the old implementation and checks that results are identical?

@dbeatty10
Copy link
Contributor

Resolved by #507

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue
Projects
None yet
Development

No branches or pull requests

3 participants