Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly differentiate dbt unit tests from dbt data tests #382

Closed
dfsnow opened this issue Apr 16, 2024 · 5 comments · Fixed by #638
Closed

Explicitly differentiate dbt unit tests from dbt data tests #382

dfsnow opened this issue Apr 16, 2024 · 5 comments · Fixed by #638
Assignees
Labels
dbt Related to dbt (tests, docs, schema, etc)

Comments

@dfsnow
Copy link
Member

dfsnow commented Apr 16, 2024

dbt will soon include unit tests for complex SQL logic and queries. We should absolutely use these and should also attempt to differentiate tests on raw data (test_qc_ tests) from tests related to logic (tests for joins). We could do this via tagging.

We should also revisit our testing docs to clarify the ways our unit tests are written and run once we have a better idea of what they'll look like. See this comment for some ideas: #432 (comment)

@dfsnow dfsnow added the dbt Related to dbt (tests, docs, schema, etc) label Apr 16, 2024
@dfsnow dfsnow added this to the dbt operationalized/steady state milestone Apr 16, 2024
@jeancochrane
Copy link
Contributor

Note that we will also need to wait for dbt-athena to cut a release that supports dbt 1.8 before we can use unit tests. Luckily it seems like they plan to do this immediately, and the release is currently targeted for May 9.

@jeancochrane
Copy link
Contributor

Our old friend #238 bites us again: Unit tests use CTEs to template their fixtures into the compiled test query, meaning that until models with dots in their names are supported by dbt Core, we can't unit test any model that references another model ☹️

@dfsnow Do you think this is enough of a motivation to boost the priority of #238, or should we backburner unit tests for now?

@dfsnow
Copy link
Member Author

dfsnow commented May 21, 2024

Our old friend #238 bites us again: Unit tests use CTEs to template their fixtures into the compiled test query, meaning that until models with dots in their names are supported by dbt Core, we can't unit test any model that references another model ☹️

@dfsnow Do you think this is enough of a motivation to boost the priority of #238, or should we backburner unit tests for now?

Oof. The main problem here is that you can't have periods in CTE names right? That seems more like an Athena/dbt-athena problem to me. Perhaps there's a way to fix dbt-athena such that ephemeral models with dots are renamed.

Either way, let's check-in and see what else is on the list of upcoming stuff before we decide on priority.

@jeancochrane
Copy link
Contributor

jeancochrane commented Jun 7, 2024

Leaving a status update for the end of the week: I've got branches up for dbt-core and dbt-adapters that will fix the ephemeral model issue. (I also have the test-dbt-ephemeral-model-fix branch up to test these changes in this repo.) Next step is putting up the core/adapter PRs and then requesting review from core maintainers.

In the process of testing that fix, I discovered that there's another blocker preventing us from moving forward with unit tests. The dbt-athena adapter doesn't properly return array column definitions when populating test fixtures, causing the following error:

TYPE_MISMATCH: line 502:9: Unknown type: array(string).

The root cause here is that get_columns_in_relation uses AthenaColumn to parse column definitions, which is necessary due to Athena's difference between DDL and DML syntaxes (more details here) but AthenaColumn isn't yet configured to parse array types. See the ddl_data_type macro for an example of another function that handles this difference well.

Note that I'm also not sure why array types were being parsed in the particular case of the default_vw_pin_universe_class_no_special_chars test I was running; it's possible there's another bug causing incorrect array types to be set in certain cases, which may also need to be fixed. More investigation to come.

@jeancochrane
Copy link
Contributor

My bug fixes have officially been released as of v1.8.6, so this should be unblocked!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dbt Related to dbt (tests, docs, schema, etc)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants