Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-3224] [Regression] DBT 1.5.8 introduces breaking changes to incrementally materialized models #8857

Closed
2 tasks done
epgui opened this issue Oct 13, 2023 · 9 comments
Closed
2 tasks done

Comments

@epgui
Copy link

epgui commented Oct 13, 2023

Is this a regression in a recent version of dbt-core?

  • I believe this is a regression in dbt-core functionality
  • I have searched the existing issues, and I could not find an existing issue for this regression

Current Behavior

Prior to 1.5.8 (<= 1.5.7), incremental models would accept implicit type casts such as VARCHAR(n) -> TEXT (or the other way around). In other words, prior to 1.5.8, if the model's DDL / contract specification was TEXT, then source data of type VARCHAR(n) would get incrementally added to the table without issue and without any need for explicitly casting (ie.: my_column::TEXT).

With dbt-cloud not allowing users to pin down a minor version of DBT, this caused models to suddenly stop working for us.

Expected/Previous Behavior

Prior to 1.5.8 (<= 1.5.7), incremental models would accept implicit type casts such as VARCHAR(n) -> TEXT.

Steps To Reproduce

See problem description in "Current behaviour". This should be trivially reproducible.

Relevant log output

20:42:22                     Source columns not in target: []
20:42:22                     Target columns not in source: []
20:42:22                     New column types: [{'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}, {'column_name': 'REDACTED', 'new_type': 'character varying(256)'}]

Environment

- OS: any (dbt-cloud)
- Python: any (dbt-cloud)
- dbt (working version): 1.5.7
- dbt (regression version): 1.5.8

Which database adapter are you using with dbt?

postgres

Additional Context

No response

@epgui epgui added bug Something isn't working regression triage labels Oct 13, 2023
@github-actions github-actions bot changed the title [Regression] DBT 1.5.8 introduces breaking changes to incrementally materialized models [CT-3224] [Regression] DBT 1.5.8 introduces breaking changes to incrementally materialized models Oct 13, 2023
@graciegoheen graciegoheen added the Highest Severity critical bug that must be resolved immediately label Oct 13, 2023
@jtcohen6
Copy link
Contributor

@epgui Thanks for opening! I am guessing that the root cause here must have been the fix to resolve this bug:

You could try "reverting" that fix, by adding an override of the create_table_as macro without this line changed, to see if it resolves the issue for you.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Oct 13, 2023

After looking a bit more closely at that change, I believe it has a larger scope than the effect desired.

When we had previously talked about the scope of the fix for that bug (comment here), I'd said something like:

  • Ideal: don't template any foreign-key constraints into create temporary table DDL statement, which aren't supported when creating temp tables
  • Next-best: don't template any constraints into create temporary table DDL statement (the constraints will be applied on insert), but do template the column names + data types (so that type coercion still happens there)
  • Acceptable [per , though it doesn't resolve this issue]: don't include column schema (names, data types, constraints) in create temporary table DDL statement — meaning the varchar <> text coercion won't happen

Whereas, I believe the change implemented in #8768 also disables the "pre-flight" contract check on an incremental builds of incremental models, including the "pre-flight" check in get_assert_columns_equivalent. If you have on_schema_change configured to evolve the schema accordingly, you'll end up with an output that hasn't been checked against the yaml contract. That's not right.

This change affects users of dbt-postgres, but I don't believe it affects any users of Redshift/Snowflake/BigQuery/Databricks.

@emmyoop
Copy link
Member

emmyoop commented Oct 23, 2023

@epgui I'm having trouble reproducing the exact behavior you're experiencing. Would you be able to provide a simple reproduction case similar to what Jeremy provided in this issue? That will help me track down exactly what's going on in your project. Thank you!

@emmyoop
Copy link
Member

emmyoop commented Oct 25, 2023

I opened another issue (#8896) for the related issue described by @jtcohen6.

@emmyoop emmyoop removed their assignment Oct 25, 2023
@epgui
Copy link
Author

epgui commented Nov 2, 2023

Twofold apology:

  • First, Github notifications don't really work for me, so I was not alerted to the fact that further input was requested from me.
  • Second, this is an issue I've come across at work, and while it's not ideal we have an ad-hoc mitigation/workaround in place. I don't know if I will have time to look into this in greater detail.

I regret that I can't provide much more information at this time, and I hope my raising the issue was at least somewhat helpful.

@dbeatty10
Copy link
Contributor

No worries @epgui.

Can you describe the ad-hoc mitigation/workaround? That might help us figure out the characteristics causing the issue you experienced.

@dbeatty10 dbeatty10 removed the Highest Severity critical bug that must be resolved immediately label Jan 12, 2024
@dbeatty10
Copy link
Contributor

Since we are unable to reproduce this, I'm going to close this as "can't repro".

If anyone else experiences this and can give us steps to reproduce (similar to here), please share them here or in a new issue 🙏

@dbeatty10 dbeatty10 closed this as not planned Won't fix, can't repro, duplicate, stale Jan 12, 2024
@epgui
Copy link
Author

epgui commented Jan 12, 2024

No worries @epgui.

Can you describe the ad-hoc mitigation/workaround? That might help us figure out the characteristics causing the issue you experienced.

It is just as described in the original report: we must explicitly cast every VARCHAR(n) to TEXT. Additionally, we had to drop the entire table and rebuild it with these explicit casts in order to get past the error.

@dbeatty10
Copy link
Contributor

@epgui we are looking for a detailed code example we could try running ourselves rather than a written description without code examples.

We don't have enough information right now to reproduce, so we can't re-open this issue until then.

Here is the type of issue report we are looking for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants