Multi-query statements: fix rows_affected + query comments #153
Failing tests are the same ones that are failing nightly on main right now: #154
# don't apply the query comment here
# it will be applied after ';' queries are split
This `execute` is identical to `dbt.adapters.sql.connections.execute`, except that we're going to wait to `_add_query_comment` until after the queries have been split
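The ordering described in this comment (split first, then comment each piece) can be sketched as follows. This is a minimal illustration, not the adapter's code: `split_on_semicolons` is a hypothetical, naive stand-in for the Snowflake connector's `split_statements`, and `add_query_comment` is a hypothetical stand-in for dbt's `_add_query_comment`.

```python
from typing import List


def split_on_semicolons(sql: str) -> List[str]:
    """Hypothetical naive splitter; keeps the trailing ';' on each statement.

    Assumption: no ';' appears inside string literals or comments.
    """
    statements = []
    for part in sql.split(";"):
        stripped = part.strip()
        if stripped:
            statements.append(stripped + ";")
    return statements


def add_query_comment(sql: str, comment: str) -> str:
    """Hypothetical stand-in for dbt's `_add_query_comment`."""
    return f"/* {comment} */\n{sql}"


def prepare_statements(sql: str, comment: str) -> List[str]:
    # Split first, then comment each statement, so every query carries the comment.
    return [add_query_comment(s, comment) for s in split_on_semicolons(sql)]


prepared = prepare_statements(
    "begin; insert into my_table select 1; commit;",
    '{"app": "dbt"}',
)
for statement in prepared:
    print(statement)
```

Commenting before the split would attach the comment only to the first statement, which is the behaviour this change is meant to fix.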
# Let their wish be granted!
# This also has the effect of ignoring "commit" in the RunResult for this model
# https://github.com/dbt-labs/dbt-snowflake/issues/147
if individual_query.lower() == "begin;":
Feels a bit jank, but actually:
- These are generally being added by our own internal macro, `snowflake_dml_explicit_transaction`, and we know exactly what it's adding
- We've already stripped whitespace above
- We use the Snowflake connector's official `split_statements`, which seems to return statements with the `;` included.
- I'm actually not sure why we're not also using their official `remove_comments` argument to `split_statements`, rather than our own janky logic here? Most recent work there was "Fix bug #2731 on stripping query comments for snowflake" (dbt-core#2974)
- Worst case, we don't catch these as "real" `begin` + `commit`, just send them along as normal queries, and store their results accordingly
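A rough sketch of the matching logic discussed here, under stated assumptions: `execute_split`, `run_query`, `on_begin`, and `on_commit` are hypothetical names for illustration. In the adapter, the begin/commit branches would hand off to dbt's `add_begin_query` / `add_commit_query` rather than to plain callbacks.

```python
from typing import Callable, Iterable, Optional


def execute_split(
    statements: Iterable[str],
    run_query: Callable[[str], dict],
    on_begin: Callable[[], None],
    on_commit: Callable[[], None],
) -> Optional[dict]:
    """Run statements one by one, keeping the last non-transaction result."""
    last_result = None
    for individual_query in statements:
        normalized = individual_query.strip().lower()
        if normalized == "begin;":
            on_begin()  # result deliberately not stored
        elif normalized == "commit;":
            on_commit()  # result deliberately not stored
        else:
            # Anything that is not an exact match (e.g. "begin ;") is sent as a
            # normal query and its result is stored: the "worst case" above.
            last_result = run_query(individual_query)
    return last_result


calls = []


def fake_run_query(sql: str) -> dict:
    # Fake executor for illustration only; it just echoes the statement back.
    calls.append(sql)
    return {"query": sql}


result = execute_split(
    ["begin;", "insert into my_table select 1;", "commit;"],
    fake_run_query,
    on_begin=lambda: calls.append("<begin>"),
    on_commit=lambda: calls.append("<commit>"),
)
print(result)
```

Because the exact-match check only fires on `begin;` and `commit;`, the stored result here belongs to the `insert`, not to the trailing `commit`.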
Force-pushed from 6043f3e to 5cc2cf7.
Force-pushed from 5cae71f to 4e73187.
I just did a test on an incremental model and:
- the CLI returned the number of rows inserted when it ran in incremental mode
- the query comments showed on the 3 queries:
  - `begin;`
  - `insert into ...;`
  - `commit;`
This looks good to me! I wish we didn't mutate our connection object with our queries so we could get a couple of nice clean unit tests in here, but that part isn't your fault and isn't worth changing in this PR.
Let's add a changelog entry and make sure we note that this change will alter the logs, for people trying to compare metadata against runs from prior versions.
Force-pushed from c58b01e to caebedd.
Hm. Will try closing and reopening to trigger Snyk, before giving up and finding a way to merge anyway
resolves #140
resolves #147
Description
Context: Snowflake's Python client can't handle multi-query statements, i.e. several `;`-separated queries sent as a single string. As such, `dbt-snowflake` needs to split these up and send them off individually.
These queries remain tightly "batched," because they use the same threaded connection. I believe this represents the surest way we have to run multiple statements within the same transaction. As such, and on Snowflake's recommendation, we wrap DML statements in explicit transactions (`begin;` ... `commit;`) to ensure they succeed.
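To make the "split and send individually on one connection" idea concrete, here is a small sketch that uses Python's built-in `sqlite3` purely as a stand-in for the Snowflake connector (an assumption for illustration; the two clients differ in detail). `run_individually` is a hypothetical helper, and the split on `;` is deliberately naive.

```python
import sqlite3


def run_individually(conn: sqlite3.Connection, sql: str):
    """Split on ';' (naively) and send each statement on the same connection."""
    rowcounts = []
    for part in sql.split(";"):
        statement = part.strip()
        if not statement:
            continue
        cursor = conn.execute(statement)
        rowcounts.append((statement, cursor.rowcount))
    return rowcounts


# isolation_level=None: sqlite3 issues no implicit BEGIN, so the explicit
# begin/commit in the batch control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table events (id integer)")

results = run_individually(
    conn,
    "begin; insert into events values (1), (2), (3); commit;",
)
for statement, rowcount in results:
    print(rowcount, statement)
```

All three statements run on one connection, so they share a transaction. Note that each statement reports its own row count, so which statement's result gets kept matters.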
The problems:
- The query comment (`/* {...} */`) only gets applied once, to the very first query
- The stored result (and with it `rows_affected`) comes from the final `commit`, rather than the main event (`insert`)
The solutions:
- Switch the explicit `begin;` + `commit;` to use dbt's built-in `add_begin_query` + `add_commit_query` (even though `dbt-snowflake` disables these by default). This has the effect of ignoring the results from these queries, so the last non-`commit` query has its result stored instead
- Apply the query comment to each `individual_query`, after they've been split on `;`. We needed to split first anyway to properly catch `begin;` and `commit;`, so why not fix this issue in the meantime, too!
Tests
I added a test case for incremental models + `rows_affected` (by extending the "basic" incremental test). I haven't added a test case to ensure that all queries contain the query comment, but I imagine it wouldn't be too tricky with `run_dbt_and_capture`.
Running locally
Note the second `SUCCESS 3` (fixed), instead of `SUCCESS 1` (current)!
In `logs/dbt.log`, query comments now appear on every query.
Checklist
- I have updated the `CHANGELOG.md` and added information about my change to the "dbt-snowflake next" section.