-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for "table dropped by concurrent query" on Redshift #825
Conversation
- add integration tests for Redshift
@@ -162,7 +162,8 @@ def drop_relation(cls, profile, project_cfg, relation, model_name=None): | |||
|
|||
sql = 'drop {} if exists {} cascade'.format(relation.type, relation) | |||
|
|||
connection, cursor = cls.add_query(profile, sql, model_name) | |||
connection, cursor = cls.add_query(profile, sql, model_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to make this change because dbt was previously adding a begin
statement before the drop
. Since the drop
now happens after the transaction, there was no corresponding commit
, and the transaction containing the drop
was rolled back.
I checked the other places where dbt uses this method, and I believe this change should be safe, but I'm very open to improving how dbt handles transactions here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's one failure mode here that i'm concerned about, can you take a look at that?
looks like snowflake is passing 👍
{{ adapter.rename_relation(intermediate_relation, target_relation) }} | ||
{%- endif %} | ||
|
||
-- `COMMIT` happens here | ||
{{ adapter.commit() }} | ||
{{ drop_relation_if_exists(backup_relation) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this logic fail if the backup relation wasn't deleted on the previous run? i.e.:
run 1:
- dbt renames target to backup, commit succeeds, but then it fails to drop backup here
run 2: - ERROR: relation already exists?
i'd imagine you could fix this by also dropping the backup relation at the start of the run, if it exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really good point. We do something similar for the __dbt_tmp relation -- i will update!
commands = /bin/bash -c '{envpython} $(which nosetests) -v -a type=redshift {posargs} --with-coverage --cover-branches --cover-html --cover-html-dir=htmlcov test/integration/*' | ||
deps = | ||
-r{toxinidir}/requirements.txt | ||
-r{toxinidir}/dev_requirements.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YESSSS
💯 |
A test was added to reliably reproduce this error. Before the materialization logic was updated, the result looked like: