Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2850] [Feature] Raise a warning when --state path == --target-path #8160

Closed
3 tasks done
dbeatty10 opened this issue Jul 19, 2023 · 3 comments · Fixed by #8638
Closed
3 tasks done

[CT-2850] [Feature] Raise a warning when --state path == --target-path #8160

dbeatty10 opened this issue Jul 19, 2023 · 3 comments · Fixed by #8638
Assignees
Labels
enhancement New feature or request retry
Milestone

Comments

@dbeatty10
Copy link
Contributor

dbeatty10 commented Jul 19, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Mentioned by @aranke here and here:

Something else to note here is that you cannot use the same target directory for both the source and target

You get a very weird error that source and target states are the same, and so it's a no-op.

I ran into this during local development, so think this warning needs to be surfaced loudly.

This happens because

dbt now (over)writes manifest.json during parsing.
[…]
during the subsequent run (step 4) in the reproduction case, it overwrites target/manifest.json, then reads it back in to detect changes, and finds no changes.

from @jtcohen6 in #7790 (comment).

Describe alternatives you've considered

Simple solution that will work:

  1. Write output to tmp directory during dbt run
  2. Copy tmp directory to the output directory at the end of the dbt run

Who will this benefit?

Anyone using state in any form (including deferral, clone) reusing the same artifact directory.

Are you interested in contributing this feature?

Yes

Anything else?

There are 3 possible solutions here, from easiest to hardest:

  1. Raise a warning when the input and output artifact directories are the same
  2. Write to tmp directory as mentioned above
  3. Investigate root cause and fix underlying issue
@dbeatty10 dbeatty10 added enhancement New feature or request triage labels Jul 19, 2023
@github-actions github-actions bot changed the title [Feature] Improve the error message for dbt clone when source = destination [CT-2850] [Feature] Improve the error message for dbt clone when source = destination Jul 19, 2023
@dbeatty10 dbeatty10 removed the triage label Jul 19, 2023
@jtcohen6 jtcohen6 added this to the v1.6.x milestone Jul 20, 2023
@aranke aranke self-assigned this Aug 7, 2023
@graciegoheen
Copy link
Contributor

From refinement:

  • raise a warning or error (let's do what the warehouse would do) when the source manifest is the same as the target manifest

@jtcohen6
Copy link
Contributor

Lots of places we could try to catch this. The most obvious to me (but it's Jinja) is inside the clone materialization, as soon as we know that target_relation == defer_relation:

-- as a general rule, data platforms that can clone tables can also do atomic 'create or replace'
{% call statement('main') %}
{{ create_or_replace_clone(target_relation, defer_relation) }}
{% endcall %}

@jtcohen6 jtcohen6 changed the title [CT-2850] [Feature] Improve the error message for dbt clone when source = destination [CT-2850] [Feature] Raise a warning when --state path == --target-path Sep 6, 2023
@jtcohen6
Copy link
Contributor

jtcohen6 commented Sep 6, 2023

I retitled the issue based on further conversation in:

I think we want a warning in the more general case where --state path == --target-path. The only legitimate use case for this is dbt retry, and even then it's fair to warn users that this is not idempotent — subsequent retries will be different, based on whatever was the last retryable invocation.

Separately, I do think we want create_or_replace_clone to debug log + no-op if target_relation == defer_relation, rather than trying to run either

create or replace table my_db.my_schema.my_table clone my_db.my_schema.my_table;
create or replace view my_db.my_schema.my_table as select * from my_db.my_schema.my_table;

Acceptance criteria:

  • Raise a warning whenever --state path == --target-path
  • Update logic of create_or_replace_clone as described

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request retry
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants