Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2643] [Regression] state:modified selection method not behaving correctly #7790

Closed
2 tasks done
serene-capybara opened this issue Jun 5, 2023 · 6 comments
Closed
2 tasks done
Labels
bug Something isn't working regression user docs [docs.getdbt.com] Needs better documentation wontfix Not a bug or out of scope for dbt-core

Comments

@serene-capybara
Copy link

Is this a regression in a recent version of dbt-core?

  • I believe this is a regression in dbt-core functionality
  • I have searched the existing issues, and I could not find an existing issue for this regression

Current Behavior

dbt commands selecting modified models fails to detect any nodes.

Expected/Previous Behavior

Modified nodes should be detected when selecting on modified state.

Steps To Reproduce

  1. dbt clean && dbt deps
  2. dbt [run|build|compile] --target-path target
  3. Make some non-whitespace changes to a model
  4. dbt ls --select state:modified --state target

Relevant log output

18:45:55  Running with dbt=1.5.1
18:45:55  Found 40 models, 134 tests, 0 snapshots, 0 analyses, 429 macros, 0 operations, 27 seed files, 34 sources, 0 exposures, 0 metrics, 0 groups
18:45:55  The selection criterion 'state:modified' does not match any nodes
18:45:55  No nodes selected!


### Environment

```markdown
- OS:Ubuntu 22.04.2 LTS
- Python: 3.10.6
- dbt (working version): 1.4.x
- dbt (regression version): 1.5.x

Which database adapter are you using with dbt?

postgres

Additional Context

No response

@serene-capybara serene-capybara added bug Something isn't working regression triage labels Jun 5, 2023
@github-actions github-actions bot changed the title [Regression] state:modified selection method not behaving correctly [CT-2643] [Regression] state:modified selection method not behaving correctly Jun 5, 2023
@jtcohen6 jtcohen6 added wontfix Not a bug or out of scope for dbt-core and removed triage labels Jun 6, 2023
@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 6, 2023

@serene-capybara I know what's happened here: dbt now (over)writes manifest.json during parsing.

if write and ctx.obj["flags"].write_json:
write_manifest(manifest, ctx.obj["runtime_config"].project_target_path)

This is more consistent across the board, in terms of when the manifest becomes available to dbt and to end users. But it also means that, during the subsequent run (step 4) in the reproduction case, it overwrites target/manifest.json, then reads it back in to detect changes, and finds no changes.

I'd recommend updating your workflow in one of the following ways (ranked in my personal order of preference):

  • Move the manifest from the target/ folder to a dedicated folder (e.g. state/) between step (2) and step (4): mkdir state && mv target/manifest.json state/manifest.json
  • Write the manifest to a different --target-path in step (2) or step (4)
  • Pass --no-write-json flag during step (4): dbt --no-write-json ls --select state:modified --state target

I'm going to close this in the meantime, as not a bug/regression, even if it means that this specific workflow will need to be modified while upgrading from v1.4 → v1.5.

@jtcohen6 jtcohen6 closed this as not planned Won't fix, can't repro, duplicate, stale Jun 6, 2023
@serene-capybara
Copy link
Author

@jtcohen6 Thanks for the reply. I thought this was a pretty run-of-the-mill workflow, (Steps 3 and 4, anyway).

Is there a more typical development / Slim CI pattern for selecting modified nodes that does not require this temp folder workaround? More to the point - is there a practical advantage to updating the model checksum during parse, as opposed to later in the pipeline?

Anyway, I appreciate the feedback; this is enough to fix my CI, and I suppose I can just make a script for development. However, I think this change should be captured in one of these doc pages/sections:

@jtcohen6 jtcohen6 added the user docs [docs.getdbt.com] Needs better documentation label Jun 6, 2023
@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 6, 2023

Fair point on updating docs!

@voroninman
Copy link

I must admit that I was caught by surprise with this change. Referring to it as 'not a bug/regression' might be a bit of a stretch, considering its impact on state selectors for existing setups.

@jtcohen6
Copy link
Contributor

jtcohen6 commented Sep 5, 2023

@voroninman Heard! We had some more conversation about this, in follow-ups to the dbt clone command:

I don't foresee us changing this behavior, but we could do a better job of avoiding unpleasant surprises. What do you think about a warning message whenever a user has set their --state path and --target-path to be the same? Especially if they're also using state-informed features (--defer, state:modified, etc). "This isn't recommended, not idempotent, and may not work the way you expect."

The one case where it feels legitimate to have state == target is dbt retry. There, the default thing I want to be retrying is whatever I ran last. That it will vary (be overwritten) from one invocation to the next; it's not idempotent, and that doesn't feel like a surprise.

@voroninman
Copy link

A warning would suffice. If I start seeing something like "WARNING: The state selector won't match anything when --state and --target-path point to the same path. Read more https://docs.getdbt.com/reference/previous-state", it would be ideal IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working regression user docs [docs.getdbt.com] Needs better documentation wontfix Not a bug or out of scope for dbt-core
Projects
None yet
Development

No branches or pull requests

3 participants