Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the way vars and --full-refresh interact with retry. #9285

Closed
wants to merge 5 commits into from

Conversation

peterallenwebb
Copy link
Contributor

@peterallenwebb peterallenwebb commented Dec 13, 2023

resolves #9112

Problem

Solution

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

@cla-bot cla-bot bot added the cla:yes label Dec 13, 2023
Copy link
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

Copy link

codecov bot commented Dec 13, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c2bc2f0) 86.70% compared to head (fc7acbe) 86.64%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9285      +/-   ##
==========================================
- Coverage   86.70%   86.64%   -0.06%     
==========================================
  Files         179      179              
  Lines       26648    26659      +11     
==========================================
- Hits        23104    23099       -5     
- Misses       3544     3560      +16     
Flag Coverage Δ
integration 83.52% <100.00%> (-0.12%) ⬇️
unit 65.08% <36.36%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@peterallenwebb peterallenwebb changed the title Fix the way vars interact with retry. Fix the way vars and --full-refresh interact with retry. Dec 14, 2023
@peterallenwebb peterallenwebb marked this pull request as ready for review December 14, 2023 03:29
@peterallenwebb peterallenwebb requested a review from a team as a code owner December 14, 2023 03:29
Copy link
Contributor

@ChenyuLInx ChenyuLInx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks good to me! Thanks for the great tests here. I have one question about the vars behavior which might be a product question.

I have an alternative solution which is instead of do the extra logic in flag, we do not do parsing when initializing the retry task, do all param consolidation before we load the manifest. Then we overwrite the global flags with the retry flags we construct, then do parsing. IMHO, this is slightly easier to reason about because all of the parameter resolution happens in one place instead of part of it in initial flag construction, the other part in retry task.
I have a very scrappy draft for it here if you want to take a look. But this requires more changes. I will leave it for you to decide.


if prev_vars:
curr_vars = getattr(self, "VARS", {})
object.__setattr__(self, "VARS", {**prev_vars, **curr_vars})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we allow user to set current_vars during retry? And if it also exists in previous vars, which one should win?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you choose to do the combine and later one win!
Note we do have this dict here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check with @graciegoheen. What do you think, Grace?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upside of combining is that if you only want to overwrite one var, you don't need to supply the all the vars.
The downside of combining is that if you want to stop supplying a var, you will have to set it to None or the default value of it if it is already defined since user supplied takes highest presidency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal preference is the second because if a person is already supplying vars again, making it more explicit is better.

core/dbt/cli/flags.py Show resolved Hide resolved
tests/functional/retry/test_retry.py Show resolved Hide resolved
Copy link
Member

@aranke aranke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! A few nits, but otherwise LGTM!

Comment on lines +246 to +255
previous_state = PreviousState(
state_path=Path(
ctx.params.get("state", "") or getattr(self, "TARGET_PATH") or "."
),
target_path=Path(
getattr(self, "TARGET_PATH")
or os.path.join(ctx.params["project_dir"], "target")
),
project_root=Path(ctx.params.get("project_dir") or "."),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this consistent with the logic in retry.py here?

self.previous_state = PreviousState(
state_path=Path(state_path),
target_path=Path(self.config.target_path),
project_root=Path(self.config.project_root),
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get rid of 'or "."'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that getting rid of the "." causes several previously-existing unit tests to fail. I'm not sure why, yet, but it has made me nervous about the regression risk around trying to push this out the door today. I just don't feel like we totally understand the behaviors of the system.

curr_vars = getattr(self, "VARS", {})
object.__setattr__(self, "VARS", {**prev_vars, **curr_vars})

if "full_refresh" in params_assigned_from_default and previous_state.results:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we put this in a list somewhere so we can add to it if a new condition comes up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave a comment here that this is a "temporary partial patch"

tests/functional/retry/test_retry.py Show resolved Hide resolved
}

def test_retry(self, project):
run_dbt(["run", "--vars", '{"myvar_a": "12", "myvar_b": "3 4"}'], expect_pass=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can myvar_b be a string (like apple) to make the failure case more explicit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain theory of test in comment.

tests/functional/retry/test_retry.py Show resolved Hide resolved
run_dbt(["run", "--vars", '{"myvar_a": "12", "myvar_b": "3 4"}'], expect_pass=False)
move("target", "state")
results = run_dbt(["retry", "--state", "state", "--vars", '{"myvar_b": "34"}'])
assert len(results) == 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert something more specific about results.

Comment on lines +246 to +255
previous_state = PreviousState(
state_path=Path(
ctx.params.get("state", "") or getattr(self, "TARGET_PATH") or "."
),
target_path=Path(
getattr(self, "TARGET_PATH")
or os.path.join(ctx.params["project_dir"], "target")
),
project_root=Path(ctx.params.get("project_dir") or "."),
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that getting rid of the "." causes several previously-existing unit tests to fail. I'm not sure why, yet, but it has made me nervous about the regression risk around trying to push this out the door today. I just don't feel like we totally understand the behaviors of the system.

@graciegoheen
Copy link
Contributor

@ChenyuLInx @peterallenwebb Confirming we can close this, since we're moving forward with the alternative solution?

@peterallenwebb peterallenwebb deleted the paw/fix-retry-vars branch May 16, 2024 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-3401] [Bug] dbt retry fails to apply --full-refresh to downstream models and vars
4 participants