-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the way vars and --full-refresh interact with retry. #9285
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #9285 +/- ##
==========================================
- Coverage 86.70% 86.64% -0.06%
==========================================
Files 179 179
Lines 26648 26659 +11
==========================================
- Hits 23104 23099 -5
- Misses 3544 3560 +16
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks good to me! Thanks for the great tests here. I have one question about the vars
behavior which might be a product question.
I have an alternative solution which is instead of do the extra logic in flag, we do not do parsing when initializing the retry task, do all param consolidation before we load the manifest. Then we overwrite the global flags with the retry flags we construct, then do parsing. IMHO, this is slightly easier to reason about because all of the parameter resolution happens in one place instead of part of it in initial flag construction, the other part in retry task.
I have a very scrappy draft for it here if you want to take a look. But this requires more changes. I will leave it for you to decide.
|
||
if prev_vars: | ||
curr_vars = getattr(self, "VARS", {}) | ||
object.__setattr__(self, "VARS", {**prev_vars, **curr_vars}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we allow user to set current_vars during retry? And if it also exists in previous vars, which one should win?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you choose to do the combine and later one win!
Note we do have this dict here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's check with @graciegoheen. What do you think, Grace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The upside of combining is that if you only want to overwrite one var, you don't need to supply the all the vars.
The downside of combining is that if you want to stop supplying a var, you will have to set it to None or the default value of it if it is already defined since user supplied takes highest presidency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My personal preference is the second because if a person is already supplying vars again, making it more explicit is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! A few nits, but otherwise LGTM!
previous_state = PreviousState( | ||
state_path=Path( | ||
ctx.params.get("state", "") or getattr(self, "TARGET_PATH") or "." | ||
), | ||
target_path=Path( | ||
getattr(self, "TARGET_PATH") | ||
or os.path.join(ctx.params["project_dir"], "target") | ||
), | ||
project_root=Path(ctx.params.get("project_dir") or "."), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this consistent with the logic in retry.py
here?
dbt-core/core/dbt/task/retry.py
Lines 67 to 71 in 3f7f7de
self.previous_state = PreviousState( | |
state_path=Path(state_path), | |
target_path=Path(self.config.target_path), | |
project_root=Path(self.config.project_root), | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get rid of 'or "."'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that getting rid of the "." causes several previously-existing unit tests to fail. I'm not sure why, yet, but it has made me nervous about the regression risk around trying to push this out the door today. I just don't feel like we totally understand the behaviors of the system.
curr_vars = getattr(self, "VARS", {}) | ||
object.__setattr__(self, "VARS", {**prev_vars, **curr_vars}) | ||
|
||
if "full_refresh" in params_assigned_from_default and previous_state.results: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we put this in a list somewhere so we can add to it if a new condition comes up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave a comment here that this is a "temporary partial patch"
} | ||
|
||
def test_retry(self, project): | ||
run_dbt(["run", "--vars", '{"myvar_a": "12", "myvar_b": "3 4"}'], expect_pass=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can myvar_b
be a string (like apple
) to make the failure case more explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain theory of test in comment.
run_dbt(["run", "--vars", '{"myvar_a": "12", "myvar_b": "3 4"}'], expect_pass=False) | ||
move("target", "state") | ||
results = run_dbt(["retry", "--state", "state", "--vars", '{"myvar_b": "34"}']) | ||
assert len(results) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert something more specific about results.
previous_state = PreviousState( | ||
state_path=Path( | ||
ctx.params.get("state", "") or getattr(self, "TARGET_PATH") or "." | ||
), | ||
target_path=Path( | ||
getattr(self, "TARGET_PATH") | ||
or os.path.join(ctx.params["project_dir"], "target") | ||
), | ||
project_root=Path(ctx.params.get("project_dir") or "."), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that getting rid of the "." causes several previously-existing unit tests to fail. I'm not sure why, yet, but it has made me nervous about the regression risk around trying to push this out the door today. I just don't feel like we totally understand the behaviors of the system.
@ChenyuLInx @peterallenwebb Confirming we can close this, since we're moving forward with the alternative solution? |
resolves #9112
Problem
Solution
Checklist