[Bug] Full refresh model config not respected when coming from a macro and partial parsing is used #9789

rlh1994 · 2024-03-21T16:40:21Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When I use a macro to decide whether a model should full refresh, the model still sometimes full refreshes without the --full-refresh flag, even though the macro returns none. This seems to be related to partial parsing: when partial parsing is disabled, or something forces a full parse, it does not occur.

Expected Behavior

The incremental model should respect the macro's return value and the --full-refresh flag.

Steps To Reproduce

In a new dbt project, add the following to a macro file called my_macro.sql:

{% macro my_macro() %}
  {{ return(adapter.dispatch('my_macro', 'dbt_demo')()) }}
{% endmacro %}

{% macro default__my_macro() %}

  {% if flags.FULL_REFRESH == True %}
    {% set allow_refresh = true %}
    {{ log('Yes, refresh me!', info=True) }}
  {% else %}
    {% set allow_refresh = none %}
    {{ log('No, leave me be!', info=True) }}
  {% endif %}
  {{ log(flags.FULL_REFRESH, info=True) }}
  {{ log(allow_refresh, info=True) }}
  {{ return(allow_refresh) }}

{% endmacro %}

Add a model called my_model.sql with the following content:

{{
  config(
    materialized = 'incremental',
    full_refresh=my_macro(),
    )
}}

select 1 as test

Run the following commands against a postgres target (although I have seen this issue against other targets as well):
```
dbt clean
dbt run --full-refresh
dbt run --full-refresh
```
Note that in both runs the output is SELECT 1, and if you query the table you will see only one record, as expected so far.
Then run the following commands, this time without the --full-refresh flag:
```
dbt run
dbt run
```
You can query the table after each run and see that the output is still SELECT 1 and that only a single record exists each time. You can also check the run code in the target directory and see that it uses a create instead of an insert.
Do a dbt run --no-partial-parse and it will correctly trigger an insert.

Weirdly, the flag is correctly shown as False in the log and the return value is shown as None, so it's not clear why a full refresh is taking place.
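
One possible reading, sketched here in Python rather than Jinja: the values logged at run time are not necessarily the ones the materialization acts on. The sketch assumes the decision is resolved roughly the way dbt's built-in `should_full_refresh` macro appears to do it, where a model-level `full_refresh` config wins over the CLI flag whenever it is not None. The function below is a hypothetical stand-in for that logic, not dbt's actual code.

```python
from typing import Optional


def should_full_refresh(config_full_refresh: Optional[bool],
                        flag_full_refresh: bool) -> bool:
    """Hypothetical mirror of the precedence rule (an assumption):
    the model config decides unless it is None, in which case the flag does."""
    if config_full_refresh is None:
        return flag_full_refresh
    return config_full_refresh


# Config stored as True: the model full refreshes even without the flag.
stale_config_result = should_full_refresh(True, flag_full_refresh=False)

# Config stored as None: the flag alone decides.
fresh_config_result = should_full_refresh(None, flag_full_refresh=False)
```

Under that assumption, a model whose stored config still holds True from an earlier `--full-refresh` run would full refresh even though the macro logs False and None when it is rendered again at run time.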

Changing something in the model or the macro (possibly also the project file, though I didn't test that) seems to make it correct itself, and from then on it respects the flag. Basically, anything that means partial parsing can no longer be used.

Relevant log output

~/Documents/junk/dbt_demo  dbt run --full-refresh
16:29:15  Running with dbt=1.7.2
16:29:16  Registered adapter: postgres=1.7.2
16:29:16  Unable to do partial parsing because saved manifest not found. Starting full parse.
16:29:16  Yes, refresh me!
16:29:16  True
16:29:16  True
16:29:16  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:29:16  
16:29:16  Concurrency: 1 threads (target='postgres')
16:29:16  
16:29:16  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:29:16  Yes, refresh me!
16:29:16  True
16:29:16  True
16:29:17  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.10s]
16:29:17  
16:29:17  Finished running 1 incremental model in 0 hours 0 minutes and 0.25 seconds (0.25s).
16:29:17  
16:29:17  Completed successfully
16:29:17  
16:29:17  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run --full-refresh
16:29:21  Running with dbt=1.7.2
16:29:21  Registered adapter: postgres=1.7.2
16:29:21  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:29:21  
16:29:21  Concurrency: 1 threads (target='postgres')
16:29:21  
16:29:21  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:29:21  Yes, refresh me!
16:29:21  True
16:29:21  True
16:29:21  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.11s]
16:29:21  
16:29:21  Finished running 1 incremental model in 0 hours 0 minutes and 0.24 seconds (0.24s).
16:29:21  
16:29:21  Completed successfully
16:29:21  
16:29:21  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run
16:30:30  Running with dbt=1.7.2
16:30:30  Registered adapter: postgres=1.7.2
16:30:31  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:30:31  
16:30:31  Concurrency: 1 threads (target='postgres')
16:30:31  
16:30:31  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:30:31  No, leave me be!
16:30:31  False
16:30:31  None
16:30:31  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.11s]
16:30:31  
16:30:31  Finished running 1 incremental model in 0 hours 0 minutes and 0.25 seconds (0.25s).
16:30:31  
16:30:31  Completed successfully
16:30:31  
16:30:31  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run
16:31:45  Running with dbt=1.7.2
16:31:46  Registered adapter: postgres=1.7.2
16:31:46  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:31:46  
16:31:46  Concurrency: 1 threads (target='postgres')
16:31:46  
16:31:46  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:31:46  No, leave me be!
16:31:46  False
16:31:46  None
16:31:46  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.12s]
16:31:46  
16:31:46  Finished running 1 incremental model in 0 hours 0 minutes and 0.26 seconds (0.26s).
16:31:46  
16:31:46  Completed successfully
16:31:46  
16:31:46  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run
16:34:02  Running with dbt=1.7.2
16:34:03  Registered adapter: postgres=1.7.2
16:34:07  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:34:07  
16:34:07  Concurrency: 1 threads (target='postgres')
16:34:07  
16:34:07  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:34:07  No, leave me be!
16:34:07  False
16:34:07  None
16:34:07  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.16s]
16:34:07  
16:34:07  Finished running 1 incremental model in 0 hours 0 minutes and 0.41 seconds (0.41s).
16:34:07  
16:34:07  Completed successfully
16:34:07  
16:34:07  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Environment

- OS: macOS
- Python: 3.9.13
- dbt: 1.7.2, also tried with 1.7.10

Which database adapter are you using with dbt?

postgres

Additional Context

No response


dbeatty10 · 2024-03-21T16:48:09Z

Thanks for reaching out @rlh1994 !

I haven't read through your example carefully yet. But in the meantime, did you try including the --no-partial-parse flag?

rlh1994 · 2024-03-21T16:57:09Z

@dbeatty10 I'd just been exploring that before you commented, but yes that does force the correct full refresh behaviour, so it is related to the partial parsing!

dbeatty10 · 2024-03-22T21:06:17Z

@rlh1994 Glad that adding --no-partial-parse forced the behavior you are aiming for! 🎉

Here are a couple of other debugging steps I tried on this one:

- Examining the manifest for the config value of full_refresh that was used in the most recent run
- Logging in the macro depending on `execute`

Examining the manifest

I like using jq to quickly examine the values in the manifest.

In this case, you can examine the actual value of full_refresh that was set and utilized like this:

cat target/manifest.json | jq '.nodes."model.my_project.my_model".config.full_refresh'
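
If jq is not available, the same lookup can be sketched in Python. This is a minimal sketch that assumes the manifest layout implied by the jq path above (`nodes` → node id → `config` → `full_refresh`). The helper names `stored_full_refresh` and `load_manifest` are made up for illustration, and the node id `model.my_project.my_model` is copied from the jq command, so adjust the project name to match your own project.

```python
import json
from pathlib import Path
from typing import Any, Optional


def stored_full_refresh(manifest: dict, unique_id: str) -> Optional[Any]:
    """Return the full_refresh value recorded for one node of a loaded manifest."""
    node = manifest.get("nodes", {}).get(unique_id)
    if node is None:
        raise KeyError(f"{unique_id} not found in manifest nodes")
    return node.get("config", {}).get("full_refresh")


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Load the manifest written by the most recent dbt invocation."""
    return json.loads(Path(path).read_text())


if __name__ == "__main__":
    # Only attempt the lookup when a manifest actually exists on disk.
    if Path("target/manifest.json").exists():
        print(stored_full_refresh(load_manifest(), "model.my_project.my_model"))
```

Running this after each `dbt run` shows whether the stored config value changes between invocations, which is the thing to compare against what the macro logs.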

Logging in the macro depending on whether `execute` is True or not

Since the config for the model is set prior to execution time, I'd use a macro like the following instead:

{% macro my_macro() %}
  {{ return(adapter.dispatch('my_macro', 'dbt_demo')()) }}
{% endmacro %}

{% macro default__my_macro() %}

  {{ log('', True)}}

  {% if execute == True %}
    {{ log('No, this is NOT included in model config() because execute=' ~ execute, info=True) }}
  {% else %}
    {{ log('Yes, this IS included in model config() because execute=' ~ execute, info=True) }}

    {% if flags.FULL_REFRESH == True %}
      {% set allow_refresh = true %}
      {{ log('Yes, refresh me!', info=True) }}
    {% else %}
      {% set allow_refresh = none %}
      {{ log('No, leave me be!', info=True) }}
    {% endif %}
    {{ log(flags.FULL_REFRESH, info=True) }}
    {{ log(allow_refresh, info=True) }}
  {% endif %}

  {{ log('', True)}}

  {{ return(allow_refresh) }}

{% endmacro %}

When you do a dbt run, you'll be able to see if "Yes, this IS included in model config()" shows up in the log output or not. If not, it means this macro didn't come into play early enough to change the value of the model's config.
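
To make the parse-time versus run-time distinction concrete, here is a toy model in Python. It is only a sketch under a stated assumption: that a config value computed during parsing is saved and then reused on later invocations while the file is unchanged. `ToyParser` and its cache are hypothetical and do not reflect how dbt implements partial parsing internally. The `my_macro` function mimics the macro from the issue.

```python
from typing import Callable, Dict, Optional, Tuple


def my_macro(full_refresh_flag: bool) -> Optional[bool]:
    """Python stand-in for the Jinja macro: True when the flag is set, else None."""
    return True if full_refresh_flag else None


class ToyParser:
    """Hypothetical parser that reuses saved results for unchanged files."""

    def __init__(self) -> None:
        # file name -> (contents at parse time, config value computed at parse time)
        self._saved: Dict[str, Tuple[str, Optional[bool]]] = {}

    def parse(self, name: str, contents: str,
              config_macro: Callable[[bool], Optional[bool]],
              full_refresh_flag: bool,
              partial_parse: bool = True) -> Optional[bool]:
        cached = self._saved.get(name)
        if partial_parse and cached is not None and cached[0] == contents:
            # File unchanged: the macro is NOT re-evaluated for config purposes.
            return cached[1]
        value = config_macro(full_refresh_flag)
        self._saved[name] = (contents, value)
        return value


parser = ToyParser()
sql = "select 1 as test"

# 1. dbt run --full-refresh (no saved state yet, so a full parse happens)
first_run = parser.parse("my_model.sql", sql, my_macro, full_refresh_flag=True)

# 2. dbt run (file unchanged, saved value reused even though the flag is now False)
second_run = parser.parse("my_model.sql", sql, my_macro, full_refresh_flag=False)

# 3. dbt run --no-partial-parse (macro is evaluated again with the current flag)
third_run = parser.parse("my_model.sql", sql, my_macro,
                         full_refresh_flag=False, partial_parse=False)
```

In this toy model the second run still sees True, and only the forced full parse brings the value back to None. That matches the symptoms reported in this issue, but whether it is the actual cause is an assumption.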

Summary

Since we have a workaround via --no-partial-parse, I'm labeling this as "medium severity".

github-actions · 2024-09-19T01:58:30Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

rlh1994 · 2024-09-19T15:09:06Z

rlh1994 added bug Something isn't working triage labels Mar 21, 2024

dbeatty10 added awaiting_response and removed triage labels Mar 21, 2024

rlh1994 changed the title ~~[Bug] Incremental model full refreshed when macro is used without full refresh flag (maybe macro result cached?)~~ [Bug] Full refresh model config not respected when coming from a macro and partial parsing is used Mar 21, 2024

github-actions bot added triage and removed awaiting_response labels Mar 21, 2024

dbeatty10 added partial_parsing Medium Severity bug with minor impact that does not have resolution timeframe requirement labels Mar 21, 2024

dbeatty10 removed the triage label Mar 22, 2024

github-actions bot added the stale Issues that have gone stale label Sep 19, 2024

github-actions bot removed the stale Issues that have gone stale label Sep 20, 2024
