[Bug] Full refresh model config not respected when coming from a macro and partial parsing is used #9789

Open
2 tasks done
rlh1994 opened this issue Mar 21, 2024 · 5 comments
Labels: bug (Something isn't working), Medium Severity (bug with minor impact that does not have resolution timeframe requirement), partial_parsing

Comments

@rlh1994
Contributor

rlh1994 commented Mar 21, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When I use a macro to decide whether a model should full refresh, the model still sometimes full refreshes without the --full-refresh flag, even though the macro returns none. This seems to be related to partial parsing: when partial parsing is disabled, or something forces a full parse, it does not occur.

Expected Behavior

The incremental model should respect the macro's return value and the full refresh flag.

Steps To Reproduce

  1. Create a new dbt project and add the following to a macro file called my_macro.sql:
    {% macro my_macro() %}
      {{ return(adapter.dispatch('my_macro', 'dbt_demo')()) }}
    {% endmacro %}
    
    {% macro default__my_macro() %}
    
      {% if flags.FULL_REFRESH == True %}
        {% set allow_refresh = true %}
        {{ log('Yes, refresh me!', info=True) }}
      {% else %}
        {% set allow_refresh = none %}
        {{ log('No, leave me be!', info=True) }}
      {% endif %}
      {{ log(flags.FULL_REFRESH, info=True) }}
      {{ log(allow_refresh, info=True) }}
      {{ return(allow_refresh) }}
    
    {% endmacro %}
  2. Add a model called my_model.sql with the following content:
    {{
      config(
        materialized = 'incremental',
        full_refresh=my_macro(),
        )
    }}
    
    select 1 as test
  3. Run the following commands against a postgres target (although I have seen this issue against other targets as well):
    dbt clean
    dbt run --full-refresh
    dbt run --full-refresh
    Note that in both runs the output is SELECT 1, and if you query the table you will see only 1 record, which is expected so far.
  4. Run the following commands against the same postgres target (again, I have seen this issue against other targets as well):
    dbt run
    dbt run
    Query the table after each run: the output is still SELECT 1 and only a single record exists each time. You can also check the run code in the target directory and see that it uses a create instead of an insert.
  5. Run dbt run --no-partial-parse and it will correctly trigger an insert.

Weirdly, the log correctly shows the flag as False and the return value as None, so it's not clear why a full refresh is taking place.

Changing something in the model or the macro (possibly also the project file, I didn't test that) seems to make it correct itself, and from then on it respects the flag. Basically, anything that prevents partial parsing from being used fixes it.

Relevant log output

~/Documents/junk/dbt_demo  dbt run --full-refresh
16:29:15  Running with dbt=1.7.2
16:29:16  Registered adapter: postgres=1.7.2
16:29:16  Unable to do partial parsing because saved manifest not found. Starting full parse.
16:29:16  Yes, refresh me!
16:29:16  True
16:29:16  True
16:29:16  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:29:16  
16:29:16  Concurrency: 1 threads (target='postgres')
16:29:16  
16:29:16  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:29:16  Yes, refresh me!
16:29:16  True
16:29:16  True
16:29:17  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.10s]
16:29:17  
16:29:17  Finished running 1 incremental model in 0 hours 0 minutes and 0.25 seconds (0.25s).
16:29:17  
16:29:17  Completed successfully
16:29:17  
16:29:17  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run --full-refresh
16:29:21  Running with dbt=1.7.2
16:29:21  Registered adapter: postgres=1.7.2
16:29:21  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:29:21  
16:29:21  Concurrency: 1 threads (target='postgres')
16:29:21  
16:29:21  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:29:21  Yes, refresh me!
16:29:21  True
16:29:21  True
16:29:21  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.11s]
16:29:21  
16:29:21  Finished running 1 incremental model in 0 hours 0 minutes and 0.24 seconds (0.24s).
16:29:21  
16:29:21  Completed successfully
16:29:21  
16:29:21  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run
16:30:30  Running with dbt=1.7.2
16:30:30  Registered adapter: postgres=1.7.2
16:30:31  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:30:31  
16:30:31  Concurrency: 1 threads (target='postgres')
16:30:31  
16:30:31  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:30:31  No, leave me be!
16:30:31  False
16:30:31  None
16:30:31  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.11s]
16:30:31  
16:30:31  Finished running 1 incremental model in 0 hours 0 minutes and 0.25 seconds (0.25s).
16:30:31  
16:30:31  Completed successfully
16:30:31  
16:30:31  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run
16:31:45  Running with dbt=1.7.2
16:31:46  Registered adapter: postgres=1.7.2
16:31:46  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:31:46  
16:31:46  Concurrency: 1 threads (target='postgres')
16:31:46  
16:31:46  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:31:46  No, leave me be!
16:31:46  False
16:31:46  None
16:31:46  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.12s]
16:31:46  
16:31:46  Finished running 1 incremental model in 0 hours 0 minutes and 0.26 seconds (0.26s).
16:31:46  
16:31:46  Completed successfully
16:31:46  
16:31:46  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1
~/Documents/junk/dbt_demo  dbt run
16:34:02  Running with dbt=1.7.2
16:34:03  Registered adapter: postgres=1.7.2
16:34:07  Found 1 model, 0 sources, 0 exposures, 0 metrics, 403 macros, 0 groups, 0 semantic models
16:34:07  
16:34:07  Concurrency: 1 threads (target='postgres')
16:34:07  
16:34:07  1 of 1 START sql incremental model dbt_ryan.my_model ........................... [RUN]
16:34:07  No, leave me be!
16:34:07  False
16:34:07  None
16:34:07  1 of 1 OK created sql incremental model dbt_ryan.my_model ...................... [SELECT 1 in 0.16s]
16:34:07  
16:34:07  Finished running 1 incremental model in 0 hours 0 minutes and 0.41 seconds (0.41s).
16:34:07  
16:34:07  Completed successfully
16:34:07  
16:34:07  Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

Environment

- OS: macOS
- Python: 3.9.13
- dbt: 1.7.2, also tried with 1.7.10

Which database adapter are you using with dbt?

postgres

Additional Context

No response

@rlh1994 rlh1994 added bug Something isn't working triage labels Mar 21, 2024
@dbeatty10
Contributor

Thanks for reaching out @rlh1994 !

I haven't read through your example carefully yet. But in the meantime, did you try including the --no-partial-parse flag?

@rlh1994 rlh1994 changed the title [Bug] Incremental model full refreshed when macro is used without full refresh flag (maybe macro result cached?) [Bug] Full refresh model config not respected when coming from a macro and partial parsing is used Mar 21, 2024
@rlh1994
Contributor Author

rlh1994 commented Mar 21, 2024

@dbeatty10 I'd just been exploring that before you commented, but yes that does force the correct full refresh behaviour, so it is related to the partial parsing!

@dbeatty10 dbeatty10 added partial_parsing Medium Severity bug with minor impact that does not have resolution timeframe requirement labels Mar 21, 2024
@dbeatty10
Contributor

@rlh1994 Glad that adding --no-partial-parse forced the behavior you are aiming for! 🎉

Here are a couple of other debugging steps I did on this one:

  1. Examining the manifest for the config value of full_refresh that was used in the most recent run
  2. Logging in the macro depending on execute

Examining the manifest

I like using jq to quickly examine the values in the manifest.

In this case, you can examine the actual value of full_refresh that was set and utilized like this:

cat target/manifest.json | jq '.nodes."model.my_project.my_model".config.full_refresh'
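
If jq is not available, roughly the same check can be done with a few lines of Python. This is a minimal sketch, assuming the manifest lives at target/manifest.json and the node id is the same model.my_project.my_model used above; the helper name full_refresh_config is hypothetical and only for illustration.

```python
import json
from pathlib import Path


def full_refresh_config(manifest: dict, unique_id: str):
    """Return the full_refresh value stored in the manifest for one node.

    Returns None both when the config is unset and when the node is missing.
    """
    node = manifest.get("nodes", {}).get(unique_id, {})
    return node.get("config", {}).get("full_refresh")


if __name__ == "__main__":
    # Assumed path and node id; adjust both to your project.
    manifest_path = Path("target/manifest.json")
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())
        print(full_refresh_config(manifest, "model.my_project.my_model"))
    else:
        print(f"{manifest_path} not found; run dbt first")
```

Running this after each dbt run should show whether the stored config value changes between invocations.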

Logging in the macro depending on whether execute is True or not

Since the config for the model is set prior to execution time, I'd use a macro like the following instead:

{% macro my_macro() %}
  {{ return(adapter.dispatch('my_macro', 'dbt_demo')()) }}
{% endmacro %}

{% macro default__my_macro() %}

  {{ log('', True)}}

  {% if execute == True %}
    {{ log('No, this is NOT included in model config() because execute=' ~ execute, info=True) }}
  {% else %}
    {{ log('Yes, this IS included in model config() because execute=' ~ execute, info=True) }}

    {% if flags.FULL_REFRESH == True %}
      {% set allow_refresh = true %}
      {{ log('Yes, refresh me!', info=True) }}
    {% else %}
      {% set allow_refresh = none %}
      {{ log('No, leave me be!', info=True) }}
    {% endif %}
    {{ log(flags.FULL_REFRESH, info=True) }}
    {{ log(allow_refresh, info=True) }}
  {% endif %}

  {{ log('', True)}}

  {{ return(allow_refresh) }}

{% endmacro %}

When you do a dbt run, you'll be able to see if "Yes, this IS included in model config()" shows up in the log output or not. If not, it means this macro didn't come into play early enough to change the value of the model's config.
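
To illustrate why a config value that was set early and then reused could matter, here is a toy Python model of the precedence between the model config and the CLI flag. It is a sketch under the assumption that the --full-refresh flag is only consulted when the model's full_refresh config is none, and that a reused parse result keeps the previously rendered config. All function and variable names are hypothetical, and this is not dbt's actual code.

```python
from typing import Optional


def should_full_refresh(config_value: Optional[bool], cli_flag: bool) -> bool:
    """Assumed precedence: config wins when set; the CLI flag is the fallback."""
    if config_value is None:
        return cli_flag
    return config_value


def my_macro(cli_flag: bool) -> Optional[bool]:
    """Python stand-in for the my_macro logic from the issue."""
    return True if cli_flag else None


# Parse time of `dbt run --full-refresh`: the config is rendered and saved.
saved_config = my_macro(cli_flag=True)

# Later `dbt run` reusing the saved parse result: the config is not re-rendered.
stale = should_full_refresh(saved_config, cli_flag=False)

# Same `dbt run` after a full parse: the config is rendered again with the new flag.
fresh = should_full_refresh(my_macro(cli_flag=False), cli_flag=False)

print(stale, fresh)  # True False
```

In this toy model the stale value True keeps forcing a rebuild until something triggers a re-parse, which would be consistent with the behavior reported above.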

Summary

Since we have a workaround via --no-partial-parse, I'm labeling this as "medium severity".

@dbeatty10 dbeatty10 removed the triage label Mar 22, 2024
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Sep 19, 2024
@rlh1994
Contributor Author

rlh1994 commented Sep 19, 2024

image

@github-actions github-actions bot removed the stale Issues that have gone stale label Sep 20, 2024

2 participants