Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Post-hook doesn't resolve custom schema #4023

Closed
4 tasks
ilmari-aalto opened this issue Oct 8, 2021 · 3 comments
Closed
4 tasks

Post-hook doesn't resolve custom schema #4023

ilmari-aalto opened this issue Oct 8, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@ilmari-aalto
Copy link

Describe the bug

I'm running a model with a post-hook against a custom schema. The custom schema name is not correctly resolved in the post-hook sql when using {{ this }}. Instead the post-hook sql tries to run against the default schema and fails, because the table was created in a custom schema.

Steps To Reproduce

This example model my_post_hook_model.sql using the custom schema staging fails:

{% set post_hook_sql %}
delete
from {{ this }}
where id > 2
{% endset %}

{{ 
  config(
    schema='staging',
    post_hook=post_hook_sql,
  )
}}

select 1 as id
union all
select 2 as id
union all
select 999 as id

The error message is:

Database Error in model my_post_hook_model (models/yyy/zzz/my_post_hook_model.sql)
  relation "dbt_ilmari.my_post_hook_model" does not exist
  compiled SQL at target/run/xxx/models/yyy/zzz/my_post_hook_model.sql

Expected behavior

The post-hook should execute against dbt_ilmari_staging.my_post_hook_model. The schema name is resolved differently for the model (which is correctly created in dbt_ilmari_staging) and for the post-hook (which ignores the custom schema altogether and defaults to dbt_ilmari).

Screenshots and log output

If applicable, add screenshots or log output to help explain your problem.

System information

Which database are you using dbt with?

  • postgres
  • [x ] redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

installed version: 0.21.0
   latest version: 0.21.0

Up to date!

Plugins:
  - bigquery: 0.21.0
  - snowflake: 0.21.0
  - redshift: 0.21.0
  - postgres: 0.21.0

The operating system you're using:
macOS Big Sur Version 11.4

The output of python --version:
Python 2.7.16

The output of python3 --version:
Python 3.8.12

Additional context

Add any other context about the problem here.

@ilmari-aalto ilmari-aalto added bug Something isn't working triage labels Oct 8, 2021
@jtcohen6 jtcohen6 removed the triage label Oct 12, 2021
@jtcohen6
Copy link
Contributor

jtcohen6 commented Oct 12, 2021

@ilmari-aalto To make a long story short, dbt Jinja-renders your model twice: once when the model is being parsed (extract dependencies, resolve configs), and again when the model is being executed.

You want your post-hook to be re-rendered at execute time. At parse time, the context variable {{ this }} isn't quite ready yet, so dbt uses a placeholder with default values instead. (No surprise there: dbt is still figuring out the database.schema.alias of your model, by resolving other configs.)

In order to tell dbt to re-render your post-hook at execute time, you need to use an extra set of quotes + curlies:

{{ 
  config(
    materialized='table',
    schema='staging',
    post_hook="delete from {{ this }} where id > 2",
  )
}}

select 1 as id
union all
select 2 as id
union all
select 999 as id

This is functionally different from your current syntax, which is instead the functional equivalent of:

{{ 
  config(
    materialized='table',
    schema='staging',
    post_hook="delete from " ~ this ~ " where id > 2",
  )
}}

The difference is subtle; it has to do with when {{ this }} is fully rendered.

This is the one documented exception to the rule about not nesting curly braces:

So why are curlies inside of curlies allowed in this case? Here, we actually want the string literal "grant select on {{ this }} ..." to be saved as the configuration value for the post-hook in this model. This string will be re-rendered when the model runs, resulting in a sensible sql expressions like grant select on "schema"."table".... being executed against the database. These hooks are a special exception to the rule stated above.

You could also set that config in your project file, with quotes and curlies:

# dbt_project.yml
models:
  +post-hook: "delete from {{ this }} where id > 2"

Or by storing the logic in a macro, and calling the macro in either place:

{% macro delete_from_this() %}
    delete from {{ this }} where id > 2
{% endmacro %}
{{ 
  config(
    materialized='table',
    schema='staging',
    post_hook="{{ delete_from_this() }}",
  )
}}
# dbt_project.yml
models:
  +post-hook: "{{ delete_from_this() }}"

In all cases, though, notice the extra set of quotes and curly braces, telling dbt to re-render this hook at execution time, with the availability of the full execution-time context. Note this would not work:

{{ 
  config(
    materialized='table',
    schema='staging',
    post_hook=delete_from_this(),
  )
}}

We got a couple issues about this recently (#3985, #3986), and the original issue for this goes back to #2370 (comment). So I do think we could use some better docs for this non-obvious functionality, not to mention a future re-think about more intuitive behavior.

In my view, we should aim to remove the functional distinction between quoted/unquoted, extra-curlied/uncurlied. I think that would require us to:

For the meantime, I'm going to close this issue, as there's a viable workaround.

@ilmari-aalto
Copy link
Author

Thanks a ton @jtcohen6 for your detailed answer, I definitely learned something new today!
I'll follow the workaround you proposed!

@Callumoco
Copy link

This has been super helpful @jtcohen6! Thank you 🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants