Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-3505] [Bug] Unable to access global variables like schemas and database_schemas in dbt_project.yml on-run-end #9301

Closed
2 tasks done
gclarkjr5 opened this issue Dec 18, 2023 · 7 comments
Labels
bug Something isn't working wontfix Not a bug or out of scope for dbt-core

Comments

@gclarkjr5
Copy link

gclarkjr5 commented Dec 18, 2023

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am currently trying to grant usage for a user to all schemas. I have used the examples provided and have the following:

my macro grants.sql

{% macro grant_usage_to_schemas(schemas, user) %} -- noqa: LT02
  {% for schema in schemas %}
    grant usage on schema {{ schema }} to {{ user }};
  {% endfor %}
{% endmacro %}

my dbt_project.yml

models:
  warehouse:
    bv:
      +tags: 'core'
      +schema: 'bv'
      +grants:
        select: ['rds_readonly']
    source_core:
      +tags: 'core'
      +schema: 'sc'
      +grants:
        select: ['rds_readonly']
    unified_core:
      +tags: 'core'
      +schema: 'uc'
      +grants:
        select: [ 'rds_readonly' ]
    intermediate:
      +tags: 'core'
      +schema: 'int'
      +grants:
        select: [ 'rds_readonly' ]
  elementary:
    +tags: 'elementary'
    +schema: "elementary"
    +grants:
      select: [ 'rds_readonly' ]


on-run-end:
  - "{{ grant_usage_to_schemas(schemas, 'rds_readonly') }}"
  - "{{ grant_usage_to_schemas(['dbt_bv', 'dbt_sc', 'dbt_uc', 'dbt_int'], 'rds_readonly') }}"

while the 2nd on-run-end hook produces what i need, I would have expected the first one to do the same thing. But instead i get an empty output after running dbt compile

/target/compiled/warehouse/dbt_project.yml/hooks/warehouse-on-run-end-0.sql

 -- noqa: LT02
  

Expected Behavior

after running dbt compile, I would have expected to see the compiled SQL of my on-run-end hooks. Instead, the one where i explicitly use the schemas object shows nothing, like in the above code block. However, I would have expected the following.

/target/compiled/warehouse/dbt_project.yml/hooks/warehouse-on-run-end-0.sql

 -- noqa: LT02
  
    grant usage on schema dbt_bv to rds_readonly;
  
    grant usage on schema dbt_sc to rds_readonly;
  
    grant usage on schema dbt_uc to rds_readonly;
  
    grant usage on schema dbt_int to rds_readonly;

/target/compiled/warehouse/dbt_project.yml/hooks/warehouse-on-run-end-1.sql

 -- noqa: LT02
  
    grant usage on schema dbt_bv to rds_readonly;
  
    grant usage on schema dbt_sc to rds_readonly;
  
    grant usage on schema dbt_uc to rds_readonly;
  
    grant usage on schema dbt_int to rds_readonly;
  

Steps To Reproduce

My full dbt_project.yml

name: 'warehouse'
version: '0.0.1'
config-version: 2

profile: 'warehouse'

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

models:
  warehouse:
    bv:
      +tags: 'core'
      +schema: 'bv'
      +grants:
        select: ['rds_readonly']
    source_core:
      +tags: 'core'
      +schema: 'sc'
      +grants:
        select: ['rds_readonly']
    unified_core:
      +tags: 'core'
      +schema: 'uc'
      +grants:
        select: [ 'rds_readonly' ]
    intermediate:
      +tags: 'core'
      +schema: 'int'
      +grants:
        select: [ 'rds_readonly' ]
  elementary:
    +tags: 'elementary'
    +schema: "elementary"
    +grants:
      select: [ 'rds_readonly' ]


on-run-end:
  - "{{ grant_usage_to_schemas(schemas, 'rds_readonly') }}"
  - "{{ grant_usage_to_schemas(['dbt_bv', 'dbt_sc', 'dbt_uc', 'dbt_int'], 'rds_readonly') }}"

Create a test model for each schema. Then run dbt compile.

Relevant log output

No response

Environment

- OS: MacOS Sonoma 14.2
- Docker Container: Linux 6e14e43e7b7f 5.15.49-linuxkit-pr #1 SMP PREEMPT Thu May 25 07:27:39 UTC 2023 x86_64 GNU/Linux
- Docker Image: ghcr.io/dbt-labs/dbt-postgres:1.5.1
- Python: 3.11.2
- dbt: 1.5.1

Which database adapter are you using with dbt?

postgres

Additional Context

I can attach the logs if there is no quick solution/the issue is immediately apparent to someone.

@gclarkjr5 gclarkjr5 added bug Something isn't working triage labels Dec 18, 2023
@github-actions github-actions bot changed the title [Bug] Unable to access global variables like schemas and database_schemas in dbt_project.yml on-run-end [CT-3505] [Bug] Unable to access global variables like schemas and database_schemas in dbt_project.yml on-run-end Dec 18, 2023
@dbeatty10
Copy link
Contributor

Thanks for reaching out @gclarkjr5 !

It looks like you are using the schemas variable within the context of an on-run-end hook.

Can you share more about your use-case why you're hoping to see these listed in /target/compiled/warehouse/dbt_project.yml/hooks/warehouse-on-run-end-0.sql following a dbt compile?

If you run dbt run (instead of dbt compile), you should see those grant statements show up in both that target file you mentioned as well as logs/dbt.log.

Here's a minimal example:

Minimal example

dbt_project.yml

name: "warehouse"
version: "1.0.0"
config-version: 2
profile: "warehouse"

models:
  warehouse:
    +schema: 'some_schema'

on-run-end:
  - "{{ grant_usage_to_schemas(schemas, 'rds_readonly') }}"

models/my_model.sql

select 1 as id

Compile-only first:

dbt compile

Examine the file (target/compiled/warehouse/dbt_project.yml/hooks/my_project-on-run-end-0.sql):

 -- noqa: LT02
  

Now actually run the model:

dbt run -s my_model

Examine the file again (target/compiled/warehouse/dbt_project.yml/hooks/my_project-on-run-end-0.sql):

 -- noqa: LT02
  
    grant usage on schema dbt_dbeatty_some_schema to rds_readonly;
  

You should see that same content within logs/dbt.log as well.

@alison985
Copy link

@dbeatty10 If you dbt run you spend money running a database query as opposed to being able to check it will run correctly before spending money(e.g. BigQuery, Snowflake). Also, there's no reason for it not to happen in a dbt compile because it's a dbt variable, not a variable to pull from the database. Finally, the fewer differences between test environments/events and prod environments/events the better.

@gclarkjr5 I notice you don't expect +schema: "elementary" to be in the output even though it is a schema. Is that because it's under a different sub-folder, because it's in " not ', or something else?

I also notice that (['dbt_bv', 'dbt_sc', 'dbt_uc', 'dbt_int'], uses [] and (schemas, does not. While I'd be surprised if ([schemas], worked, it may be worth a try.

@dbeatty10
Copy link
Contributor

Thanks for your insights @alison985 💡

@dbeatty10
Copy link
Contributor

@gclarkjr5 I can see how it would be nice to be able to do a "dry run" using dbt compile and then see the logic that dbt will use.

But in this case, the reason that schemas and database_schemas aren't showing after dbt compile is because those variables only include databases and schemas that dbt has built models into during a run of dbt.

So in the case of dbt compile, those are context variables are empty because dbt hasn't built anything yet!

Accordingly, I'm going to close this as "not planned" since those context variables are behaving in the way they are documented.

@dbeatty10 dbeatty10 closed this as not planned Won't fix, can't repro, duplicate, stale Feb 1, 2024
@dbeatty10 dbeatty10 added wontfix Not a bug or out of scope for dbt-core and removed triage labels Feb 1, 2024
@gclarkjr5
Copy link
Author

gclarkjr5 commented Feb 7, 2024

So if anyone is interested in how I was able to fix this, here was what I was finally able to uncover.

In the setup we have, Airflow is what orchestrates DBT. The people from Astronomer have some python code that parses a DBT manifest file, and then re-creates the dependency chain in Airflow as a DAG so that your models still run in their correct order. You can find the Astronomer article & code here. Each model therefore is its own "run". And since each model is its own "run" it will run the on-run-end hook for EACH model. This actually isn't a problem since we are just providing grants to our schemas/tables and this can be ran any number of times. The problem however, was actually doing the "tests". The parsed DBT manifest creates a DAG in Airflow that splits model "runs" into one Airflow task group, and the "tests" of those models in a following Airflow task group. The test section runs almost an identical command to the run except for the command invocation, so dbt test .... instead of dbt run ..... It seems that the "testing" will ALSO call the on-run-end hook after the test is finished, and this is where the on-run-end hook was compiling to an EMPTY QUERY that would error. So the solution (maybe hack) I put in place is the following for the "grants macro".

{% macro grant_usage_to_schemas(schemas, user) %}
  {% if flags.WHICH == "run" %}
  
    {% for schema in schemas %}
      grant usage on schema {{ schema }} to {{ user }};
    {% endfor %}

  {% else %}
    select 1;
  {% endif %}
{% endmacro %}

DBT allows you to create conditional logic around the flags or commands passed to it. In this case, if its a "run", run the grant, else perform a select 1, since at least something needs to happen. Hope this helps someone also experiencing this.

@ataft
Copy link

ataft commented Sep 19, 2024

But in this case, the reason that schemas and database_schemas aren't showing after dbt compile is because those variables only include databases and schemas that dbt has built models into during a run of dbt.

@dbeatty10 I'm noticing that schemas is also empty for dbt run when there is an error. Is this issue related to your comment above, or a separate issue?

@dbeatty10
Copy link
Contributor

I'm noticing that schemas is also empty for dbt run when there is an error. Is this issue related to your comment above, or a separate issue?

@ataft I didn't check the code to confirm for certain, but my belief is that it's related to my comment above.

i.e., if a dbt run (or build) doesn't succeed, then schemas and database_schemas will be empty.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working wontfix Not a bug or out of scope for dbt-core
Projects
None yet
Development

No branches or pull requests

4 participants