[CT-2769] [Feature] Allow yaml/jinja to get rendered with additional context #8000

Closed
3 tasks done
kdazzle opened this issue Jun 29, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@kdazzle

kdazzle commented Jun 29, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Right now, context used to render yaml files can be passed through the cli or the profile. It would be nice if there was also a hook to allow developers to define additional context variables. I imagine there could be a file like macros/context.py that would define a function get_context(target) -> dict or something like that.

The hook seems like it would be in core/dbt/config/renderer.DbtProjectYamlRenderer (maybe in __init__, where the rest of the context building seems to happen?). Regardless of where it ended up, the yaml hook could look like:

# core/dbt/config/renderer.py 
class DbtProjectYamlRenderer:

    def __init__(...):
        # Existing code
        if cli_vars is None:
            cli_vars = {}
        if profile:
            self.ctx_obj = TargetContext(profile.to_target_dict(), cli_vars)
        else:
            self.ctx_obj = BaseContext(cli_vars)  # type:ignore
        context = self.ctx_obj.to_dict()
        
        # The hook!
        additional_context = self.get_additional_context(context)
        context.update(additional_context)
        
        super().__init__(context)

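    def get_additional_context(self, context):
        # Try to import get_context from the macros/context.py in the developer's project folder.
        # (The import path below is a placeholder; how the project folder gets onto the path is undecided.)
        try:
            from macros.context import get_context
        except ImportError:
            return {}
        return get_context(context)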
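
For illustration only, the loading step of the proposed hook could be sketched outside of dbt roughly like this. Everything here is an assumption built on the proposal above: the function names load_additional_context and build_context are hypothetical, the file location macros/context.py and the get_context(context) signature come from the proposal, and none of it is existing dbt behavior.

```python
import importlib.util
from pathlib import Path
from typing import Any, Dict


def load_additional_context(project_root: str, context: Dict[str, Any]) -> Dict[str, Any]:
    """Return extra context from <project_root>/macros/context.py, or {} if unavailable.

    Hypothetical helper: the file location and get_context() name follow the proposal.
    """
    hook_path = Path(project_root) / "macros" / "context.py"
    if not hook_path.is_file():
        return {}
    # Load the file as a standalone module without touching sys.path.
    spec = importlib.util.spec_from_file_location("project_context_hook", hook_path)
    if spec is None or spec.loader is None:
        return {}
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    get_context = getattr(module, "get_context", None)
    if not callable(get_context):
        return {}
    return dict(get_context(context))


def build_context(project_root: str, base_context: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the hook's output over a copy of the base context."""
    merged = dict(base_context)
    merged.update(load_additional_context(project_root, merged))
    return merged
```

Loading by file path rather than by module name avoids depending on the project folder being importable, which is one of the open questions in the pseudocode above.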

I haven't looked into how Jinja could tie in, but I'm hoping it would be as straightforward.

Thanks!

Describe alternatives you've considered

Setting a bunch of variables using Jinja, or setting variables in yaml. That's usually fine, but not as nice.

Who will this benefit?

Projects that might have non-standard setups, want a little more flexibility, and are more comfortable writing code

Are you interested in contributing this feature?

yes

Anything else?

No response

@kdazzle kdazzle added enhancement New feature or request triage labels Jun 29, 2023
@github-actions github-actions bot changed the title [Feature] Allow yaml/jinja to get rendered with additional context [CT-2769] [Feature] Allow yaml/jinja to get rendered with additional context Jun 29, 2023
@jtcohen6
Contributor

@kdazzle Thanks for the thoughtful write-up!

I'm not sure if you're asking for the same (or similar) capability as Maxime last year: methods written in Python, callable from the Jinja context. Here's what I wrote about it then:

@kdazzle
Author

kdazzle commented Jun 30, 2023

Thanks for the response @jtcohen6. Nice memory 😄 - it does look like I'm proposing the same thing as Maxime. I'll take a look into the plugin approach.

I'm a big fan of dbt, but have found that I'm wrestling with the tool, and that allowing more flexibility for developers could be helpful as the audience expands. Although my issues could be a learning curve thing, or maybe plugins are the solution.

Thanks again!

@jtcohen6
Contributor

@kdazzle If there are specific things you've been trying to accomplish, where you've found yourself wrestling / working against the grain, I'd be curious to hear more about them! dbt is an opinionated tool, in that there are common use cases in data/analytics engineering that we want to make easy. At the same time, there's a lot that's possible, because the framework has some flexibility built in, especially via user-space Jinja.

Sometimes, a use case is totally legitimate, and still harder than it should be. In those cases, we want to hear more & discuss what a better built-in solution could look like, or which building blocks we're missing that would make it easier.

@jtcohen6 jtcohen6 removed the triage label Jun 30, 2023
@kdazzle
Author

kdazzle commented Jul 4, 2023

Hi @jtcohen6 - here are a couple of examples. They're just little frustrations, nothing major, but I've been a Python developer for a while, so I'm used to being able to override/extend functionality (like in, say, Django Rest Framework). Generally, I try to stay away from doing too much in Jinja, since it's just a templating language.

I'm integrating dbt into an existing code/database with lots of users, instead of starting fresh. There's a lot of existing infrastructure and roles already defined.

Examples:

  • I wanted to create a function to get the role name for the current environment. (solved using dbt config vars, though the jinja dict getter for every table isn't beautiful)
    config:
      grants:
        select: [
          "{{ var('permissions')[env_var('DBT_TARGET_NAME', 'dev')]['marketing-user-read'] }}"
        ]

    vs. something like:
        select: ["{{ context['permissions']['marketing-user-read'] }}"]

    or:
        select: ["{{ get_role('marketing-user-read') }}"]
  • Another example - We're using databricks and I would like to define some default cluster specs so that people don't have to spend an hour trying to figure that part out. I picture calling a function to get that configuration, which I can use in the python/sql config().
default_cluster = get_default_cluster()
config(materialization='table', **default_cluster)

But that doesn't work for a couple of reasons (one being some static analysis that prevents using variables?)

However, I haven't actually tried this too much yet, and the solution might be as simple as defining a dictionary in the dbt config and referencing that in the notebook/config.

  • Another potential use? These context variables might even work as a hook to bring reusable functions into python notebooks
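
As a rough sketch of the first example, a get_role helper could be plain Python along these lines. The PERMISSIONS mapping and its role names are made up for illustration and only mirror the shape of var('permissions') used above; reading DBT_TARGET_NAME with a 'dev' default follows the env_var call in the same example. Exposing such a function to dbt's rendering context is exactly the part that does not exist today.

```python
import os
from typing import Dict, Optional

# Hypothetical mapping: target name -> logical role -> actual database role.
# The role names are invented for this example.
PERMISSIONS: Dict[str, Dict[str, str]] = {
    "dev": {"marketing-user-read": "dev_marketing_reader"},
    "prod": {"marketing-user-read": "marketing_reader"},
}


def get_role(
    name: str,
    target: Optional[str] = None,
    permissions: Dict[str, Dict[str, str]] = PERMISSIONS,
) -> str:
    """Resolve a logical role name for the given (or current) target."""
    # Fall back to the environment variable used in the YAML example above.
    target = target or os.environ.get("DBT_TARGET_NAME", "dev")
    try:
        return permissions[target][name]
    except KeyError as exc:
        raise KeyError(f"no role {name!r} defined for target {target!r}") from exc
```

With a helper like this in the context, the grants entry would shrink to the "{{ get_role('marketing-user-read') }}" form from the first bullet.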

Anyways, I appreciate all your work, and thanks for the help!

@jtcohen6
Contributor

jtcohen6 commented Jul 5, 2023

Heard! It sounds like there are two main limitations you're running into:

  • Inability to call custom macros from within YAML file configs. I agree that "{{ get_role('marketing-user-read') }}" is more ergonomic than the current approach (although cleverly done!), and you could call it from within a Jinja-SQL model's {{ config(grants = ...) }} block, but not from within YAML configs.
  • The way that dbt statically analyzes the configs of Python models means that you're limited to passing in literal values. This wouldn't be solved by extending the Jinja context, since we disallow using Jinja to template Python model code today. My motivation behind that guardrail is, Jinja-templated Python would lead to model code that's very hard to read & reason about. It's an opinionated guardrail, and potentially one that we should revisit in the future.
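
To make the second limitation concrete, here is a toy version of what "static analysis that only accepts literal values" can look like. This is not dbt's actual parser; extract_literal_config is a hypothetical function that uses Python's ast module to read config(...) keyword arguments without running the model code, which is why a value computed at runtime (such as **default_cluster) cannot be resolved.

```python
import ast
from typing import Any, Dict


def extract_literal_config(source: str) -> Dict[str, Any]:
    """Collect keyword arguments of config(...) calls, accepting literals only.

    Illustrative only: the source is parsed, never executed.
    """
    found: Dict[str, Any] = {}
    for node in ast.walk(ast.parse(source)):
        # Match bare config(...) calls; attribute calls such as obj.config(...) are ignored here.
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "config":
            for kw in node.keywords:
                if kw.arg is None:  # a **mapping unpacking has no keyword name
                    raise ValueError("config() does not accept **unpacking of a variable")
                try:
                    found[kw.arg] = ast.literal_eval(kw.value)
                except ValueError:
                    raise ValueError(f"config({kw.arg}=...) must be a literal value") from None
    return found
```

Calling it on "config(materialization='table')" returns the literal settings, while the snippet from the earlier comment, with **default_cluster, raises ValueError because the analyzer never runs get_default_cluster().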

These context variables might even work as a hook to bring reusable functions into python notebooks

Hear you on this one too. The lack of code reusability across Python models (akin to Jinja macros for SQL models) is definitely something we've heard from folks who are looking to scale out their adoption. Again, I hesitate to release Jinja macros for DRY-ly templating Python into the world, as I feel like there are better options here — but it isn't something we've been able to prioritize as an area of focus this year.

@ismailsimsek

@kdazzle created discussion here dbt-labs/dbt-adapters#259

IMO the adapter factory should allow a user-provided, customized adapter class.

3 participants