You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Support temporary UDFs by adding a wrapping {{ call }} around the UDF. This call will invoke a bigquery adapter.sql macro to set a config value with the UDF, which gets inserted before create or replace table in the query sent to BigQuery.
This would work for any ddl that someone may want to be executed before their SQL query in BigQuery.
ex.
{{%- call bigquery_ddl() -%}}
CREATE TEMPORARY FUNCTION yes_no_to_boolean(answer STRING)
RETURNS BOOLEAN AS (
CASE
WHEN LOWER(answer) = 'yes' THEN True
WHEN LOWER(answer) = 'no' THEN False
ELSE NULL
END
);
{{%- endcall -%}}
Currently (without the wrapping call), this will cause a BigQuery error because UDFs cannot be defined inside of the create or replace table that all SQL gets wrapped in for the BigQuery adapter. With this feature, this UDF (and any other ddl) would be pulled outside of create or replace table, allowing for the ensuing query to invoke the UDF as yes_no_to_boolean().
As it currently stands, wrapping queries in create or replace table restricts some of the full sql capabilities in BigQuery, and it would be useful to enable some of those features by having a mechanism to pull sql outside of the create or replace table.
Describe alternatives you've considered
This particular function could be replaced by a Jinja macro, but there may be functions that make more sense to implement as UDFs (ex. JavaScript UDFs in BQ). Unless dbt plans to support a superset of everything UDFs or other ddl statements that belong outside of create or replace table can do in BigQuery, it makes sense to have a path to using BQ ddl in dbt.
It is not necessary to call a macro to add part of a query to the config, but it is much cleaner to wrap with a {{ call }} than do something like:
{{ config(
materialized='table',
udf='''
CREATE TEMP FUNCTION greeting(a STRING)
RETURNS STRING
LANGUAGE js AS """
return "Hello, " + a + "!";
""";
'''
)-}
The implementation of the macro could allow for multiple "wraps" per query, appending to a list, but it could also enforce that there is only one call per query, or anything else. For simplicity, I favor enforcing one call because, if the contents of the call get placed before create or replace table, there isn't a huge advantage to interspersing them in your original .sql query.
Additional context
This is a BigQuery specific feature that depends on the current use of create or replace table to wrap the contents of a query.
This is a feature that I definitely plan on implementing, and would love to get suggestions / feedback on to make as useful as possible.
Who will this benefit?
This will benefit BigQuery users who have a need for a feature in BigQuery that is constrained by create or replace table, or who might prefer to use SQL / ddl to macros / Jinja.
The text was updated successfully, but these errors were encountered:
hey @kconvey - are you able to use BigQuery's new persistent UDFs here? This is how UDFs typically work on other databases. We have a related issue for supporting UDFs more natively in dbt too. In this world, you'd just create a permanent UDF, then reference it from your model code.
I do like your suggestion for how this could work, but I feel like temporary UDFs are the only real use case here, right? What are some examples of other types of SQL you'd want to inject in front of the create or replace query?
In this context, we're hoping to use temporary UDFs instead of creating globally persisted UDFs. These temporary functions are just to make the SQL a little easier to understand. We also don't want to deal with the life-cycle and increased visibility of a persistent UDF.
For the moment, we would just be injecting temporary UDFs, so could limit the scope to just these.
Describe the feature
Support temporary UDFs by adding a wrapping
{{ call }}
around the UDF. This call will invoke a bigquery adapter.sql macro to set a config value with the UDF, which gets inserted beforecreate or replace table
in the query sent to BigQuery.This would work for any ddl that someone may want to be executed before their SQL query in BigQuery.
ex.
Currently (without the wrapping call), this will cause a BigQuery error because UDFs cannot be defined inside of the
create or replace table
that all SQL gets wrapped in for the BigQuery adapter. With this feature, this UDF (and any other ddl) would be pulled outside ofcreate or replace table
, allowing for the ensuing query to invoke the UDF asyes_no_to_boolean()
.As it currently stands, wrapping queries in
create or replace table
restricts some of the full sql capabilities in BigQuery, and it would be useful to enable some of those features by having a mechanism to pull sql outside of thecreate or replace table
.Describe alternatives you've considered
This particular function could be replaced by a Jinja macro, but there may be functions that make more sense to implement as UDFs (ex. JavaScript UDFs in BQ). Unless dbt plans to support a superset of everything UDFs or other ddl statements that belong outside of
create or replace table
can do in BigQuery, it makes sense to have a path to using BQ ddl in dbt.It is not necessary to call a macro to add part of a query to the config, but it is much cleaner to wrap with a
{{ call }}
than do something like:The implementation of the macro could allow for multiple "wraps" per query, appending to a list, but it could also enforce that there is only one call per query, or anything else. For simplicity, I favor enforcing one call because, if the contents of the call get placed before
create or replace table
, there isn't a huge advantage to interspersing them in your original .sql query.Additional context
This is a BigQuery specific feature that depends on the current use of
create or replace table
to wrap the contents of a query.This is a feature that I definitely plan on implementing, and would love to get suggestions / feedback on to make as useful as possible.
Who will this benefit?
This will benefit BigQuery users who have a need for a feature in BigQuery that is constrained by
create or replace table
, or who might prefer to use SQL / ddl to macros / Jinja.The text was updated successfully, but these errors were encountered: