Describe the feature
Currently, generate_source has an option to set include_descriptions = True, but this parameter only adds descriptions at the column level. Ideally, description placeholders would also be generated for the source and its tables. Additionally, the only required parameter of the generate_source macro is the schema name, and there is no option to pass a separate name value. A user may want to give their source a name that differs from the schema name.
Describe alternatives you've considered
I can manually update the YAML that the current generate_source macro generates, but this is time-consuming and prone to YAML formatting issues.
Additional context
I have working code in my own dbt project that I believe handles both the descriptions at the source/table level and a source name that differs from the schema name. See below.
{% macro get_tables_in_schema(schema_name, database_name=target.database, table_pattern='%', exclude='') %}
    {# Collect every relation in the schema that matches the pattern #}
    {% set tables=dbt_utils.get_relations_by_pattern(
        schema_pattern=schema_name,
        database=database_name,
        table_pattern=table_pattern,
        exclude=exclude
    ) %}
    {% set table_list= tables | map(attribute='identifier') %}
    {{ return(table_list | sort) }}
{% endmacro %}
---
{% macro generate_source(schema_name, name = schema_name, database_name=target.database, generate_columns=False, include_descriptions=False, table_pattern='%', exclude='') %}
    {% set sources_yaml=[] %}
    {% do sources_yaml.append('version: 2') %}
    {% do sources_yaml.append('') %}
    {% do sources_yaml.append('sources:') %}
    {# The source name defaults to the schema name unless `name` is passed #}
    {% do sources_yaml.append('  - name: ' ~ name | lower) %}
    {# Source-level description placeholder #}
    {% if include_descriptions %}
        {% do sources_yaml.append('    description: ""' ) %}
    {% endif %}
    {% if database_name != target.database %}
        {% do sources_yaml.append('    database: ' ~ database_name | lower) %}
    {% endif %}
    {# Only emit `schema:` when the source name differs from the schema name #}
    {% if schema_name != name %}
        {% do sources_yaml.append('    schema: ' ~ schema_name | lower) %}
    {% endif %}
    {% do sources_yaml.append('    tables:') %}
    {% set tables=codegen.get_tables_in_schema(schema_name, database_name, table_pattern, exclude) %}
    {% for table in tables %}
        {% do sources_yaml.append('      - name: ' ~ table | lower ) %}
        {# Table-level description placeholder #}
        {% if include_descriptions %}
            {% do sources_yaml.append('        description: ""' ) %}
        {% endif %}
        {% if generate_columns %}
            {% do sources_yaml.append('        columns:') %}
            {% set table_relation=api.Relation.create(
                database=database_name,
                schema=schema_name,
                identifier=table
            ) %}
            {% set columns=adapter.get_columns_in_relation(table_relation) %}
            {% for column in columns %}
                {% do sources_yaml.append('          - name: ' ~ column.name | lower ) %}
                {# Column-level description placeholder #}
                {% if include_descriptions %}
                    {% do sources_yaml.append('            description: ""' ) %}
                {% endif %}
            {% endfor %}
            {% do sources_yaml.append('') %}
        {% endif %}
    {% endfor %}
    {% if execute %}
        {% set joined = sources_yaml | join ('\n') %}
        {{ log(joined, info=True) }}
        {% do return(joined) %}
    {% endif %}
{% endmacro %}
A user would then be able to run something like this in the Cloud IDE to generate a more comprehensive source YAML:
{{ codegen.generate_source('tpch_sf001', name = 'tpch', database_name = 'raw', generate_columns = True, include_descriptions = True) }}
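For illustration, here is a rough sketch of the YAML I would expect the proposed macro to log for that call. The table and column names are hypothetical stand-ins for whatever exists in the schema, and the `database:` line assumes the target database is something other than `raw`.

```yaml
version: 2

sources:
  - name: tpch                 # from the new `name` argument
    description: ""            # new source-level placeholder
    database: raw              # only emitted when it differs from target.database
    schema: tpch_sf001         # only emitted when `name` differs from the schema
    tables:
      - name: customer         # hypothetical table
        description: ""        # new table-level placeholder
        columns:
          - name: c_custkey    # hypothetical column
            description: ""
          - name: c_name       # hypothetical column
            description: ""

      - name: orders           # hypothetical table
        description: ""
        columns:
          - name: o_orderkey   # hypothetical column
            description: ""
          - name: o_custkey    # hypothetical column
            description: ""
```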
Who will this benefit?
This will benefit anyone setting up new sources for the first time in their dbt project, and it will encourage those users to fill in descriptions at the source and table levels, improving their documentation. It will also eliminate confusion when a user passes include_descriptions = True without generate_columns = True.
Currently, the following command:
{{ codegen.generate_source('tpch_sf001', database_name = 'raw', include_descriptions = True) }}
generates YAML with no descriptions at all.
As a user, I would expect this to still generate descriptions at the source and table level.
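As a sketch of that expectation, the proposed macro should produce something along these lines when include_descriptions = True is passed without generate_columns. The table names are hypothetical, and `database:` again assumes the target database is not `raw`.

```yaml
version: 2

sources:
  - name: tpch_sf001           # `name` defaults to the schema name, so no `schema:` line
    description: ""            # source-level placeholder
    database: raw
    tables:
      - name: customer         # hypothetical table
        description: ""        # table-level placeholder
      - name: orders           # hypothetical table
        description: ""
```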
Are you interested in contributing this feature?
Yes, I would love to contribute to this feature! I have some working code locally, but would appreciate a hand getting this into the dbt-codegen repo the right way!