Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated generate_model_yaml macro to correctly handle nested bigquery… #54

Merged
merged 5 commits into from
Jun 23, 2022

Conversation

bodschut
Copy link
Contributor

@bodschut bodschut commented Feb 2, 2022

This is a:

  • bug fix PR with no breaking changes — please ensure the base branch is main
  • new functionality — please ensure the base branch is the latest dev/ branch
  • a breaking change — please ensure the base branch is the latest dev/ branch

Description & motivation

Fixes #27

Checklist

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to CHANGELOG.md

@joellabes
Copy link
Contributor

joellabes commented Mar 22, 2022

Hi @bodschut, thanks for opening this - as you noted in #27, we'll definitely want some integration tests to ensure everything's working correctly.

Have a look at https://github.com/dbt-labs/dbt-codegen/tree/main/integration_tests/tests for some examples - you can create another source table following this pattern - you'll need to only run it if the current adapter is BigQuery of course (see target.type)

Copy link
Contributor

@dbeatty10 dbeatty10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bodschut thanks for submitting this! 🏆

I added some tests and everything looks good ✅

@dbeatty10 dbeatty10 merged commit 40c77d4 into dbt-labs:main Jun 23, 2022
{% set column_name = column.name %}
{% endif %}

{% do model_yaml.append(' - name: ' ~ column.name | lower ) %}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for poking an old PR but shouldn't this line read(?):

{% do model_yaml.append('      - name: ' ~ column_name  | lower ) %}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure one way or the other @Zatte

@bodschut Some questions:

  1. What do you think of @Zatte 's question about column_name versus column.name?
  2. Is there something more descriptive than column_name that would be accurate, like aliased_parent_column_name?
  3. Is parent_column_name even used as a parameter? It's not explicitly set here, and the generate_column_yaml looks like a "private" macro internal to codegen rather than a "public" macro intended to be used by consumers.

Copy link

@Zatte Zatte Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context; I use generate_model_yaml-source (which calls this private macro) on a nested struct in BigQuery

With the current state i only get "tail columns" as the column names (not the full parent hierarchy, dot delimited). I would expect the column names to contain the full parent hierarchy.

Example.
SQL

SELECT 
   struct(
      "a" as a, 
      "b" as b
   ) as substruct

Current output

models:
  - name: model_struct
    description: ""
    columns:
      - name: a
        description: ""

      - name: b
        description: ""

Expected output (with column_name change)

models:
  - name: model_struct
    description: ""
    columns:
      - name: substruct.a
        description: ""

      - name: substruct.b
        description: ""

The current output will be ambiguous in the case of multiple sub-structs (nested columns) with the same name like this.

SELECT 
   struct(
      "a1" as a, 
      "b1" as b
   ) as substruct1, 
   struct(
      "a2" as a, 
      "b2" as b
   ) as substruct2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This info is perfect @Zatte !

I was able to reproduce and see that your suggested fix works.

Opened this issue and will follow-up with a PR:
#98

I took a peek at the commit history on this -- looks like @bodschut had it correct the first time around, and I accidentally foobar'd it when resolving a merge conflict 😬

jeremyholtzman pushed a commit that referenced this pull request Apr 10, 2023
#54)

* Updated generate_model_yaml macro to correctly handle nested bigquery fields

* Updated changelog.md

* Update macros/generate_model_yaml.sql

* Integration tests

Co-authored-by: Bob De Schutter <bob.deschutter@digitalswat.be>
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
Co-authored-by: Doug Beatty <doug.beatty@dbtlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generate_model_yaml doesn't recognize nested BigQuery columns
4 participants