Fixes the issue running DBT against Panoply #1674

gautam-ndk

Panoply cleans up queries before executing them. Apparently there is a bug in its cleansing system that throws a syntax error because of the position of the comment.

I changed the comment's position slightly, and now I can run dbt run against Panoply.

@drewbanin
Contributor

Thanks @gautam-ndk! This is a very strange Panoply issue to me... the "jinja comments" inside of {# ... #} should be compiled away, so these comments won't even be included in the query that gets sent to Panoply. I think the reason why this change has any effect on Panoply is because it changes the amount of whitespace before the select statement. Before, this SQL compiled to something like:


    select ...

whereas after this PR, it will be:

select ...

So, you know, that's bonkers :)
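To make the whitespace point concrete, here is a minimal sketch of how removing a comment can change what precedes the first SQL keyword. It assumes a simple regex as a stand-in for Jinja's comment handling (it is not how Jinja is implemented), and the two template strings are hypothetical, not dbt's actual macro source.

```python
import re

# Stand-in for Jinja's comment removal: drop every {# ... #} block.
# Assumption: this regex only illustrates the effect, it is not Jinja's implementation.
JINJA_COMMENT = re.compile(r"\{#.*?#\}", re.DOTALL)


def strip_jinja_comments(template: str) -> str:
    """Return the template text with all {# ... #} comments removed."""
    return JINJA_COMMENT.sub("", template)


# Hypothetical templates: the comment placed before vs. after the query.
comment_first = "{# a note about this query #}\n    select 1 as id"
comment_last = "select 1 as id\n{# a note about this query #}"

compiled_first = strip_jinja_comments(comment_first)
compiled_last = strip_jinja_comments(comment_last)

# repr() makes the leading newline and indentation visible.
print(repr(compiled_first))
print(repr(compiled_last))
```

The comment text never reaches the database in either case; only the whitespace around the select differs.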

You mentioned in #1479 that you're still seeing syntax errors while executing dbt docs generate. Can you tell me what those errors say? I think it will be an equivalently tiny change, and I'd love to fix that one at the same time!

Thanks again for submitting this PR!

@gautam-ndk
Author

Thanks for checking it out @drewbanin. Below is the exact error I see for dbt docs generate:

18:52:14 | Concurrency: 4 threads (target='panoply-dev')
18:52:14 | 
18:52:19 | Done.
18:52:23 | Building catalog
Encountered an error:
Database Error
  syntax error at or near "with"
  LINE 49:         table_catalog as table_database,
           ^

Below is the exact query I see with the -d option:

   

    with late_binding as (
      select
        '<my_warehouse_name>'::varchar as table_database,
        table_schema,
        table_name,
        'LATE BINDING VIEW'::varchar as table_type,
        null::text as table_comment,

        column_name,
        column_index,
        column_type,
        null::text as column_comment
      from pg_get_late_binding_view_cols()
        cols(table_schema name, table_name name, column_name name,
             column_type varchar,
             column_index int)
        order by "column_index"
    ),

    table_owners as (

        select
            '<my_warehouse_name>'::varchar as table_database,
            schemaname as table_schema,
            tablename as table_name,
            tableowner as table_owner

        from pg_tables

        union all

        select
            '<my_warehouse_name>'::varchar as table_database,
            schemaname as table_schema,
            viewname as table_name,
            viewowner as table_owner

        from pg_views

    ),

    tables as (

      select
        table_catalog as table_database,
        table_schema,
        table_name,
        table_type

      from information_schema.tables

    ),

    columns as (

        select
            '<my_warehouse_name>'::varchar as table_database,
            table_schema,
            table_name,
            null::varchar as table_comment,

            column_name,
            ordinal_position as column_index,
            data_type as column_type,
            null::varchar as column_comment


        from information_schema.columns

    ),

    unioned as (

        select *
        from tables
        join columns using (table_database, table_schema, table_name)

        union all

        select *
        from late_binding

    )

    select *,
        table_database || '.' || table_schema || '.' || table_name as table_id

    from unioned
    join table_owners using (table_database, table_schema, table_name)

    where table_schema != 'information_schema'
      and table_schema not like 'pg_%'

    order by "column_index"

When I tried to run the above query in Panoply's console, it threw an error:

WITH query name "columns" specified more than once

I remember @kevinsanz93 mentioning this in the Slack channel, so I renamed the columns CTE to column_names. The changed query is:

   

    with late_binding as (
      select
        '<my_warehouse_name>'::varchar as table_database,
        table_schema,
        table_name,
        'LATE BINDING VIEW'::varchar as table_type,
        null::text as table_comment,

        column_name,
        column_index,
        column_type,
        null::text as column_comment
      from pg_get_late_binding_view_cols()
        cols(table_schema name, table_name name, column_name name,
             column_type varchar,
             column_index int)
        order by "column_index"
    ),

    table_owners as (

        select
            '<my_warehouse_name>'::varchar as table_database,
            schemaname as table_schema,
            tablename as table_name,
            tableowner as table_owner

        from pg_tables

        union all

        select
            '<my_warehouse_name>'::varchar as table_database,
            schemaname as table_schema,
            viewname as table_name,
            viewowner as table_owner

        from pg_views

    ),

    tables as (

      select
        table_catalog as table_database,
        table_schema,
        table_name,
        table_type

      from information_schema.tables

    ),

    column_names as (

        select
            '<my_warehouse_name>'::varchar as table_database,
            table_schema,
            table_name,
            null::varchar as table_comment,

            column_name,
            ordinal_position as column_index,
            data_type as column_type,
            null::varchar as column_comment


        from information_schema.columns

    ),

    unioned as (

        select *
        from tables
        join column_names using (table_database, table_schema, table_name)

        union all

        select *
        from late_binding

    )

    select *,
        table_database || '.' || table_schema || '.' || table_name as table_id

    from unioned
    join table_owners using (table_database, table_schema, table_name)

    where table_schema != 'information_schema'
      and table_schema not like 'pg_%'

    order by "column_index"

Now the Panoply console accepts the above query. Unfortunately, dbt docs generate still fails with the same error as before.

Is there a way to get the exact query (including whitespace) that is sent to Panoply?

@drewbanin
Contributor

Thanks for the additional info @gautam-ndk! Check out the logs/dbt.log file -- that should be very similar to running dbt with the -d option, but you'll be able to copy the exact whitespace that dbt uses out of this file.
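If it helps to see that whitespace rather than guess at it, a small script can print each line with repr(). This is only a sketch: the helper name visible_whitespace is made up for illustration, and it assumes logs/dbt.log is relative to the directory you run it from.

```python
from pathlib import Path
from typing import List


def visible_whitespace(text: str) -> List[str]:
    """Return each line as a repr() string so spaces, tabs and blank lines are unambiguous."""
    return [repr(line) for line in text.splitlines(keepends=True)]


# Small inline sample, shaped like the start of the catalog query in this thread.
sample = "\n    with late_binding as (\n      select\n"
for shown in visible_whitespace(sample):
    print(shown)

# Assumption: the log sits at logs/dbt.log under the current directory.
log_path = Path("logs/dbt.log")
if log_path.exists():
    log_text = log_path.read_text(encoding="utf-8", errors="replace")
    # Show only the tail of the log, where the most recent query usually is.
    for shown in visible_whitespace(log_text)[-20:]:
        print(shown)
```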

I took a quick look at this yesterday and came up with the following fix. Do you see anything in here that might be interesting to try adding to your branch? https://github.com/fishtown-analytics/dbt/compare/fix/panoply

@gautam-ndk
Author

Hey @drewbanin,
I just installed dbt from your branch and both commands (dbt run & dbt docs generate) seem to be working fine! What could have been the issue?

Also, please go ahead and merge your branch. I will close this PR.

@drewbanin
Contributor

@gautam-ndk you know, this is just a sort of a series of bugs on Panoply's end - I didn't do anything too scientific here! I just spun up a new Panoply Redshift database and tweaked dbt's internal queries until I could get a run to succeed :)

I found that:

  • i needed to quote the information_schema."columns" table
  • i needed to rename a CTE from columns to table_columns
  • we needed to change the placement of that comment in get_relations()

I guess Panoply might have some problem with identifiers named columns? Pretty strange!
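For anyone who wants to try the first two changes against a copy of the query text before touching dbt itself, here is a rough text-level sketch. The function apply_panoply_workarounds is hypothetical and is not part of dbt; the real fix edits dbt's own SQL, and the comment placement in get_relations() is not covered here.

```python
import re


def apply_panoply_workarounds(sql: str) -> str:
    """Apply two of the workarounds from this thread to a query string (illustration only)."""
    # 1. Quote the information_schema.columns table name.
    sql = re.sub(r"\binformation_schema\.columns\b", 'information_schema."columns"', sql)
    # 2. Rename the CTE called columns to table_columns, at its definition...
    sql = re.sub(r"\bcolumns(\s+as\s*\()", r"table_columns\1", sql)
    # ...and where it is joined.
    sql = re.sub(r"\bjoin\s+columns\b", "join table_columns", sql)
    return sql


# Shortened, hypothetical excerpt shaped like the catalog query above.
sample_sql = """
    columns as (
        select column_name
        from information_schema.columns
    ),
    unioned as (
        select * from tables
        join columns using (table_name)
    )
"""

patched_sql = apply_panoply_workarounds(sample_sql)
print(patched_sql)
```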

@drewbanin
Contributor

drewbanin commented Aug 16, 2019

re-opened in #1686

Thanks so much for your help here @gautam-ndk!!

@drewbanin drewbanin closed this Aug 16, 2019