Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accepted_values test doesn't work in all database engines #3905

Closed
1 of 5 tasks
sdebruyn opened this issue Sep 17, 2021 · 2 comments · Fixed by #3906
Closed
1 of 5 tasks

accepted_values test doesn't work in all database engines #3905

sdebruyn opened this issue Sep 17, 2021 · 2 comments · Fixed by #3906
Labels
bug Something isn't working

Comments

@sdebruyn
Copy link
Contributor

Describe the bug

The accepted_values test produces SQL with group by 1. The group by index syntax is not ANSI SQL and not supported in all database engines. This should be avoided in shared code like tests.

Steps To Reproduce

Put an accepted_values test in your code and connect to a MS SQL database.

Expected behavior

Test checks for the values.

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: MS SQL)

The output of dbt --version:

dbt --version
installed version: 0.20.2
   latest version: 0.20.2

Up to date!

Plugins:
  - bigquery: 0.20.2
  - snowflake: 0.20.2
  - redshift: 0.20.2
  - postgres: 0.20.2
  - sqlserver: 0.20.1
  - synapse: 0.20.0

The operating system you're using: macOS

The output of python --version: Python 3.9.7

@sdebruyn sdebruyn added bug Something isn't working triage labels Sep 17, 2021
@jtcohen6
Copy link
Contributor

@sdebruyn Thanks for opening!

The best solution for this is dispatch, which allows database plugins to reimplement macros (such as test_accepted_values, called by the built-in test) in a way that suits their SQL syntax.

That said, I agree that we should work toward ANSI SQL support for the default__ versions of macros where it's both possible and easy to do. This feels like one such case.

@artamoshin
Copy link

I faced with a same issue with ClickHouse, so I checked some DBMS for "group by" syntax:

create table sample_table (
    sample_column int
);

select
    sample_column as value_field,
    count(*) as n_records
from sample_table
group by 1;
DBMS group by 1 group by value_field group by {{ column_name }}
Postgres ✔️ ✔️ ✔️
Redshift ✔️ ✔️ ✔️
BigQuery ✔️ ✔️ ✔️
Snowflake ✔️ ✔️ ✔️
Databricks ✔️ ✔️ ✔️
Presto ✔️ ✔️
SQL Server ✔️
Exasol ✔️ ✔️
Oracle ✔️
Dremio ✔️ ✔️ ✔️
ClickHouse ✔️ ✔️
Greenplum ✔️ ✔️ ✔️
Vertica ✔️ ✔️ ✔️
MySQL ✔️ ✔️ ✔️
SQLite ✔️ ✔️ ✔️
Presto: Query 20210929_223604_00017_agqds failed: line 1:87: Column 'value_field' cannot be resolved
SQL Server: [S0001][164] Each GROUP BY expression must contain at least one column that is not an outer reference.
SQL Server: [S0001][207] Invalid column name 'value_field'.
Exasol: [42000] object VALUE_FIELD not found [line 5, column 10] (Session: 1712278724803428352)
Oracle: ORA-00979: not a GROUP BY expression
Oracle: ORA-00904: "VALUE_FIELD": invalid identifier
ClickHouse: Code: 215. DB::Exception: Column `sample_column` is not under aggregate function and not in GROUP BY: While processing sample_column AS value_field, count() AS n_records. (NOT_AN_AGGREGATE) (version 21.9.4.35 (official build))

As you see, group by {{ column_name }} is most universal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants