Enable & pass dbt-adapter-tests #16

Merged · 19 commits · Sep 25, 2020
17 changes: 12 additions & 5 deletions README.md
@@ -12,17 +12,16 @@ Passing all tests in [dbt-integration-tests](https://github.com/fishtown-analytics/dbt-integration-tests/).
- must first create `EXTERNAL DATA SOURCE` and `EXTERNAL FILE FORMAT`s.

## status & support
-as of now, only support for dbt `0.18.0`, support for forthcoming `0.18.0` in development
+as of now, only support for dbt `0.18.0`

-Passing all tests in [dbt-integration-tests](https://github.com/fishtown-analytics/dbt-integration-tests/).
+Passing all tests in [dbt-adapter-tests](https://github.com/fishtown-analytics/dbt-adapter-tests), except `test_dbt_ephemeral_data_tests`

### outstanding work:
-- switch to [`dbt-adapter-tests`](https://github.com/fishtown-analytics/dbt-adapter-tests)
-- test incremental materializations more thoroughly than is done with [`dbt-integration-tests`](https://github.com/fishtown-analytics/dbt-integration-tests/).
-- Add support for `ActiveDirectoryMsi`
- `ephemeral` materializations (workaround for non-recursive CTEs)
- auto-create `EXTERNAL DATA SOURCE` and `EXTERNAL FILE FORMAT`s.
- [officially rename the adapter from `sqlserver` to `synapse`](https://github.com/swanderz/dbt-synapse/pull/6)
+- Use CTAS to create seeds?
+- Add support for `ActiveDirectoryMsi`

## Installation
Easiest install is to use pip (not yet registered on PyPI).
@@ -145,6 +144,14 @@ sources:
```
## Changelog

### v0.18.0.1
- pull AD auth directly from `dbt-sqlserver` (https://github.com/swanderz/dbt-synapse/pull/13)
- hotfix for broken `create_view()` macro (https://github.com/swanderz/dbt-synapse/pull/14)
- get `dbt-adapter-tests` up and running (https://github.com/swanderz/dbt-synapse/pull/16)
- make `sqlserver__drop_schema()` also drop all tables and views associated with schema
- introduce `sqlserver__get_columns_in_query()` for use with testing
- align macro args with `dbt-base`

### v0.18.0rc2

#### Fixes:
92 changes: 92 additions & 0 deletions dbt/adapters/sqlserver/impl.py
@@ -1,6 +1,11 @@
from dbt.adapters.sql import SQLAdapter
from dbt.adapters.sqlserver import SQLServerConnectionManager
from dbt.adapters.base.relation import BaseRelation
from dbt.exceptions import RuntimeException
import agate
from typing import (
Optional, Tuple, Callable, Iterable, Type, Dict, Any, List, Mapping,
Iterator, Union, Set
)


class SQLServerAdapter(SQLAdapter):
@@ -34,3 +39,90 @@ def convert_number_type(cls, agate_table, col_idx):
    @classmethod
    def convert_time_type(cls, agate_table, col_idx):
        return "datetime"

    # Methods used in adapter tests
    def timestamp_add_sql(
        self, add_to: str, number: int = 1, interval: str = "hour"
    ) -> str:
        # note: 'interval' is not supported for T-SQL
        # for backwards compatibility, we're compelled to set some sort of
        # default. A lot of searching has led me to believe that the
        # '+ interval' syntax used in postgres/redshift is relatively common
        # and might even be the SQL standard's intention.
        return f"DATEADD({interval},{number},{add_to})"

    def string_add_sql(
        self, add_to: str, value: str, location='append',
    ) -> str:
        """
        `+` is T-SQL's string concatenation operator
        """
        if location == 'append':
            return f"{add_to} + '{value}'"
        elif location == 'prepend':
            return f"'{value}' + {add_to}"
        else:
            raise RuntimeException(
                f'Got an unexpected location value of "{location}"'
            )
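Again purely illustrative, with invented inputs: T-SQL's `+` operator concatenates strings, so the two branches render as follows.

```python
# Standalone sketch of string_add_sql()'s two branches; inputs are made up.
add_to, value = "last_name", "_tmp"
print(f"{add_to} + '{value}'")   # append:  last_name + '_tmp'
print(f"'{value}' + {add_to}")   # prepend: '_tmp' + last_name
```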

    def get_rows_different_sql(
        self,
        relation_a: BaseRelation,
        relation_b: BaseRelation,
        column_names: Optional[List[str]] = None,
        except_operator: str = "EXCEPT",
    ) -> str:
        """
        Generate SQL for a query that returns a single row with two
        columns: the difference in row counts between the two relations,
        and the number of mismatched rows.
        note: the `USING` join clause is not supported on Synapse, so
        COLUMNS_EQUAL_SQL is adjusted to use `join ... on` instead.
        """
        # This method only really exists for test reasons.
        names: List[str]
        if column_names is None:
            columns = self.get_columns_in_relation(relation_a)
            names = sorted((self.quote(c.name) for c in columns))
        else:
            names = sorted((self.quote(n) for n in column_names))
        columns_csv = ", ".join(names)

        sql = COLUMNS_EQUAL_SQL.format(
            columns=columns_csv,
            relation_a=str(relation_a),
            relation_b=str(relation_b),
            except_op=except_operator,
        )

        return sql


COLUMNS_EQUAL_SQL = """
with diff_count as (
SELECT
1 as id,
COUNT(*) as num_missing FROM (
(SELECT {columns} FROM {relation_a} {except_op}
SELECT {columns} FROM {relation_b})
UNION ALL
(SELECT {columns} FROM {relation_b} {except_op}
SELECT {columns} FROM {relation_a})
) as a
), table_a as (
SELECT COUNT(*) as num_rows FROM {relation_a}
), table_b as (
SELECT COUNT(*) as num_rows FROM {relation_b}
), row_count_diff as (
select
1 as id,
table_a.num_rows - table_b.num_rows as difference
from table_a, table_b
)
select
row_count_diff.difference as row_count_difference,
diff_count.num_missing as num_mismatched
from row_count_diff
join diff_count on row_count_diff.id = diff_count.id
""".strip()
31 changes: 27 additions & 4 deletions dbt/include/sqlserver/macros/adapters.sql
@@ -2,6 +2,15 @@
information_schema
{%- endmacro %}

{% macro sqlserver__get_columns_in_query(select_sql) %}
  {% call statement('get_columns_in_query', fetch_result=True, auto_begin=False) -%}
    select TOP 0 * from (
      {{ select_sql }}
    ) as __dbt_sbq
  {% endcall %}
  {{ return(load_result('get_columns_in_query').table.columns | map(attribute='name') | list) }}
{% endmacro %}
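The macro relies on the `TOP 0` trick: the inner query is compiled and executed but returns zero rows, while the driver still reports the result set's column metadata. A rough standalone equivalent, assuming a working `pyodbc` connection (the DSN, credentials, and query below are invented):

```python
import pyodbc

conn = pyodbc.connect("DSN=synapse;UID=user;PWD=secret")  # assumed credentials
cursor = conn.cursor()

# TOP 0 executes the query shape without materializing any rows...
cursor.execute("select TOP 0 * from (select 1 as id, 'a' as name) as __dbt_sbq")

# ...but cursor.description still carries the column names.
print([col[0] for col in cursor.description])  # ['id', 'name']
```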
Comment on lines +5 to +12 (@dataders, Contributor/Author, Oct 26, 2020):
@mikaelene, this is one macro we had to overwrite. Perhaps this is the error you were seeing?


{% macro sqlserver__list_relations_without_caching(schema_relation) %}
  {% call statement('list_relations_without_caching', fetch_result=True) -%}
    select
@@ -36,9 +45,24 @@
  {% endcall %}
{% endmacro %}

-{% macro sqlserver__drop_schema(database_name, schema_name) -%}
+{% macro sqlserver__drop_schema(relation) -%}
+  {%- set tables_in_schema_query %}
+    SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
+    WHERE TABLE_SCHEMA = '{{ relation.schema }}'
+  {% endset %}
+  {% set tables_to_drop = run_query(tables_in_schema_query).columns[0].values() %}
+  {% for table in tables_to_drop %}
+    {%- set schema_relation = adapter.get_relation(database=relation.database,
+                                                   schema=relation.schema,
+                                                   identifier=table) -%}
+    {% do drop_relation(schema_relation) %}
+  {%- endfor %}
+
  {% call statement('drop_schema') -%}
-    drop schema if exists {{ relation.without_identifier().schema }}
+    IF EXISTS (SELECT * FROM sys.schemas WHERE name = '{{ relation.schema }}')
+    BEGIN
+      EXEC('DROP SCHEMA {{ relation.schema }}')
+    END
Comment on lines +48 to +65 (@dataders, Contributor/Author):
@mikaelene here is the change we made that @jtcohen6 mentioned in dbt-msft/dbt-sqlserver#54 (comment).
In our adapters we never use drop_schema(), nor does dbt-integration-tests, so we never knew it was problematic.
  {% endcall %}
{% endmacro %}
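For context (not from the PR): plain `DROP SCHEMA` fails in T-SQL while any object still lives in the schema, which is why the macro drops every table and view first. A rough standalone sketch of the same sequence over `pyodbc`, with invented names:

```python
def drop_schema_cascade(cursor, schema: str) -> None:
    # Mirrors the macro: enumerate the schema's objects, drop each, then the schema.
    cursor.execute(
        "SELECT TABLE_NAME, TABLE_TYPE FROM INFORMATION_SCHEMA.TABLES "
        "WHERE TABLE_SCHEMA = ?",
        schema,
    )
    for name, table_type in cursor.fetchall():
        kind = "VIEW" if table_type == "VIEW" else "TABLE"
        cursor.execute(f"DROP {kind} [{schema}].[{name}]")
    cursor.execute(
        f"IF EXISTS (SELECT * FROM sys.schemas WHERE name = '{schema}') "
        f"BEGIN EXEC('DROP SCHEMA [{schema}]') END"
    )
```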

@@ -71,9 +95,8 @@
end
{% endmacro %}

-{% macro sqlserver__check_schema_exists(database, schema) -%}
+{% macro sqlserver__check_schema_exists(information_schema, schema) -%}
  {% call statement('check_schema_exists', fetch_result=True, auto_begin=False) -%}
-    --USE {{ database_name }}
    SELECT count(*) as schema_exist FROM sys.schemas WHERE name = '{{ schema }}'
  {%- endcall %}
  {{ return(load_result('check_schema_exists').table) }}
@@ -24,8 +24,8 @@
  {%- set strategy_name = config.get('strategy') -%}
  {%- set unique_key = config.get('unique_key') %}

-  {% if not adapter.check_schema_exists(model.database, model.schema) %}
-    {% do create_schema(model.database, model.schema) %}
+  {% if not check_schema_exists(information_schema, model.schema) %}
+    {% do create_schema(model) %}
  {% endif %}

  {% set target_relation_exists, target_relation = get_or_create_relation(
8 changes: 0 additions & 8 deletions dbtSynapse.yml

This file was deleted.

7 changes: 5 additions & 2 deletions requirements.txt
@@ -1,3 +1,6 @@
-dbt-sqlserver==0.15.2
-dbt==0.15.3
+dbt-core~=0.18.0
+pyodbc>=4.0.27
+azure-identity>=1.4.0
+black~=20.8b1
+git+https://github.com/fishtown-analytics/dbt-adapter-tests.git
.
2 changes: 1 addition & 1 deletion setup.py
@@ -4,7 +4,7 @@


package_name = "dbt-synapse"
package_version = "0.18.0rc3"
package_version = "0.18.0.1"
description = """A Azure Synapse adpter plugin for dbt (data build tool)"""

authors_list = ["Nandan Hegde", "Anders Swanson"]
20 changes: 20 additions & 0 deletions test/integration/sqlserver.dbtspec
@@ -0,0 +1,20 @@
target:
  type: sqlserver
  driver: "ODBC Driver 17 for SQL Server"
  schema: "dbt_test_{{ var('_dbt_random_suffix') }}"
  host: host
  database: database
  username: username
  password: password
  port: port
  threads: 8
sequences:
  test_dbt_empty: empty
  test_dbt_base: base
  test_dbt_ephemeral: ephemeral
  test_dbt_incremental: incremental
  test_dbt_snapshot_strategy_timestamp: snapshot_strategy_timestamp
  test_dbt_snapshot_strategy_check_cols: snapshot_strategy_check_cols
  test_dbt_data_test: data_test
  test_dbt_schema_test: schema_test
  # test_dbt_ephemeral_data_tests: data_test_ephemeral_models
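With the spec in place, the suite is typically driven by the pytest plugin that ships with `dbt-adapter-tests` (pulled in via requirements.txt above), along the lines of `pytest test/integration/sqlserver.dbtspec`; the commented-out `test_dbt_ephemeral_data_tests` sequence matches the one known failure called out in the README.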