Merge dev/0.6.5 branch into master ahead of release #369

Merged
merged 15 commits into from
May 18, 2021
Add sequential_values schema test (#318)
Claire Carroll authored and clrcrl committed May 18, 2021
commit 8efc00ba192a985c0e0e1d402a70ccc16c60ad31
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -4,6 +4,7 @@
* Make `expression_is_true` work as a column test (code originally in [#226](https://github.com/fishtown-analytics/dbt-utils/pull/226/) from [@elliottohara](https://github.com/elliottohara), merged via [#313])
* Add new schema test, `not_accepted_values` ([#284](https://github.com/fishtown-analytics/dbt-utils/pull/284) [@JavierMonton](https://github.com/JavierMonton))
* Support a new argument, `zero_length_range_allowed` in the `mutually_exclusive_ranges` test ([#307](https://github.com/fishtown-analytics/dbt-utils/pull/307) [@zemekeneng](https://github.com/zemekeneng))
* Add new schema test, `sequential_values` ([#318](https://github.com/fishtown-analytics/dbt-utils/pull/318), inspired by [@hundredwatt](https://github.com/hundredwatt))


## Fixes
28 changes: 28 additions & 0 deletions README.md
@@ -447,6 +447,34 @@ Here are a number of examples for each allowed `zero_length_range_allowed` param
| 2 | 2 |
| 3 | 4 |

#### sequential_values ([source](macros/schema_tests/sequential_values.sql))
This test confirms that a column contains sequential values. It can be used
for both numeric and datetime values, as follows:
```yml
version: 2

seeds:
  - name: util_even_numbers
    columns:
      - name: i
        tests:
          - dbt_utils.sequential_values:
              interval: 2


  - name: util_hours
    columns:
      - name: date_hour
        tests:
          - dbt_utils.sequential_values:
              interval: 1
              datepart: 'hour'
```

**Args:**
* `interval` (default=1): The gap between two sequential values
* `datepart` (default=None): Used when the gaps are a unit of time. If omitted, the test will check for a numeric gap.
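
As a rough illustration of what these two arguments mean (this is not the package's implementation), the check can be sketched in Python. The helper `find_sequence_breaks` and the `_STEPS` mapping are hypothetical names introduced here, and only fixed-length dateparts are covered:

```python
from datetime import datetime, timedelta

# Hypothetical mapping from a datepart name to a fixed-length step.
# Variable-length units such as months and years are deliberately left out.
_STEPS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def find_sequence_breaks(values, interval=1, datepart=None):
    """Return the values that are not exactly one step after their predecessor."""
    # Without a datepart the gap is numeric; with one it is a span of time.
    step = interval if datepart is None else interval * _STEPS[datepart]
    ordered = sorted(values)
    # The first value has no predecessor, so it is never reported.
    return [cur for prev, cur in zip(ordered, ordered[1:]) if cur != prev + step]


# Numeric gap of 2: nothing breaks the sequence.
even_breaks = find_sequence_breaks([2, 4, 6, 8, 10], interval=2)

# Hourly gap: 03:00 is missing, so 04:00 is reported.
hours = [datetime(2021, 1, 1, h) for h in (0, 1, 2, 4)]
hour_breaks = find_sequence_breaks(hours, interval=1, datepart="hour")
```

Here `even_breaks` is empty, while `hour_breaks` contains the 04:00 value because it does not follow 02:00 by one hour.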

#### unique_combination_of_columns ([source](macros/schema_tests/unique_combination_of_columns.sql))
This test confirms that the combination of columns is unique. For example, the
combination of month and product is unique, however neither column is unique
@@ -0,0 +1,5 @@
my_timestamp
2021-01-01 00:00
2021-01-01 01:00
2021-01-01 02:00
2021-01-01 03:00
@@ -0,0 +1,6 @@
my_even_sequence
2
4
6
8
10
18 changes: 18 additions & 0 deletions integration_tests/data/schema_tests/schema.yml
@@ -0,0 +1,18 @@
version: 2

seeds:
  - name: data_test_sequential_values
    columns:
      - name: my_even_sequence
        tests:
          - dbt_utils.sequential_values:
              interval: 2


  - name: data_test_sequential_timestamps
    columns:
      - name: my_timestamp
        tests:
          - dbt_utils.sequential_values:
              interval: 1
              datepart: 'hour'
7 changes: 6 additions & 1 deletion integration_tests/dbt_project.yml
@@ -52,4 +52,9 @@ seeds:

    sql:
      data_events_20180103:
        +schema: events

    schema_tests:
      data_test_sequential_timestamps:
        +column_types:
          my_timestamp: timestamp
34 changes: 34 additions & 0 deletions macros/schema_tests/sequential_values.sql
@@ -0,0 +1,34 @@
{% macro test_sequential_values(model, column_name, interval=1, datepart=None) %}

  {{ return(adapter.dispatch('test_sequential_values', packages=dbt_utils._get_utils_namespaces())(model, column_name, interval, datepart, **kwargs)) }}

{% endmacro %}

{% macro default__test_sequential_values(model, column_name, interval, datepart) %}

with windowed as (

    select
        {{ column_name }},
        lag({{ column_name }}) over (
            order by {{ column_name }}
        ) as previous_{{ column_name }}
    from {{ model }}
),

validation_errors as (
    select
        *
    from windowed
    {% if datepart %}
    where not(cast({{ column_name }} as timestamp) = cast({{ dbt_utils.dateadd(datepart, interval, 'previous_' + column_name) }} as timestamp))
    {% else %}
    where not({{ column_name }} = previous_{{ column_name }} + {{ interval }})
    {% endif %}
)

select
    count(*)
from validation_errors

{% endmacro %}
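
The numeric branch of the macro can be tried outside dbt. The sketch below is an illustration under stated assumptions, not how dbt runs the test: it hand-renders the SQL for a hypothetical table `seed` with a column `i` and `interval=2`, and runs it on an in-memory SQLite database through Python's `sqlite3` module (window functions need SQLite 3.25 or newer).

```python
import sqlite3

# Hypothetical seed data: an even sequence with a gap between 6 and 10.
conn = sqlite3.connect(":memory:")
conn.execute("create table seed (i integer)")
conn.executemany("insert into seed values (?)", [(2,), (4,), (6,), (10,)])

# Hand-rendered version of the macro's numeric branch, assuming
# model = seed, column_name = i, interval = 2.
rendered_sql = """
with windowed as (
    select
        i,
        lag(i) over (order by i) as previous_i
    from seed
),

validation_errors as (
    select *
    from windowed
    where not(i = previous_i + 2)
)

select count(*) from validation_errors
"""

# The query returns a single row holding the number of failing rows.
error_count = conn.execute(rendered_sql).fetchone()[0]
print(error_count)
```

The first row has no predecessor, so `previous_i` is null and the comparison evaluates to null rather than true; that row is therefore filtered out instead of being counted as a failure. Only the row with `i = 10` is counted here, since its predecessor is 6 rather than 8.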