Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Release] Version 0.8.1: Current_timestamp macro updates, pull request join fix #58

Merged
merged 6 commits into from
Oct 23, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,6 @@ logs/
keyfile.json
.DS_Store
develop/
dbt_packages/
dbt_packages/
env/
package-lock.yml
16 changes: 15 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,18 @@
# dbt_github v0.8.1
This release contains the following updates:

## Bug Fixes
- Replaced the deprecated `dbt_current_timestamp.backcompat()` macro with the up-to-date `dbt_current_timestamp`. The deprecated macro occasionally returned the system timezone instead of the expected UTC timestamp, leading to incorrect downstream metrics like negative values for `days_issue_open`. [PR #58](https://github.com/fivetran/dbt_github/pull/58)
- Updated the join type in `int_github__pull_request_times` to ensure pull requests without explicitly requested reviewers are no longer dropped. [PR #57](https://github.com/fivetran/dbt_github/pull/57)

## Under the Hood:
- Added consistency tests for `github__issues` and `github__pull_requests` to ensure new changes don't change the output of either model. (Some measures are omitted from the comparison tests, since they're measures based on the `current_timestamp`, which differs between validation test runs). [PR #58](https://github.com/fivetran/dbt_github/pull/58)

## Contributors
- [@samkessaram](https://github.com/samkessaram) ([PR #57](https://github.com/fivetran/dbt_github/pull/57))

# dbt_github v0.8.0
[PR #53](https://github.com/fivetran/dbt_jira/dbt_github/53) contains the following updates:
[PR #53](https://github.com/fivetran/dbt_github/pull/53) contains the following updates:

## 🚨 Breaking Change 🚨
- For consistency with other Fivetran packages, added default target schemas in `dbt_project.yml`. This is a breaking change since the model outputs will now be stored in a schema called `<your target schema>_github` by default. You will need to update any of your downstream use cases to point to the new schema.
Expand Down Expand Up @@ -80,6 +93,7 @@

## Contributors
- [@jackiexsun](https://github.com/jackiexsun) ([#31](https://github.com/fivetran/dbt_github/pull/31))

# dbt_github v0.5.0
## 🚨 Breaking Changes 🚨
- The addition of the `label` source model results in the reference within `int_github__issue_label` to break. As a result, with the addition of upstream changes within `dbt_github_source` and the new `int_github__issue_label_join` model this issue has been resolved. ([#26](https://github.com/fivetran/dbt_github/pull/26))
Expand Down
2 changes: 1 addition & 1 deletion dbt_project.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
config-version: 2
name: 'github'
version: '0.8.0'
version: '0.8.1'
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
github:
Expand Down
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

37 changes: 5 additions & 32 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion docs/run_results.json

This file was deleted.

10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: github_integration_tests
schema: github_integration_tests_1
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: github_integration_tests
schema: github_integration_tests_1
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
Expand All @@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: github_integration_tests
schema: github_integration_tests_1
threads: 8
postgres:
type: postgres
Expand All @@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: github_integration_tests
schema: github_integration_tests_1
threads: 8
databricks:
catalog: null
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: github_integration_tests
schema: github_integration_tests_1
threads: 2
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
8 changes: 6 additions & 2 deletions integration_tests/dbt_project.yml
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
name: 'github_integration_tests'
version: '0.8.0'
version: '0.8.1'
config-version: 2
profile: 'integration_tests'
vars:
github_source:
github_schema: github_integration_tests
github_schema: github_integration_tests_1
github_issue_assignee_identifier: "github_issue_assignee_data"
github_issue_closed_history_identifier: "github_issue_closed_history_data"
github_issue_comment_identifier: "github_issue_comment_data"
Expand All @@ -19,6 +19,10 @@ vars:
github_requested_reviewer_history_identifier: "github_requested_reviewer_history_data"
github_team_identifier: "github_team_data"
github_user_identifier: "github_user_data"

models:
+schema: "github_{{ var('directed_schema','dev') }}"

seeds:
github_integration_tests:
+quote_columns: "{{ true if target.type == 'redshift' else false }}"
Expand Down
2 changes: 1 addition & 1 deletion integration_tests/packages.yml
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
packages:
- local: ../
- local: ../
48 changes: 48 additions & 0 deletions integration_tests/tests/consistency_issues.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the github__issues end model matches the prior version
with prod as (
select * except(days_issue_open, labels, repository_team_names, assignees) --the differences in prod/dev run times will lead to discrepancies because it leverages current_timestamp, and string aggs don't always have the same order
from {{ target.schema }}_github_prod.github__issues
where date(updated_at) < date({{ dbt.current_timestamp() }})
),

dev as (
select * except(days_issue_open, labels, repository_team_names, assignees) --the differences in prod/dev run times will lead to slight discrepancies because it leverages current_timestamp, and string aggs don't always order the same
from {{ target.schema }}_github_dev.github__issues
where date(updated_at) < date({{ dbt.current_timestamp() }})
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
33 changes: 33 additions & 0 deletions integration_tests/tests/consistency_issues_counts.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

with prod as (
select
1 as join_key,
count(*) as total_prod_pr_count
from {{ target.schema }}_github_prod.github__issues
group by 1
),

dev as (
select
1 as join_key,
count(*) as total_dev_pr_count
from {{ target.schema }}_github_dev.github__issues
group by 1
),

final as (
select
total_prod_pr_count,
total_dev_pr_count
from prod
full outer join dev
on dev.join_key = prod.join_key
)

select *
from final
where total_dev_pr_count != total_prod_pr_count
48 changes: 48 additions & 0 deletions integration_tests/tests/consistency_pull_requests.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the github__pull_requests end model matches the prior version
with prod as (
select * except(days_issue_open, hours_request_review_to_first_review, hours_request_review_to_first_action, hours_request_review_to_merge) --the differences in prod/dev run times will lead to discrepancies because these fields leverages current_timestamp
from {{ target.schema }}_github_prod.github__pull_requests
where date(updated_at) < date({{ dbt.current_timestamp() }})
),

dev as (
select * except(days_issue_open, hours_request_review_to_first_review, hours_request_review_to_first_action, hours_request_review_to_merge) --the differences in prod/dev run times will lead to discrepancies because these fields leverages current_timestamp
from {{ target.schema }}_github_prod.github__pull_requests
where date(updated_at) < date({{ dbt.current_timestamp() }})
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
33 changes: 33 additions & 0 deletions integration_tests/tests/consistency_pull_requests_counts.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

with prod as (
select
1 as join_key,
count(*) as total_prod_pr_count
from {{ target.schema }}_github_prod.github__pull_requests
group by 1
),

dev as (
select
1 as join_key,
count(*) as total_dev_pr_count
from {{ target.schema }}_github_dev.github__pull_requests
group by 1
),

final as (
select
total_prod_pr_count,
total_dev_pr_count
from prod
full outer join dev
on dev.join_key = prod.join_key
)

select *
from final
where total_dev_pr_count != total_prod_pr_count
2 changes: 1 addition & 1 deletion models/intermediate/int_github__issue_open_length.sql
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ close_events_with_timestamps as (
select
issue_id,
updated_at as valid_starting,
coalesce(lead(updated_at) over (partition by issue_id order by updated_at), {{ dbt.current_timestamp_backcompat() }}) as valid_until,
coalesce(lead(updated_at) over (partition by issue_id order by updated_at), {{ dbt.current_timestamp() }}) as valid_until,
is_closed
from close_events_stacked
)
Expand Down
8 changes: 4 additions & 4 deletions models/intermediate/int_github__pull_request_times.sql
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ first_request_time as (
min(requested_reviewer_history.created_at) as time_of_first_request,
min(pull_request_review.submitted_at) as time_of_first_review_post_request
from pull_request
join requested_reviewer_history on requested_reviewer_history.pull_request_id = pull_request.pull_request_id
left join requested_reviewer_history on requested_reviewer_history.pull_request_id = pull_request.pull_request_id
left join pull_request_review on pull_request_review.pull_request_id = pull_request.pull_request_id
and pull_request_review.submitted_at > requested_reviewer_history.created_at
group by 1, 2
Expand All @@ -50,14 +50,14 @@ select
issue_merged.merged_at,
{{ dbt.datediff(
'time_of_first_request',
"coalesce(time_of_first_review_post_request, " ~ dbt.current_timestamp_backcompat() ~ ")",
"coalesce(time_of_first_review_post_request, " ~ dbt.current_timestamp() ~ ")",
'second')
}}/ 60/60 as hours_request_review_to_first_review,
{{ dbt.datediff(
'time_of_first_request',
"least(
coalesce(time_of_first_requested_reviewer_review, " ~ dbt.current_timestamp_backcompat() ~ "),
coalesce(issue.closed_at, " ~ dbt.current_timestamp_backcompat() ~ "))",
coalesce(time_of_first_requested_reviewer_review, " ~ dbt.current_timestamp() ~ "),
coalesce(issue.closed_at, " ~ dbt.current_timestamp() ~ "))",
'second')
}} / 60/60 as hours_request_review_to_first_action,
{{ dbt.datediff('time_of_first_request', 'merged_at', 'second') }}/ 60/60 as hours_request_review_to_merge
Expand Down