-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Release] Version 0.8.1: Current_timestamp macro updates, pull request join fix #58
Conversation
update int_github__pull_request_times to not drop pull requests witho…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fivetran-avinash these changes look good to me! I have a small suggestion to add more details to a CHANGELOG entry as well as adding two new aggregate tests for the end models impacted by these changes. I already tested the row counts locally and they passed so no need to block approval before those updates are applied.
Once all updates are applied this will be ready for release review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also add a test that explicitly checks the counts (and in the future other aggregates) of the impacted end model of this PR. For example, I would want to make sure that going forward the total row count matches between dev and prod. Something like the following would be great.
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}
with prod as (
select
1 as join_key,
count(*) as total_prod_pr_count
from {{ target.schema }}_github_prod.github__pull_requests
group by 1
),
dev as (
select
1 as join_key,
count(*) as total_dev_pr_count
from {{ target.schema }}_github_dev.github__pull_requests
group by 1
),
final as (
select
total_prod_pr_count,
total_dev_pr_count
from prod
full outer join dev
on dev.join_key = prod.join_key
)
select *
from final
where total_dev_pr_count != total_prod_pr_count
As we found in the past the prod_not_in_dev and dev_not_in_prod approach doesn't always capture missing records as well as it does different records. So this added test should help close that gap!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A similar aggregate test for the issue table would be great to add as well.
CHANGELOG.md
Outdated
[PR #58](https://github.com/fivetran/dbt_github/pull/58) contains the following updates: | ||
|
||
## Bug Fixes | ||
- Replaced the existing `dbt_current_timestamp.backcompat()` with the up-to-date `dbt_current_timestamp` macro to generate the current timestamp for customers. [PR #58](https://github.com/fivetran/dbt_github/pull/58) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a bit more to this entry explaining why this was an issue in the past and how this updated addresses that bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fivetran-avinash Left a couple suggestions!
CHANGELOG.md
Outdated
This release contains the following updates: | ||
|
||
## Bug Fixes | ||
- Replaced the existing `dbt_current_timestamp.backcompat()` with the up-to-date `dbt_current_timestamp` macro. The existing macro was not always returning the system timezone timestamp rather than the expected UTC value, causing negative downstream measures for open issues like `days_issue_open`. [PR #58](https://github.com/fivetran/dbt_github/pull/58) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little bit hard to understand. Maybe try:
- Replaced the existing `dbt_current_timestamp.backcompat()` with the up-to-date `dbt_current_timestamp` macro. The existing macro was not always returning the system timezone timestamp rather than the expected UTC value, causing negative downstream measures for open issues like `days_issue_open`. [PR #58](https://github.com/fivetran/dbt_github/pull/58) | |
- Replaced the deprecated `dbt_current_timestamp.backcompat()` macro with the up-to-date `dbt_current_timestamp`. The deprecated macro occasionally returned the system timezone instead of the expected UTC timestamp, leading to incorrect downstream metrics like negative values for `days_issue_open`. [PR #58](https://github.com/fivetran/dbt_github/pull/58) |
CHANGELOG.md
Outdated
|
||
## Bug Fixes | ||
- Replaced the existing `dbt_current_timestamp.backcompat()` with the up-to-date `dbt_current_timestamp` macro. The existing macro was not always returning the system timezone timestamp rather than the expected UTC value, causing negative downstream measures for open issues like `days_issue_open`. [PR #58](https://github.com/fivetran/dbt_github/pull/58) | ||
- Updated the join type in `int_github__pull_request_times` to not drop pull requests without explicit reviewers requested. [PR #57](https://github.com/fivetran/dbt_github/pull/57) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Updated the join type in `int_github__pull_request_times` to not drop pull requests without explicit reviewers requested. [PR #57](https://github.com/fivetran/dbt_github/pull/57) | |
- Updated the join type in `int_github__pull_request_times` to ensure pull requests without explicitly requested reviewers are no longer dropped. | |
[PR #57](https://github.com/fivetran/dbt_github/pull/57) |
CHANGELOG.md
Outdated
- Updated the join type in `int_github__pull_request_times` to not drop pull requests without explicit reviewers requested. [PR #57](https://github.com/fivetran/dbt_github/pull/57) | ||
|
||
## Under the Hood: | ||
- Added consistency tests for `github__issues` and `github__pull_requests` to ensure new changes don't change the output of either model. There are conditions. [PR #58](https://github.com/fivetran/dbt_github/pull/58) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are conditions.
Reads a bit cryptic 😆. Can you expand on what this means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh god. Fixing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fivetran-catfritz release review notes addressed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Lgtm.
PR Overview
This PR will address the following Issue/Feature: [#56], plus current timestamp issue reported in Databricks from a customer.
This PR will result in the following new package version: 0.8.1
No fields changed, just underlying logic.
Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:
Bug Fixes
dbt_current_timestamp.backcompat()
with the up-to-datedbt_current_timestamp
macro to generate the current timestamp for customers. PR #58int_github__pull_request_times
to not drop pull requests without explicit reviewers requested. PR #57Under the Hood:
github__issues
andgithub__pull_requests
to ensure new changes don't change the output of either model. There are conditions. PR #58Contributors
PR Checklist
Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
Before marking this PR as "ready for review" the following have been applied:
Detailed Validation
Please share any and all of your validation steps:
Validated that the new
dbt.current_timestamp
macro function in Databricks (which was where the customer reported the error) works with UTC. This was taken at 15:13 UTC time:Validation tests were checked on development data and integration tests.
Note: Some fields in the row changes were commented out, as they relied on the current timestamp.
Also string aggregated fields don't always have the same order, so those fields were also excluded from the row comp.
If you had to summarize this PR in an emoji, which would it be?
🐛