Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/historical schedules #171

Merged
merged 82 commits into from
Oct 10, 2024
Merged

Conversation

fivetran-catfritz
Copy link
Contributor

@fivetran-catfritz fivetran-catfritz commented Sep 11, 2024

PR Overview

This PR will address the following Issue/Feature:

This PR will result in the following new package version:

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

  • v0.18.0 - Addition of schedule history logic, which is enabled by default.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run –full-refresh && dbt test
  • dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

  • See internal ticket

If you had to summarize this PR in an emoji, which would it be?

💃

Copy link
Contributor Author

@fivetran-catfritz fivetran-catfritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz I have made the updates and added more documentation, so this is ready for re-review.

False as is_historical
from schedule

{# {% if var('using_schedule_histories', True) %} #}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a leftover--now removed!

lower(time_zone) as time_zone,
schedule_name,
cast(null as date) as valid_from, -- created_at is when the schedule was first ever created, so we'll fill the real value later
cast({{ dbt.dateadd('year', 1, dbt.current_timestamp_backcompat()) }} as date) as valid_until,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mainly for us or anyone looking at the schedule_spine. If you have holidays scheduled for the future but don't project the schedule into the future, only the timezone switches seems to show up. However as we discovered it was causing issues with schedule joins, so I put it back to the current timestamp.

select
schedule_id,
schedule_id_index,
cast(valid_from as date) as valid_from,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

row_number() over (
partition by schedule_id, cast(valid_from as date)
-- ordering to get the latest change when there are multiple on one day
order by valid_from desc, coalesce(valid_until, {{ dbt.current_timestamp_backcompat() }}) desc
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to dbt.current_timestamp() and everything is working correct. @fivetran-jamie confirmed that this will ensure the timestamps are in UTC.

Comment on lines 93 to 109
-- now that the schedule changes are cleaned, we can split into the individual schedules periods
), split_days as (
{% set days_of_week = {'sun': 0, 'mon': 1, 'tue': 2, 'wed': 3, 'thu': 4, 'fri': 5, 'sat': 6} %}
{% for day, day_number in days_of_week.items() %}
select
schedule_id,
schedule_id_index,
valid_from,
valid_until,
schedule_change,
'{{ day }}' as day_of_week,
cast('{{ day_number }}' as {{ dbt.type_int() }}) as day_of_week_number,
{{ zendesk.regex_extract('schedule_change', day) }} as day_of_week_schedule
from consolidate_same_day_changes

{% if not loop.last %}union all{% endif %}
{% endfor %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, and added!

json_parse('[' || replace(replace(day_of_week_schedule, ', ', ','), ',', '},{') || ']') as json_schedule

from split_days
where day_of_week_schedule != '{}'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It excludes days with no schedule. Updated the comment!

Comment on lines 83 to 91
select
schedule_id,
group_id,
schedule_change,
max(schedule_id_index) as schedule_id_index,
min(valid_from) as valid_from,
max(valid_until) as valid_until
from find_actual_changes
{{ dbt_utils.group_by(3) }}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added both!

Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-catfritz this PR is looking really good! I have a few comments and change requests below. Let me know if you have any questions.

Also to your point from what we discussed in person a few days ago - I see that the validation tests are failing for our consistency tests. However, when comparing with the Zendesk UI, I can see for the failing records that we are much more in line with the Zendesk reporting with your changes vs what is live in Prod. This is a great sign and proving these changes will help address the known issues as well as provide support for a much requested schedule history functionality!

.quickstart/quickstart.yml Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
dbt_project.yml Outdated Show resolved Hide resolved
DECISIONLOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Show resolved Hide resolved
README.md Show resolved Hide resolved
models/intermediate/int_zendesk__schedule_holiday.sql Outdated Show resolved Hide resolved
lower(time_zone) as time_zone,
schedule_name,
cast(null as date) as valid_from, -- created_at is when the schedule was first ever created, so we'll fill this value later
cast({{ dbt.current_timestamp() }} as date) as valid_until,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you mentioned this was intentionally moved to the current timestamp whereas it was looking forward a year before. However, now there are chances where the current schedule only is valid until the current week starting sunday. This means we could be causing tickets in the current week to be excluded from downstream calculations.

For example, this is the most recent record I see for the Central Timezone schedule on 10/9
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if this is intentional, or if we should consider looking forward a bit for applicable schedules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. Added a look forward of a week.

models/intermediate/int_zendesk__schedule_spine.sql Outdated Show resolved Hide resolved
Copy link
Contributor Author

@fivetran-catfritz fivetran-catfritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz Thanks for the review! I have addressed the comments and is ready for re-review.

.quickstart/quickstart.yml Show resolved Hide resolved
CHANGELOG.md Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
DECISIONLOG.md Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
dbt_project.yml Outdated Show resolved Hide resolved
models/intermediate/int_zendesk__schedule_holiday.sql Outdated Show resolved Hide resolved
lower(time_zone) as time_zone,
schedule_name,
cast(null as date) as valid_from, -- created_at is when the schedule was first ever created, so we'll fill this value later
cast({{ dbt.current_timestamp() }} as date) as valid_until,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. Added a look forward of a week.

models/intermediate/int_zendesk__schedule_spine.sql Outdated Show resolved Hide resolved
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-catfritz great work on this PR! I'm looking forward to users being able to take advantage of the schedule history tracking as well as the overhaul to the schedule spine for more reliable reporting.

I have just a few final documentation requests, otherwise this is good to go to release review once those are applied!

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated
Comment on lines 22 to 25
- `int_zendesk__timezone_daylight`: A utility model that maintains a record of daylight savings adjustments for each time zone.
- `int_zendesk__schedule_history`: Captures a full history of schedule changes for each `schedule_id`.
- `int_zendesk__schedule_timezones`: Merges schedule history with time zone shifts.
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you mention the materialization of each of these so the customer knows the impact (if any) on their destination.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

README.md Outdated
Comment on lines 82 to 83
### Step 4: Disable models for non-existent sources
This package takes into consideration that not every Zendesk Support account utilizes the `schedule`, `schedule_holiday`, `ticket_schedule` `daylight_time`, `time_zone`, `domain_name`, `user_tag`, `organization_tag`, or `ticket_form_history` features, and allows you to disable the corresponding functionality. By default, all variables' values are assumed to be `true`. Add variables for only the tables you want to disable:
### Step 4: Enable/Disable models for non-existent sources
This package takes into consideration that not every Zendesk Support account utilizes the `schedule`, `schedule_holiday`, `audit_log`, `domain_name`, `user_tag`, `organization_tag`, or `ticket_form_history` features, and allows you to disable the corresponding functionality. By default, all variables' values are assumed to be `true`, except for `using_schedule_histories`. Add variables for only the tables you want to enable/disable:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed you removed the ticket_schedule, daylight_time, and time_zone. However, these are all dependent on the using_schedules variable (reference). Can we still include these sources and maybe it would help to call them out in the comment of the variable shown in the example below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

CHANGELOG.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Co-authored-by: Joe Markiewicz <74217849+fivetran-joemarkiewicz@users.noreply.github.com>
@fivetran-catfritz
Copy link
Contributor Author

I have addressed the comments, so this is ready for release review.

Copy link
Contributor

@fivetran-jamie fivetran-jamie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks GREAT, left mostly doc related suggestions/comments.

.buildkite/scripts/run_models.sh Show resolved Hide resolved
Comment on lines 26 to 27
dbt test --target "$db"
dbt run --vars '{zendesk__unstructured_enabled: true, using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target "$db" --full-refresh
dbt test --target "$db"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to add this run back in? Otherwise can remove the second dbt test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Removed.

CHANGELOG.md Outdated

## Breaking Changes (Full refresh required after upgrading)
### Schedule Change Support
- Support for schedule changes has been added. This feature is disabled by default, but can be enabled by setting variable `using_schedule_histories` to `true` in your `dbt_project.yml`:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we disabling this by default bc most customers don't have the audit_log table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I updated to explain that.

CHANGELOG.md Outdated
Comment on lines 15 to 17
### dbt_zendesk_source changes (see the [Release Notes](https://github.com/fivetran/dbt_zendesk_source/releases/tag/v0.13.0) for more details)
- Introduced the `stg_zendesk__audit_log` table for capturing schedule changes from Zendesk's audit log.
- This model is disabled by default, to enable it set variable `using_schedule_histories` to `true` in `dbt_project.yml`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section's format looks a lil confusing with the header. Maybe making it a subsection of the Schedule Change Support section would make more sense

dbt_zendesk_source changes (see the Release Notes for more details)

  • Introduced the stg_zendesk__audit_log table for capturing schedule changes from Zendesk's audit log.
    • This model is disabled by default, to enable it set variable using_schedule_histories to true in dbt_project.yml.

And/or as an indented note

dbt_zendesk_source changes (see the Release Notes for more details)

  • Introduced the stg_zendesk__audit_log table for capturing schedule changes from Zendesk's audit log.
    • This model is disabled by default, to enable it set variable using_schedule_histories to true in dbt_project.yml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to subsection!

CHANGELOG.md Outdated
Comment on lines 31 to 33
### dbt_zendesk_source changes (see the [Release Notes](https://github.com/fivetran/dbt_zendesk_source/releases/tag/v0.13.0) for more details)
- Updated the `stg_zendesk__schedule_holidays` model to allow users to disable holiday processing by setting variable `using_holidays` to `false`.
- Added field-level documentation for the `stg_zendesk__audit_log` table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this here? I feel like we could do a "Refer to source package Release Notes for details on upstream changes" sorta thing

If yes, same comment about the formatting being a lil confusing and maybe adjusting it to be a subsection and/or indented block

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's remove that line. I made it a #### subsection instead of ### to match the prior seciont.

CHANGELOG.md Outdated
Comment on lines 22 to 29
- `int_zendesk__timezone_daylight`: A utility model that maintains a record of daylight savings adjustments for each time zone.
- materialization: ephemeral
- `int_zendesk__schedule_history`: Captures a full history of schedule changes for each `schedule_id`.
- materialization: table (if enabled)
- `int_zendesk__schedule_timezones`: Merges schedule history with time zone shifts.
- materialization: ephemeral
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
- materialization: ephemeral
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be good to link to the model files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

CHANGELOG.md Outdated
- materialization: table (if enabled)
- `int_zendesk__schedule_timezones`: Merges schedule history with time zone shifts.
- materialization: ephemeral
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
- `int_zendesk__schedule_holiday`: Identifies and calculates holiday periods for each schedule.

Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks updated!

case when change_type = 'holiday'
then ({{ dbt.datediff('holiday_starting_sunday', 'holiday_valid_until', 'minute') }}
+ 24 * 60 -- add 1 day to set the upper bound of the holiday
- offset_minutes)-- timezone adjustment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- offset_minutes)-- timezone adjustment
- offset_minutes) -- timezone adjustment

I feel like I've seen this cause issues before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh. Updated!

-- Fill all other normal schedules.
select
schedule_id,
-- CFount the number of records for each schedule start_time_utc and end_time_utc for filtering later.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should CFount be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hah oops. "Count"

CHANGELOG.md Show resolved Hide resolved
Copy link
Contributor Author

@fivetran-catfritz fivetran-catfritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @fivetran-jamie. I have addressed the comments!

.buildkite/scripts/run_models.sh Show resolved Hide resolved
Comment on lines 26 to 27
dbt test --target "$db"
dbt run --vars '{zendesk__unstructured_enabled: true, using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target "$db" --full-refresh
dbt test --target "$db"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Removed.

CHANGELOG.md Outdated

## Breaking Changes (Full refresh required after upgrading)
### Schedule Change Support
- Support for schedule changes has been added. This feature is disabled by default, but can be enabled by setting variable `using_schedule_histories` to `true` in your `dbt_project.yml`:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I updated to explain that.

CHANGELOG.md Outdated
Comment on lines 15 to 17
### dbt_zendesk_source changes (see the [Release Notes](https://github.com/fivetran/dbt_zendesk_source/releases/tag/v0.13.0) for more details)
- Introduced the `stg_zendesk__audit_log` table for capturing schedule changes from Zendesk's audit log.
- This model is disabled by default, to enable it set variable `using_schedule_histories` to `true` in `dbt_project.yml`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to subsection!

CHANGELOG.md Outdated
- This model is disabled by default, to enable it set variable `using_schedule_histories` to `true` in `dbt_project.yml`.

## New Features
- Holiday support: Users can now choose to disable holiday tracking by setting variable `using_holidays` to `false` in `dbt_project.yml`.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Updated.

CHANGELOG.md Outdated
- materialization: table (if enabled)
- `int_zendesk__schedule_timezones`: Merges schedule history with time zone shifts.
- materialization: ephemeral
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks updated!

CHANGELOG.md Outdated
Comment on lines 22 to 29
- `int_zendesk__timezone_daylight`: A utility model that maintains a record of daylight savings adjustments for each time zone.
- materialization: ephemeral
- `int_zendesk__schedule_history`: Captures a full history of schedule changes for each `schedule_id`.
- materialization: table (if enabled)
- `int_zendesk__schedule_timezones`: Merges schedule history with time zone shifts.
- materialization: ephemeral
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
- materialization: ephemeral
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

CHANGELOG.md Outdated
Comment on lines 31 to 33
### dbt_zendesk_source changes (see the [Release Notes](https://github.com/fivetran/dbt_zendesk_source/releases/tag/v0.13.0) for more details)
- Updated the `stg_zendesk__schedule_holidays` model to allow users to disable holiday processing by setting variable `using_holidays` to `false`.
- Added field-level documentation for the `stg_zendesk__audit_log` table.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let's remove that line. I made it a #### subsection instead of ### to match the prior seciont.

CHANGELOG.md Show resolved Hide resolved
README.md Outdated
```yml
vars:
using_schedules: False #Disable if you are not using schedules
using_schedule_histories: True #Enable if you are using audit_logs for schedule histories
using_schedules: False #Disable if you are not using schedules, which requires source tables `ticket_schedule`, `daylight_time`, and `time_zone`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Contributor

@fivetran-jamie fivetran-jamie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with one teeny tiny CHANGELOG thing

CHANGELOG.md Show resolved Hide resolved
CHANGELOG.md Outdated
- New intermediate models have been introduced to streamline both the readability and maintainability:
- `int_zendesk__timezone_daylight`: A utility model that maintains a record of daylight savings adjustments for each time zone.
- [`int_zendesk__timezone_daylight`](https://github.com/fivetran/dbt_zendesk/blob/main/models/intermediate/int_zendesk__timezone_daylight.sql): A utility model that maintains a record of daylight savings adjustments for each time zone.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [`int_zendesk__timezone_daylight`](https://github.com/fivetran/dbt_zendesk/blob/main/models/intermediate/int_zendesk__timezone_daylight.sql): A utility model that maintains a record of daylight savings adjustments for each time zone.
- [`int_zendesk__timezone_daylight`](https://github.com/fivetran/dbt_zendesk/blob/main/models/utils/int_zendesk__timezone_daylight.sql): A utility model that maintains a record of daylight savings adjustments for each time zone.

unless you wanna move it to intermediate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooo good catch. Updated!

@fivetran-catfritz fivetran-catfritz merged commit a9be770 into main Oct 10, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants