Update/crm api v3 - contact merge audit removed #98

Merged
merged 17 commits into from
Mar 28, 2023
19 changes: 19 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,22 @@
# dbt_hubspot_source v0.9.0

## 🚨 Breaking Changes 🚨
In [November 2022](https://fivetran.com/docs/applications/hubspot/changelog#november2022), the Fivetran Hubspot connector switched to v3 of the Hubspot CRM API, which deprecated the `CONTACT_MERGE_AUDIT` table and stored merged contacts in a field in the `CONTACT` table. **This has not been rolled out to BigQuery warehouses yet.** BigQuery connectors with the `CONTACT_MERGE_AUDIT` table enabled will continue to sync this table until the new `CONTACT.property_hs_calculated_merged_vids` field and API version become available to them.

This release introduces breaking changes to how contacts are merged, in order to align with the connector changes above. It remains backwards-compatible for connectors that still sync the `CONTACT_MERGE_AUDIT` table.

[PR #98](https://github.com/fivetran/dbt_hubspot_source/pull/98) applies the following changes:
- Updates logic around the recently deprecated `CONTACT_MERGE_AUDIT` table.
- The package now brings in the new `property_hs_calculated_merged_vids` field (and removes the `property_hs_` prefix) for all customers, including those on BigQuery (the field will just be `null`).
- **Backwards-compatibility:** the package will only reference the old `CONTACT_MERGE_AUDIT` table and create `stg_hubspot__contact_merge_audit` if `hubspot_contact_merge_audit_enabled` is explicitly set to `true` in your root `dbt_project.yml` file.
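
As a minimal sketch of the backwards-compatible opt-in described above, assuming your connector still syncs the `CONTACT_MERGE_AUDIT` table, the root `dbt_project.yml` would carry something like the following. The placement under a top-level `vars:` key mirrors the README example; adapt it if your project scopes variables differently.

```yml
# dbt_project.yml (root project)
vars:
  # Defaults to false. Set to true only if CONTACT_MERGE_AUDIT is still synced
  # and you want stg_hubspot__contact_merge_audit to be created.
  hubspot_contact_merge_audit_enabled: true
```

If the variable is left unset, the package does not reference the old table and relies on the merged-vids field from the `CONTACT` table instead.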

## Under the Hood
[PR #98](https://github.com/fivetran/dbt_hubspot_source/pull/98) applies the following changes:
- Updates seed data to test new merging paradigm.
- Ensures that all timestamp fields are explicitly cast as timestamps without timezone, as recent API changes also introduced inconsistent timestamp formats.

See the transform package [CHANGELOG](https://github.com/fivetran/dbt_hubspot/blob/main/CHANGELOG.md) for updates made to end models in `dbt_hubspot v0.9.0`.

# dbt_hubspot_source v0.8.0

## 🚨 Breaking Changes 🚨:
10 changes: 5 additions & 5 deletions README.md
@@ -38,13 +38,13 @@ dispatch:
search_order: ['spark_utils', 'dbt_utils']
```

## Step 2: Install the package
## Step 2: Install the package (skip if also using the `hubspot` transformation package)
Include the following hubspot_source package version in your `packages.yml` file.
> TIP: Check [dbt Hub](https://hub.getdbt.com/) for the latest installation instructions or [read the dbt docs](https://docs.getdbt.com/docs/package-management) for more information on installing packages.
```yaml
packages:
- package: fivetran/hubspot_source
version: [">=0.7.0", "<0.8.0"]
version: [">=0.9.0", "<0.10.0"]
```
## Step 3: Define database and schema variables
By default, this package runs using your destination and the `hubspot` schema. If this is not where your HubSpot data is (for example, if your HubSpot schema is named `hubspot_fivetran`), add the following configuration to your root `dbt_project.yml` file:
@@ -57,7 +57,6 @@ vars:
## Step 4: Disable models for non-existent sources
When setting up your HubSpot connection in Fivetran, it is possible that not every table this package expects will be synced. This can occur because you either don't use that functionality in HubSpot or have actively decided not to sync some tables. Therefore, we have added enable/disable configs in the `src.yml` to allow you to disable sources that are not present. Downstream models are automatically disabled as well. To disable the relevant functionality in the package, add the relevant variables to your root `dbt_project.yml`. By default, all variables are assumed to be `true` (with the exception of `hubspot_service_enabled`, `hubspot_ticket_deal_enabled`, and `hubspot_contact_merge_audit_enabled`). You only need to add variables for the tables that differ from the default:


```yml
# dbt_project.yml
vars:
@@ -82,8 +81,9 @@ vars:
hubspot_email_event_spam_report_enabled: false
hubspot_email_event_status_change_enabled: false

hubspot_contact_merge_audit_enabled: true # Enables contact merge auditing to be applied to final models (removes any merged contacts that are still persisting in the contact table)

hubspot_contact_merge_audit_enabled: true # Enables the use of the CONTACT_MERGE_AUDIT table (deprecated by Hubspot v3 API) for removing merged contacts in the final models.
# If false, ~~~contacts will still be merged~~~, but using the CONTACT.property_hs_calculated_merged_vids field (introduced in v3 of the Hubspot CRM API)
# Default = false
# Sales

hubspot_sales_enabled: false # Disables all sales models
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'hubspot_source'
version: '0.8.0'
version: '0.9.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/run_results.json

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: hubspot_source_integration_tests_1
schema: hubspot_source_integration_tests_2
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: hubspot_source_integration_tests_1
schema: hubspot_source_integration_tests_2
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: hubspot_source_integration_tests_1
schema: hubspot_source_integration_tests_2
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: hubspot_source_integration_tests_1
schema: hubspot_source_integration_tests_2
threads: 8
databricks:
catalog: null
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: hubspot_source_integration_tests_1
schema: hubspot_source_integration_tests_2
threads: 2
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks
62 changes: 37 additions & 25 deletions integration_tests/dbt_project.yml
@@ -1,12 +1,12 @@
name: 'hubspot_source_integration_tests'
version: '0.8.0'
version: '0.9.0'
profile: 'integration_tests'
config-version: 2
models:
hubspot_source:
+schema:
vars:
hubspot_schema: hubspot_source_integration_tests_1
hubspot_schema: hubspot_source_integration_tests_2
hubspot_source:
hubspot_service_enabled: true
hubspot_company_property_history_identifier: "company_property_history_data"
@@ -59,62 +59,67 @@ seeds:
+quote_columns: "{{ true if target.type == 'redshift' else false }}"
owner_data:
+column_types:
owner_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
owner_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
company_data:
+column_types:
id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
deal_data:
+column_types:
deal_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
owner_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
deal_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
owner_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
_fivetran_synced: timestamp
property_closedate: timestamp
property_createdate: timestamp
deal_contact_data:
+column_types:
contact_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
deal_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
contact_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
deal_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
deal_stage_data:
+column_types:
deal_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
deal_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
company_property_history_data:
+column_types:
company_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
company_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
email_campaign_data:
+column_types:
id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
content_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
content_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
deal_property_history_data:
+column_types:
deal_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
deal_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
engagement_call_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
engagement_company_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
company_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
company_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
engagement_contact_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
contact_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
contact_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
engagement_deal_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
deal_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
deal_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
engagement_data:
+column_types:
id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
engagement_meeting_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
engagement_email_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
_fivetran_synced: timestamp
email_send_event_id_created: timestamp
engagement_task_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
completion_date: "{{ 'varchar(100)' if target.type in ('redshift','postgres') else 'string'}}"
engagement_note_data:
+column_types:
engagement_id: "{{ 'int64' if target.name == 'bigquery' else 'bigint' }}"
engagement_id: "{{ 'int64' if target.type == 'bigquery' else 'bigint' }}"
deal_pipeline_data:
+column_types:
pipeline_id: "{{ 'varchar(100)' if target.type in ('redshift','postgres') else 'string'}}"
@@ -137,6 +142,13 @@ seeds:
+enabled: "{{ true if target.type != 'postgres' else false }}"
contact_list_data_postgres:
+enabled: "{{ true if target.type == 'postgres' else false }}"
email_event_data:
+column_types:
_fivetran_synced: timestamp
caused_by_created: timestamp
created: timestamp
obsoleted_by_created: timestamp
sent_by_created: timestamp

dispatch:
- macro_namespace: dbt_utils