-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release/v0.14.0 #122
Release/v0.14.0 #122
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fivetran-catfritz thanks for working though all of these updates! I pretty much only have comments around the new feature of the property labels. All the other components of the PR look great and work as expected when testing on my end.
Please review my comments and let me know if you have any questions. Thanks!
models/stg_hubspot__property.sql
Outdated
@@ -0,0 +1,40 @@ | |||
{{ config(enabled=var('hubspot_sales_enabled', False)) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, how did we align on this being the variable that is needed to enable these models?
Also, is there a reason the default is false? The variable in all other models is true by default. This results in a strange behavior that is in conflict with the rest of the models in the package leveraging this variable.
For example, what if you are only using the contact model and you want to leverage the property labels? Given this logic it will not work because the sales is disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment for property_option
and tmp models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. The property tables are part of the sales schema, but thinking about it, it should have its own variable. Not everyone has this table, but maybe should not be tied to sales as a whole. I have updated vars, readme and changelog with this. I had initially agreed with the scoping:
Source package
- Add property_option table in (make staging models, seed files, etc)
- This table is are part of the Sales schema, so add {{ config(enabled=var('hubspot_sales_enabled', True)) }} to the staging models(tmp and non-_tmp) and the source config.enabled. We may want to also add a hubspot_property_option_enabled variable (2/3 of customers have the property tables, but something tells me the sales_enabled flag may cover this)
macros/add_property_labels.sql
Outdated
select {{ cte_name }}.* | ||
|
||
{%- if var(passthrough_var_name, []) != [] -%} | ||
{%- set col_list = var(passthrough_var_name) -%} | ||
|
||
{%- for col in col_list -%} -- Create label cols | ||
{%- if col.add_property_label or var('hubspot__enable_all_property_labels', false) -%} | ||
{%- set col_alias = (col.alias | default(col.name)) %} | ||
, {{ col.name }}_option.property_option_label as {{ col_alias }}_label | ||
{% endif -%} | ||
{%- endfor %} | ||
|
||
from {{ cte_name }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the whitespace control!
), joined as ( | ||
{{ add_property_labels('hubspot__company_pass_through_columns', cte_name='fields') }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only for fields that are explicitly declared in the passthrough variable, right? What if a customer is using the hubspot__pass_through_all_columns: true
config and having all the property_hs
columns included without needing to explicitly update the hubspot__company_pass_through_columns
variable? Or is that not really a need here?
I could also see that causing the model to be VERY computation expensive since joins and subqueries would be created for every single property_hs
field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also talked about this in response to your comment on macros/add_property_labels
. But I purposefully opted not to allow this. Also see scoping comments in the ticket, but I'll paste the section that also influenced me:
Other Considerations
This becomes a lot more complicated for people who have hubspot__pass_through_all_columns set to true. The customer who opened the ticket specifically asked for this to be incorporated into the individual hubspot__[table_name]_pass_through_columns vars, so perhaps we should just address this first and then address the other scenario if a customer opens a ticket asking for it. Open to opinions!
macros/add_property_labels.sql
Outdated
left join -- create subset of property and property_options for property in question | ||
(select | ||
property_option.property_option_value, | ||
property_option.property_option_label | ||
from {{ ref('stg_hubspot__property_option') }} as property_option | ||
join {{ ref('stg_hubspot__property') }} as property | ||
on property_option.property_id = property._fivetran_id | ||
and property_option.property_option_label = property.property_label | ||
where property.property_name = '{{ col.name.replace('property_', '') }}' | ||
) as {{ col.name }}_option | ||
|
||
on cast({{ cte_name }}.{{ col_alias }} as {{ dbt.type_string() }}) | ||
= cast({{ col.name }}_option.property_option_value as {{ dbt.type_string() }}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are doing a lot of join logic (along with a subquery) here. I could see this becoming very computationally expensive if a customer utilizes this for all their fields.
I feel it is worth calling out somewhere that this may increase the runtime of the models.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Design wise I cannot think of a more performant approach. For the customers that want this sort of information, this is the best way to do it in my opinion. Although I do appreciate this being disabled by default and an opt in feature for those that truly want it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was one of the reasons why I opted to have it work only on defined fields. I will add a note to the readme. I had a prior version where I did allow both passthrough all columns and passthrough all labels on, but removed it. There were a few reasons:
- This took a very, very long time to run on I think, for example
company
, and the end results weren't great either. Not all hs_property fields have useful labels, so the result was most columns had null labels. I think if a user needs to bring in labels, it will only be useful for certain columsn. - It was a bit of a nightmare for the staging models that use some of the
hs_cols
as defacto fields, tho this is a lesser issue than above.
README.md
Outdated
For `property_hs_*` columns, you can enable the corresponding, human-readable `property_option`.`label` to be included in the staging models. | ||
- **Note** you cannot bring in labels if using `hubspot__pass_through_all_columns: true`.` | ||
- To enable labels for a given property, set the property attribute `use_property_label: true`, using the below format. | ||
|
||
```yml | ||
vars: | ||
hubspot__ticket_pass_through_columns: | ||
- name: "property_hs_fieldname" | ||
alias: "fieldname" | ||
use_property_label: true | ||
``` | ||
Alternatively, you can enable labels for all passthrough properties by using variable `hubspot__enable_all_property_labels: true`, formatted like the below example. | ||
|
||
```yml | ||
vars: | ||
hubspot__enable_all_property_labels: true | ||
hubspot__ticket_pass_through_columns: | ||
- name: "property_hs_fieldname1" | ||
- name: "property_hs_fieldname2" | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are not allowing this to be used when hubspot__pass_through_all_columns: true
is used then we should call that out here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
README.md
Outdated
hubspot__ticket_pass_through_columns: | ||
- name: "property_hs_fieldname" | ||
alias: "fieldname" | ||
use_property_label: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is inconsistent with the code. In the code you have this as the add_property_label
argument. Which should it be? I imagine it would be easiest to update the documentation to match the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make the appropriate updates throughout the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh gosh. Yes, I will update the docs to say add_property_label
.
CHANGELOG.md
Outdated
- When including a passthrough `property_hs_*` column, you now have the option to include the corresponding, human-readable label in the staging models. | ||
- See the [Adding property label section](https://github.com/fivetran/dbt_hubspot_source#adding-property-label) of the README for instructions on how to enable this feature! | ||
- This update applies to models: | ||
- `stg_hubspot__company` | ||
- `stg_hubspot__contact` | ||
- `stg_hubspot__deal` | ||
- `stg_hubspot__ticket` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really a breaking change? This seems more like a feature. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went back and forth on this. I decided to make it breaking since we are adding property
and property_option
, which everyone may not have. I was also worried that since I was adding this macro with joins to the source that it might cause unforeseen issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh okay I see what you mean. I would actually request this be folded into the feature updates and then provide a blurb that leveraging this new feature you will require the property tables and if you do not have them then you will need to disable the tables and you cannot leverage this feature.
When I see breaking change I think something from the previous build is now different and can change my downstream transformations. Reading through this I don't get the impression this is a 🚨 Breaking 🚨 change, but rather a feature that will result in a 0.X.0 version bump.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fivetran-joemarkiewicz Thanks for the all the feedback! Definitely got me thinking. I have responded and pushed new changes. Let me know what you think!
CHANGELOG.md
Outdated
- When including a passthrough `property_hs_*` column, you now have the option to include the corresponding, human-readable label in the staging models. | ||
- See the [Adding property label section](https://github.com/fivetran/dbt_hubspot_source#adding-property-label) of the README for instructions on how to enable this feature! | ||
- This update applies to models: | ||
- `stg_hubspot__company` | ||
- `stg_hubspot__contact` | ||
- `stg_hubspot__deal` | ||
- `stg_hubspot__ticket` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went back and forth on this. I decided to make it breaking since we are adding property
and property_option
, which everyone may not have. I was also worried that since I was adding this macro with joins to the source that it might cause unforeseen issues.
README.md
Outdated
hubspot__ticket_pass_through_columns: | ||
- name: "property_hs_fieldname" | ||
alias: "fieldname" | ||
use_property_label: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh gosh. Yes, I will update the docs to say add_property_label
.
README.md
Outdated
For `property_hs_*` columns, you can enable the corresponding, human-readable `property_option`.`label` to be included in the staging models. | ||
- **Note** you cannot bring in labels if using `hubspot__pass_through_all_columns: true`.` | ||
- To enable labels for a given property, set the property attribute `use_property_label: true`, using the below format. | ||
|
||
```yml | ||
vars: | ||
hubspot__ticket_pass_through_columns: | ||
- name: "property_hs_fieldname" | ||
alias: "fieldname" | ||
use_property_label: true | ||
``` | ||
Alternatively, you can enable labels for all passthrough properties by using variable `hubspot__enable_all_property_labels: true`, formatted like the below example. | ||
|
||
```yml | ||
vars: | ||
hubspot__enable_all_property_labels: true | ||
hubspot__ticket_pass_through_columns: | ||
- name: "property_hs_fieldname1" | ||
- name: "property_hs_fieldname2" | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
macros/add_property_labels.sql
Outdated
select {{ cte_name }}.* | ||
|
||
{%- if var(passthrough_var_name, []) != [] -%} | ||
{%- set col_list = var(passthrough_var_name) -%} | ||
|
||
{%- for col in col_list -%} -- Create label cols | ||
{%- if col.add_property_label or var('hubspot__enable_all_property_labels', false) -%} | ||
{%- set col_alias = (col.alias | default(col.name)) %} | ||
, {{ col.name }}_option.property_option_label as {{ col_alias }}_label | ||
{% endif -%} | ||
{%- endfor %} | ||
|
||
from {{ cte_name }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the whitespace control!
macros/add_property_labels.sql
Outdated
left join -- create subset of property and property_options for property in question | ||
(select | ||
property_option.property_option_value, | ||
property_option.property_option_label | ||
from {{ ref('stg_hubspot__property_option') }} as property_option | ||
join {{ ref('stg_hubspot__property') }} as property | ||
on property_option.property_id = property._fivetran_id | ||
and property_option.property_option_label = property.property_label | ||
where property.property_name = '{{ col.name.replace('property_', '') }}' | ||
) as {{ col.name }}_option | ||
|
||
on cast({{ cte_name }}.{{ col_alias }} as {{ dbt.type_string() }}) | ||
= cast({{ col.name }}_option.property_option_value as {{ dbt.type_string() }}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was one of the reasons why I opted to have it work only on defined fields. I will add a note to the readme. I had a prior version where I did allow both passthrough all columns and passthrough all labels on, but removed it. There were a few reasons:
- This took a very, very long time to run on I think, for example
company
, and the end results weren't great either. Not all hs_property fields have useful labels, so the result was most columns had null labels. I think if a user needs to bring in labels, it will only be useful for certain columsn. - It was a bit of a nightmare for the staging models that use some of the
hs_cols
as defacto fields, tho this is a lesser issue than above.
), joined as ( | ||
{{ add_property_labels('hubspot__company_pass_through_columns', cte_name='fields') }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also talked about this in response to your comment on macros/add_property_labels
. But I purposefully opted not to allow this. Also see scoping comments in the ticket, but I'll paste the section that also influenced me:
Other Considerations
This becomes a lot more complicated for people who have hubspot__pass_through_all_columns set to true. The customer who opened the ticket specifically asked for this to be incorporated into the individual hubspot__[table_name]_pass_through_columns vars, so perhaps we should just address this first and then address the other scenario if a customer opens a ticket asking for it. Open to opinions!
models/stg_hubspot__property.sql
Outdated
@@ -0,0 +1,40 @@ | |||
{{ config(enabled=var('hubspot_sales_enabled', False)) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. The property tables are part of the sales schema, but thinking about it, it should have its own variable. Not everyone has this table, but maybe should not be tied to sales as a whole. I have updated vars, readme and changelog with this. I had initially agreed with the scoping:
Source package
- Add property_option table in (make staging models, seed files, etc)
- This table is are part of the Sales schema, so add {{ config(enabled=var('hubspot_sales_enabled', True)) }} to the staging models(tmp and non-_tmp) and the source config.enabled. We may want to also add a hubspot_property_option_enabled variable (2/3 of customers have the property tables, but something tells me the sales_enabled flag may cover this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fivetran-catfritz This looks good to me following the recent updates!! I am happy to approve this, but have one request to change the CHANGELOG to fold the breaking change callout into the feature updates but have a small 🚨 breaking notice for users who do not have the property tables and also it seems the version bump was not updated in the README. Be sure to update that before opening for release review.
Great job on this PR!
CHANGELOG.md
Outdated
- When including a passthrough `property_hs_*` column, you now have the option to include the corresponding, human-readable label in the staging models. | ||
- See the [Adding property label section](https://github.com/fivetran/dbt_hubspot_source#adding-property-label) of the README for instructions on how to enable this feature! | ||
- This update applies to models: | ||
- `stg_hubspot__company` | ||
- `stg_hubspot__contact` | ||
- `stg_hubspot__deal` | ||
- `stg_hubspot__ticket` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh okay I see what you mean. I would actually request this be folded into the feature updates and then provide a blurb that leveraging this new feature you will require the property tables and if you do not have them then you will need to disable the tables and you cannot leverage this feature.
When I see breaking change I think something from the previous build is now different and can change my downstream transformations. Reading through this I don't get the impression this is a 🚨 Breaking 🚨 change, but rather a feature that will result in a 0.X.0 version bump.
@@ -104,6 +104,7 @@ vars: | |||
hubspot_engagement_note_enabled: false | |||
hubspot_engagement_task_enabled: false | |||
hubspot_owner_enabled: false | |||
hubspot_property_enabled: false # Disables property and property_option tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to bump the version in the README
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Updated.
PR Overview
This PR will address the following Issue/Feature:
This PR will result in the following new package version:
Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:
🚨 Breaking Change 🚨
property_hs_*
column, you now have the option to include the corresponding, human-readable label in the staging models.stg_hubspot__company
stg_hubspot__contact
stg_hubspot__deal
stg_hubspot__ticket
Features
stg_hubspot__property
stg_hubspot__property_option
Bug fixes
remove_duplicate_and_prefix_from_columns
to accommodate incoming custom column names containing characters such as-
or$
that are not permitted. The resulting column name will have these characters removed or replaced in itsstg_*
model.stg_hubspot__ticket
, which was causing compilation issues when passing through all columns.PR Checklist
Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
dbt run (if incremental models are present)Before marking this PR as "ready for review" the following have been applied:
Detailed Validation
Please share any and all of your validation steps:
Additional Comments