Source Hubspot: Some incremental CRM objects and engagements #8887

Merged

Conversation

lgomezm
Contributor

@lgomezm lgomezm commented Dec 17, 2021

What

Adds more efficient support for incremental updates on some HubSpot streams:

  • Companies
  • Contacts
  • Deals
  • Engagements

How

For CRM objects, it uses the CRM search endpoints, which allow filtering by different properties. In our case, we filter by the respective "last modified date".
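
As a rough illustration of the filtering described above, the sketch below builds a request body for a CRM search call. It is not the code from this PR. The endpoint path, the `filterGroups` body shape, and the per-object property names are assumptions based on HubSpot's public CRM search documentation, and `build_search_payload` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Assumption: contacts expose "lastmodifieddate", while companies and deals
# expose "hs_lastmodifieddate". Check the property list of your HubSpot account.
LAST_MODIFIED_FIELD = {
    "contacts": "lastmodifieddate",
    "companies": "hs_lastmodifieddate",
    "deals": "hs_lastmodifieddate",
}


def search_url(entity: str) -> str:
    """Return the (assumed) relative path of the CRM search endpoint."""
    return f"/crm/v3/objects/{entity}/search"


def build_search_payload(
    entity: str,
    since: datetime,
    after: Optional[str] = None,
    limit: int = 100,
) -> Dict[str, Any]:
    """Build a search body that only matches records modified at or after `since`."""
    field = LAST_MODIFIED_FIELD[entity]
    since_ms = int(since.timestamp() * 1000)  # HubSpot timestamps are in milliseconds
    payload: Dict[str, Any] = {
        "filterGroups": [
            {"filters": [{"propertyName": field, "operator": "GTE", "value": since_ms}]}
        ],
        # Sorting by the cursor field keeps the saved state monotonic.
        "sorts": [{"propertyName": field, "direction": "ASCENDING"}],
        "limit": limit,
    }
    if after is not None:
        payload["after"] = after  # paging token returned by the previous page
    return payload


payload = build_search_payload("deals", datetime(2021, 12, 17, tzinfo=timezone.utc))
```

The payload would be sent with a POST to `search_url("deals")`. Only records changed since the stored cursor come back, which is what makes the sync incremental.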

For engagements, it uses the "Get recent engagements" endpoint.

Recommended reading order

  1. client.py
  2. api.py

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/SUMMARY.md
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Connector Generator

  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed.


@github-actions github-actions bot added the area/connectors Connector related issues label Dec 17, 2021
@lgomezm
Contributor Author

lgomezm commented Dec 18, 2021

Related issue: #8344

@alafanechere
Contributor

Hi @lgomezm thank you very much for your contribution! Feel free to ping us when your PR is ready for review. Please also check our doc about publishing a new connector version. You basically have to bump the connector version in the Dockerfile + airbyte-config/init/src/main/resources/seed/source_definitions.yaml

@lgomezm lgomezm marked this pull request as ready for review December 21, 2021 19:49
@lgomezm
Contributor Author

lgomezm commented Dec 21, 2021

Hi @alafanechere. I've just pushed some changes and bumped the version where you indicated. Also moved the PR to "Ready for review". Please let me know if there's anything I should provide on my side.

Contributor

@alafanechere alafanechere left a comment

Hey @lgomezm, I did a first quick review. I request changes because our integration tests are not passing (a simple unused import problem for the moment):

> Task :airbyte-integrations:connectors:source-hubspot:flakeCheck FAILED
[python] .venv/bin/python -m flake8 . --config /actions-runner/_work/airbyte/airbyte/tools/python/.flake8
	 ./source_hubspot/client.py:11:1: F401 'source_hubspot.api.CRMObjectStream' imported but unused

Could you please make sure the acceptance tests pass? I'll also ask for another reviewer because I'm not very familiar with this connector, which is implemented differently from the classic CDK-created connectors.

@lazebnyi I allowed myself to request a review from your side because git-blame showed me you are pretty active on this connector 😄 .

Comment on lines 222 to 231
    def _filter_dynamic_fields(self, records: Iterable) -> Iterable:
        """Skip certain fields because they are too dynamic and change every call (timers, etc),
        see https://github.com/airbytehq/airbyte/issues/2397
        """
        for record in records:
            if isinstance(record, Mapping) and "properties" in record:
                for key in list(record["properties"].keys()):
                    if key.startswith("hs_time_in"):
                        record["properties"].pop(key)
            yield record
Contributor

Did you remove these hs_time_in fields from the ignored fields in acceptance-test-config?

Contributor

We keep moving this back and forth.
In my opinion, we don't need to remove the hs_* fields from the response.
Some customers complained about this.

#8055

Contributor

The docstring of the reverted _filter_dynamic_fields function points to issue #2397,
which "says" that we don't need to filter out hs_time_* fields now;
we DO filter out these fields at the acceptance-test level.

Contributor Author

I've removed this method again in 42b1f42 and can confirm that unit tests and acceptance tests still pass.

        self.associations = associations or self.associations
        self._include_archived_only = include_archived_only

    def list(self, fields) -> Iterable:
Contributor

I know other classes already define a list method, but list is the name of a Python built-in.

Contributor

Moreover, this function looks like the one defined in CRMObjectStream. Do you think you could make them share a parent class?

Contributor Author

The list function is actually called when the connector's read command is invoked. I think it's called internally, so even though it does not look good because it shadows a built-in name, I don't know if we can rename it.

Collaborator

@lgomezm I believe in this case, we can use some private naming like _list and rename the main property method first, then change the name in all underlying calls. Is this possible, WDYT?
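
A small sketch of the concern and of the suggested rename, using hypothetical class and method names rather than the connector's real ones. In Python, `list` is a built-in name, so a method called `list` is legal, but inside the class body the name then resolves to the method instead of the built-in.

```python
import builtins
from typing import Dict, Iterable, Iterator


class ShadowingStream:
    """Hypothetical stream whose method name hides the built-in in the class body."""

    def list(self, fields: Iterable[str]) -> Iterator[Dict[str, str]]:
        return iter(())

    # At this point in the class body, `list` is the method defined above.
    resolved_in_class_body = list


class RenamedStream:
    """Same idea with a private name, as suggested in the review."""

    def _list(self, fields: Iterable[str]) -> Iterator[Dict[str, str]]:
        for name in fields:
            yield {"field": name}

    def read(self, fields: Iterable[str]) -> Iterator[Dict[str, str]]:
        # Hypothetical public entry point; every caller goes through here.
        yield from self._list(fields)


shadowed = ShadowingStream.resolved_in_class_body is not builtins.list
records = list(RenamedStream().read(["name", "email"]))
```

With the private name, the built-in stays usable everywhere, and only the call sites that invoked the old method need to change.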

Contributor Author

I found where it was being invoked, so I renamed it in 42b1f42. I'm sorry I didn't know much about how these methods connected with the framework.

@lgomezm
Contributor Author

lgomezm commented Jan 3, 2022

Hi @alafanechere. I've been updating the code as per your feedback. Here are the acceptance test results:
[Screenshot: acceptance test results, 2022-01-03]

Please let me know if you have any other comments.

@lgomezm
Contributor Author

lgomezm commented Jan 4, 2022

@alafanechere @grubberr I think I have addressed all your comments so far. Please take a look again and let me know if there's anything else I should update.

@grubberr
Contributor

grubberr commented Jan 4, 2022

@lgomezm I have approved from my side, but please get approval from the rest of the team because your change is pretty big.
Thanks!

@lgomezm
Contributor Author

lgomezm commented Jan 5, 2022

@alafanechere @lazebnyi This PR is ready for a second look. Please let me know if you have any comments!

@lgomezm
Contributor Author

lgomezm commented Jan 10, 2022

@alafanechere @lazebnyi Hi guys. This PR was updated a while ago. Please take a look at it when you get a chance.

Contributor

@alafanechere alafanechere left a comment

Hi @lgomezm, thanks for the changes you made. Could you please:

  • bump the version in this file: airbyte-config/init/src/main/resources/config/STANDARD_SOURCE_DEFINITION/36c891d9-4bd9-43ac-bad2-10e12756272c.json
  • fix the conflict with master

Once this is done, and if acceptance tests are passing, I'll take care of publishing the connector and merging the branch.
It'd be great if you could give me edit permission on this branch, because I need to update the source_specs.yaml file after I publish the connector (by running ./gradlew airbyte-config:init:processResources).

@lgomezm lgomezm force-pushed the lgomez/hubspot_incremental_crm_objects branch from 42b1f42 to 0c78f6d Compare January 11, 2022 15:00
@lgomezm
Contributor Author

lgomezm commented Jan 11, 2022

Hi @alafanechere I've just bumped the version as you mentioned, as well as fixed the conflicts with master. I'll try to figure out your branch permission request, because I am not the owner of the forked repo.

@lgomezm
Contributor Author

lgomezm commented Jan 11, 2022

@alafanechere BTW, here are the acceptance tests still passing:
[Screenshot: acceptance test results, 2022-01-11]

@@ -13,7 +13,9 @@
CampaignStream,
ContactListStream,
ContactsListMembershipsStream,
CRMSearchStream,
Contributor

Duplicate import. Please run ./gradlew format

Contributor Author

Done in 719b678.

@marcosmarxm
Member

@lgomezm the form_submission stream was added this week, but it was added as an empty stream. Is the test still failing even after updating the acceptance test YAML file? Sorry to ask again, but it looks like there are some conflicts.

@lgomezm
Contributor Author

lgomezm commented Jan 14, 2022

@marcosmarxm I think I've found and fixed the issue with acceptance tests in 9131635. Please give it a try again.

@lgomezm lgomezm force-pushed the lgomez/hubspot_incremental_crm_objects branch from 9131635 to a10db31 Compare January 14, 2022 19:03
Contributor

@alafanechere alafanechere left a comment

We finally made it @lgomezm, thank you for your contribution and patience!

Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation community