🎉Source Hubspot: Add contacts associations to Deals stream. #5693

Conversation

vladimir-remar
Contributor

@vladimir-remar vladimir-remar commented Aug 27, 2021

What

Add the contacts association

How

By adding the contacts associations to the Deals stream.

Pre-merge Checklist

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions
  • Connector version bumped like described here

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here

@github-actions github-actions bot added the area/connectors Connector related issues label Aug 27, 2021
@vladimir-remar
Contributor Author

@marcosmarxm Hello Marc, could you take a look at this?

@vladimir-remar vladimir-remar changed the title Source Hubspot: Add deals_to_contact_associations stream 🎉 Source Hubspot: Add deals_to_contact_associations stream Aug 27, 2021
@vladimir-remar vladimir-remar changed the title 🎉 Source Hubspot: Add deals_to_contact_associations stream Source Hubspot: Add deals_to_contact_associations stream Aug 27, 2021
@vladimir-remar vladimir-remar changed the title Source Hubspot: Add deals_to_contact_associations stream 🎉Source Hubspot: Add deals_to_contact_associations stream Aug 27, 2021
"to": {
"type": ["null", "array"],
"items": {
"type": ["null", "Object"],
Member

Suggested change
"type": ["null", "Object"],
"type": ["null", "object"],

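The suggested change matters because JSON Schema type names are case-sensitive, so "Object" is not a recognized type. A minimal sketch of a check that would catch this kind of typo follows; the helper name find_invalid_types is hypothetical and not part of the connector or the CDK.

```python
# Type names defined by the JSON Schema specification (all lowercase).
VALID_TYPES = {"null", "boolean", "object", "array", "number", "integer", "string"}


def find_invalid_types(schema, path="$"):
    """Return (path, type_name) pairs for every unrecognized "type" value."""
    problems = []
    if isinstance(schema, dict):
        declared = schema.get("type")
        # "type" may be a single name or a list of names.
        names = [declared] if isinstance(declared, str) else declared
        if isinstance(names, list):
            for name in names:
                if name not in VALID_TYPES:
                    problems.append((path, name))
        # Walk nested schemas such as "items" and "properties".
        for key, value in schema.items():
            problems.extend(find_invalid_types(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            problems.extend(find_invalid_types(item, f"{path}[{index}]"))
    return problems


# The fragment from the diff above, before the suggested change.
fragment = {"to": {"type": ["null", "array"], "items": {"type": ["null", "Object"]}}}
print(find_invalid_types(fragment))
```

This is only an illustration; it treats every nested "type" key as a schema keyword, so it could report false positives on free-form data embedded in a schema.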
@@ -14,5 +14,5 @@ RUN pip install .

ENV AIRBYTE_ENTRYPOINT "/airbyte/base.sh"

LABEL io.airbyte.version=0.1.10
LABEL io.airbyte.version=0.1.11
Member

You need to bump to 0.1.12 and also pull the latest code from master, because a change to the Hubspot connector was merged yesterday.

Collaborator

@vladimir-remar please bump the version to 0.1.12 in all required places.

@marcosmarxm marcosmarxm self-assigned this Aug 27, 2021
@marcosmarxm
Member

Thanks @vladimir-remar, I made a few suggestions and requested a review!

Collaborator

@bazarnov bazarnov left a comment

Great addition to the connector. I added comments and questions about reusing existing parts of the code. Looks promising!

self._endpoint = endpoint

def _transform(self, records: Iterable) -> Iterable:
"""Preprocess record """
Collaborator

Please add more description to the docstring: why do we need this method if it just passes the records through?

yield record

def _filter_old_records(self, records: Iterable) -> Iterable:
"""Skip """
Collaborator

Same as comment above.

yield record


class DealToContactAssociationsStream(CRMAssociationStream):
Collaborator

This is a useful part of the connector; however, I have a question about reusing existing parts:

  • we have class CRMObjectStream(Stream), which should provide the necessary functionality for flat associations for other HubSpot objects. Have you tried to reuse it for your stream? Were there any problems using it?

Contributor Author

Hello @bazarnov, thanks for the comment and the advice. You are right, the class CRMObjectStream(Stream) works more than fine. It should be something like this:

class DealToContactAssociationsStream(CRMObjectStream):
  
  entity = "deal"
  associations = ["contacts"]

Now my question is how to write a proper JSON schema. Since it's almost the same as the deals schema, is it OK to duplicate it?

Collaborator

You can always use

$ref: schema.json

inside your_stream_schema.json.
But if you want, you can duplicate it; it's not a big deal.
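To illustrate what reusing a schema through $ref amounts to, here is a rough sketch that inlines references from an in-memory dictionary. The file names, the SCHEMAS dictionary and the resolve_refs helper are all hypothetical; how the connector actually loads and resolves schema files is not shown in this thread.

```python
import copy

# Hypothetical in-memory stand-ins for two schema files.
SCHEMAS = {
    "deals.json": {
        "type": ["null", "object"],
        "properties": {"id": {"type": "string"}},
    },
    # The new stream's schema simply points at the existing one.
    "deal_to_contact_associations.json": {"$ref": "deals.json"},
}


def resolve_refs(node, schemas):
    """Return a copy of node with every {"$ref": name} replaced by schemas[name]."""
    if isinstance(node, dict):
        if "$ref" in node:
            # Copy the target so callers cannot mutate the shared schema.
            return resolve_refs(copy.deepcopy(schemas[node["$ref"]]), schemas)
        return {key: resolve_refs(value, schemas) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, schemas) for item in node]
    return node


resolved = resolve_refs(SCHEMAS["deal_to_contact_associations.json"], SCHEMAS)
print(resolved == SCHEMAS["deals.json"])
```

The sketch does not guard against circular references, which a real resolver would need to handle.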

Collaborator

> Hello @bazarnov, thanks for the comment and the advice. You are right, the class CRMObjectStream(Stream) works more than fine. It should be something like this:
>
> class DealToContactAssociationsStream(CRMObjectStream):
>
>   entity = "deal"
>   associations = ["contacts"]
>
> Now my question is how to write a proper JSON schema. Since it's almost the same as the deals schema, is it OK to duplicate it?

Sounds great, can we proceed with reusing those parts instead?

Contributor Author

Sure, I'm going to update the PR with the new changes.


def _filter_old_records(self, records: Iterable) -> Iterable:
"""Skip """
for record in records:
Collaborator

I believe you can simply write yield from records in this case; there is no need for the loop.
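As a small illustration of the suggestion, the two generators below produce the same records. The function names are made up for this sketch and are not the connector's methods.

```python
from typing import Iterable


def passthrough_loop(records: Iterable) -> Iterable:
    """Explicit loop, in the style of the reviewed snippet."""
    for record in records:
        yield record


def passthrough_delegate(records: Iterable) -> Iterable:
    """Same behaviour, delegating iteration to the input iterable."""
    yield from records


sample = [{"id": 1}, {"id": 2}]
print(list(passthrough_loop(sample)) == list(passthrough_delegate(sample)))
```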


def _transform(self, records: Iterable) -> Iterable:
"""Preprocess record """
for record in records:
Collaborator

You can use yield from records here as well; there is no need for the loop.

@bazarnov
Collaborator

Thanks for the fast push! Please make sure your local acceptance tests are passing as expected.

@vladimir-remar
Contributor Author

> Thanks for fast push! Please, make sure your local acceptance-tests are passing as expected

Well, it seems something is wrong with the tests. I have attached the logs.

This command works as expected:
python3 -m pytest integration_tests
integration-test.log

These do not:
python -m pytest -p integration_tests.acceptance
acceptance-tests-env.log

./acceptance-test-docker.sh
acceptance-tests-docker.log

@bazarnov
Collaborator

bazarnov commented Sep 1, 2021

According to the logs, you need to increase timeout_seconds to a larger value.
Please try replacing the content of acceptance-test-config.yml with the following:

connector_image: airbyte/source-hubspot:dev
tests:
  spec:
    - spec_path: "source_hubspot/spec.json"
  connection:
    - config_path: "secrets/config.json"
      status: "succeed"
    - config_path: "integration_tests/invalid_config.json"
      status: "failed"
#  discovery: fixme (eugene): contacts schema does not match
#    - config_path: "secrets/config.json"
  basic_read:
    - config_path: "secrets/config.json"
      # TODO: permissions error with Workflows stream for Test Account
      configured_catalog_path: "sample_files/configured_catalog_without_workflows.json"
      timeout_seconds: 3600
#  incremental: fixme (eugene): '<=' not supported between instances of 'int' and 'str'
#    - config_path: "secrets/config.json"
#      configured_catalog_path: "sample_files/configured_catalog.json"
#      future_state_path: "integration_tests/abnormal_state.json"
#      cursor_paths:
#        subscription_changes: ["timestamp"]
#        email_events: ["timestamp"]
  full_refresh:
    - config_path: "secrets/config.json"
      configured_catalog_path: "sample_files/configured_catalog_without_workflows.json"
      timeout_seconds: 3600

After this, please run the following command from Airbyte root:
./gradlew clean :airbyte-integrations:connectors:source-hubspot:integrationTest

If this command is successful:

  1. commit your changes to your branch
  2. merge master into your branch so that it is up to date with the latest master
  3. push the changes to your branch

If the command fails:
submit the latest logs from the command above.

@vladimir-remar
Contributor Author

@bazarnov I ran the gradlew command, but it keeps failing.
gradlew.log

@bazarnov
Collaborator

bazarnov commented Sep 1, 2021

Make sure:

  • you don't get an infinite loop when you run python main_dev.py read --config secrets/config.json --catalog sample_files/configured_catalog.json within the connector's .venv.
  • you use a fairly recent start_date in your config.json, like 2021-08-01T00:00:00Z
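For the second point, a recent start_date in the format shown above could be generated rather than typed by hand. This is only a sketch; the helper name recent_start_date and the 30-day default are assumptions, not anything the connector prescribes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recent_start_date(days_back: int = 30, now: Optional[datetime] = None) -> str:
    """Return midnight UTC, days_back days before now, as YYYY-MM-DDTHH:MM:SSZ."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(days=days_back)
    # Truncate to midnight so the value matches the style of the example above.
    start = start.replace(hour=0, minute=0, second=0, microsecond=0)
    return start.strftime("%Y-%m-%dT%H:%M:%SZ")


# 31 days before 1 September 2021 is 1 August 2021.
print(recent_start_date(31, datetime(2021, 9, 1, 12, 30, tzinfo=timezone.utc)))
```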

Contributor

@midavadim midavadim left a comment

version

Contributor

@midavadim midavadim left a comment

version

@midavadim midavadim self-requested a review September 1, 2021 23:03
Contributor

@midavadim midavadim left a comment

version

@marcosmarxm
Member

@vladimir-remar sorry to ask again, but you need to bump the version again. After that we're ready to merge this!

@vladimir-remar
Contributor Author

I bumped the versions.

@marcosmarxm
Member

@vladimir-remar can you check your latest commit?

@github-actions github-actions bot removed area/worker Related to worker area/documentation Improvements or additions to documentation area/protocol area/api Related to the api area/frontend normalization area/platform issues related to the platform CDK Connector Development Kit labels Sep 13, 2021
@vladimir-remar
Contributor Author

> @vladimir-remar can you check your latest commit?

What do you mean? Do you want me to change something?

"""Entity URL"""
return f"/crm/v3/associations/{self._relationship_from}/{self._relationship_to}/{self._endpoint}"

def __init__(self, relationship_from: str = None, relationship_to: str = None, endpoint: str = None, **kwargs):
Contributor

I recommend the following ordering convention:

  • properties
  • init
  • abstract methods
  • concrete methods

@@ -636,6 +636,14 @@ def list(self, fields) -> Iterable:
yield record


class DealToContactAssociationsStream(CRMObjectStream):
Contributor

@vladimir-remar it seems better from a UX standpoint to include the deal-to-contact association as a field on the deals stream. This way in the destination it will be normalized to a deal_contacts table. WDYT?

Contributor Author

Hi @sherifnada, thanks for the review. The suggestion makes sense to me. I will include the field as "associations", since "contacts" is one of them. For me it would be something like this:

class DealStream(CRMObjectStream):
    """Deals, API v3"""

    def __init__(self, associations: List[str] = None, **kwargs):
        super().__init__(entity="deal", associations=associations, **kwargs)
        self._stage_history = DealStageHistoryStream(**kwargs)

    def list(self, fields) -> Iterable:
        history_by_id = {}
        for record in self._stage_history.list(fields):
            if all(field in record for field in ("id", "dealstage")):
                history_by_id[record["id"]] = record["dealstage"]
        for record in super().list(fields):
            if record.get("id") and int(record["id"]) in history_by_id:
                record["dealstage"] = history_by_id[int(record["id"])]
            yield record

And I would instantiate it in client.py like this:
"deal_to_contact_associations": DealStream(associations=["contacts"], **common_params),

Contributor

@sherifnada sherifnada Sep 14, 2021

sorry I was a little unclear: why wouldn't we include this by default in the Deals stream instead of a separate deals_to_contacts_associations? That would be my first inclination although there may be a reason why that's not favorable.

CRMObjectStream already accepts associations as input so you actually don't need to change anything about DealStream, you only need to change the calling context to initialize it with associations=['contacts']
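The idea of changing only the calling context can be sketched as follows. The two classes here are simplified stand-ins written for this example, on the assumption that DealStream forwards its keyword arguments to CRMObjectStream; they are not the connector's real implementations, and common_params is an empty placeholder.

```python
from typing import List, Optional


class CRMObjectStream:
    """Simplified stand-in: stores the entity and the requested associations."""

    def __init__(self, entity: str, associations: Optional[List[str]] = None, **kwargs):
        self.entity = entity
        self.associations = associations or []


class DealStream(CRMObjectStream):
    """Stand-in for the deals stream; assumed to forward kwargs unchanged."""

    def __init__(self, **kwargs):
        super().__init__(entity="deal", **kwargs)


# Placeholder for the shared constructor arguments used in client.py.
common_params = {}

# Only the place where the stream is created changes; the class does not.
streams = {"deals": DealStream(associations=["contacts"], **common_params)}
print(streams["deals"].entity, streams["deals"].associations)
```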

Contributor Author

Hello @sherifnada and thanks again.
On the first question: my first approach was to replicate the functionality of the Deals stream instead of modifying it, since associations were not included by default in the Deals stream. I did it with the possible inclusion of the remaining associations in mind. Why they were not included, I really do not know. Perhaps we do not need the deal history to obtain the association values, or perhaps the rest of the associations do not generate good output. I have not tried iterating over the other associations, so I do not know the result. Maybe someone on the team can answer that question.

Finally, calling CRMObjectStream like this, "deal_to_contact_associations": CRMObjectStream(entity="deal", associations=['contacts'], **common_params), would also get the response.

So my question is: at this point, what is the right approach to implementing associations in the current streams?

Contributor

@vladimir-remar that approach (including associations=['contacts']) is the one I'd recommend!

@Phlair Phlair removed their request for review September 14, 2021 10:32
…RMObjectStream to deal_to_contact_associations
Contributor

@sherifnada sherifnada left a comment

@vladimir-remar can you update the schema for deals to include the contacts association if it's not already there and remove the newly added schema (which is no longer needed if we say associations=['contacts'])? We should be good to go afterwards

and also update the PR title since we're no longer adding a new stream :)

@vladimir-remar vladimir-remar changed the title 🎉Source Hubspot: Add deals_to_contact_associations stream 🎉Source Hubspot: Add deals_to_contact_associations in the client apis Sep 17, 2021
@vladimir-remar vladimir-remar changed the title 🎉Source Hubspot: Add deals_to_contact_associations in the client apis 🎉Source Hubspot: Add contact association to Deals stream. Sep 17, 2021
@vladimir-remar vladimir-remar changed the title 🎉Source Hubspot: Add contact association to Deals stream. 🎉Source Hubspot: Add contacts associations to Deals stream. Sep 17, 2021
@sherifnada sherifnada mentioned this pull request Sep 20, 2021
@jrhizor jrhizor temporarily deployed to more-secrets September 20, 2021 00:52 Inactive
@sherifnada sherifnada merged commit f560ae1 into airbytehq:master Sep 20, 2021
Labels
area/connectors Connector related issues
6 participants