🐛 Source Intercom: Fix conversations incremental pagination slowness #11208
/test connector=connectors/source-intercom
Hey @bkrausz thank you for this improvement attempt. I'm running the acceptance tests and will go for a first review asap.
/test connector=connectors/source-intercom
@bkrausz I ran ./gradlew format: there was an unused import in integration_test.py.
I also made has_old_records an instance attribute instead of a class attribute: this makes mutating the variable safer.
You mentioned:
We're sorting by desc. Once we hit the first page with an out-of-date result we can stop.
You removed the ordering-related parameters. Is it because the default endpoint behavior is to return results in descending order? Could you please explicitly set params.update({"order": "desc", "sort": self.cursor_field}) for safety then?
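To illustrate the class-versus-instance attribute point above, here is a minimal sketch. Both class names are hypothetical and are not taken from the connector. It assumes the flag could at some point be assigned through the class, which is the case where a class attribute leaks state between stream instances.

```python
class SharedFlagStream:
    # Hypothetical stream: the flag lives on the class itself.
    has_old_records = False

    def mark_old(self) -> None:
        # Assigning through the class changes the value seen by every instance.
        type(self).has_old_records = True


class PerInstanceFlagStream:
    # Hypothetical stream: each instance owns its own flag.
    def __init__(self) -> None:
        self.has_old_records = False

    def mark_old(self) -> None:
        # Only this instance is affected.
        self.has_old_records = True


first, second = SharedFlagStream(), SharedFlagStream()
first.mark_old()
shared_leak = second.has_old_records  # the second instance sees the change

a, b = PerInstanceFlagStream(), PerInstanceFlagStream()
a.mark_old()
isolated = b.has_old_records  # the second instance is untouched
```

With the instance attribute, one stream reaching its stop condition cannot affect another stream object created in the same process.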
airbyte-integrations/connectors/source-intercom/integration_tests/integration_test.py
def request_params(self, next_page_token: Mapping[str, Any] = None, **kwargs) -> MutableMapping[str, Any]:
    params = super().request_params(next_page_token, **kwargs)
    params.update({"order": "asc", "sort": self.cursor_field})
    return params
I guess you removed this because the default ordering is {"order": "desc", "sort": "updated_at"}. I think you should instead explicitly set params.update({"order": "desc", "sort": self.cursor_field}) for safety in case of API changes and to make the code easier to understand.
airbyte-integrations/connectors/source-intercom/source_intercom/source.py
Thanks for the help getting this to a good state! I haven't touched Python in many years, so my toolchain and comfort are pretty minimal.
/test connector=connectors/source-intercom
/publish connector=connectors/source-intercom
Thank you for the changes! Publishing the new connector version now and will merge afterward.
What
Conversation sync, usually the largest set of objects in Intercom (and the most important for ETL), is incredibly slow, even for incremental sync. The reason for this is that a sync fetches every single page, ignoring the records but still issuing the HTTP requests (other users are also seeing this issue: #9572 (comment)).
How
Sort in descending order and, once we cross the updated_at boundary, stop syncing new pages.
🚨 User Impact 🚨
Faster syncs
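The early-stop approach described under How can be sketched as follows. This is a minimal, self-contained illustration and not the connector's actual code: read_incremental, fetch_page, and the fake page data are hypothetical names, and it assumes pages arrive sorted by updated_at in descending order and that records equal to the saved state are still emitted.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Record = Dict[str, Any]
Page = List[Record]


def read_incremental(
    fetch_page: Callable[[int], Optional[Page]],
    state_updated_at: int,
    cursor_field: str = "updated_at",
) -> Iterator[Record]:
    """Yield records at or after the saved state, stopping at the first stale page.

    Assumes fetch_page returns pages sorted by cursor_field, newest first.
    """
    page_number = 0
    while True:
        page = fetch_page(page_number)
        if not page:
            return  # no more pages
        has_old_records = False
        for record in page:
            if record[cursor_field] >= state_updated_at:
                yield record
            else:
                has_old_records = True
        if has_old_records:
            return  # every later page is older still, so skip the requests
        page_number += 1


# Fake API: three pages, newest first (hypothetical data).
pages = [
    [{"id": 5, "updated_at": 500}, {"id": 4, "updated_at": 400}],
    [{"id": 3, "updated_at": 300}, {"id": 2, "updated_at": 200}],
    [{"id": 1, "updated_at": 100}],
]
requested: List[int] = []


def fake_fetch(page_number: int) -> Optional[Page]:
    requested.append(page_number)  # track how many requests were issued
    return pages[page_number] if page_number < len(pages) else None


synced_ids = [r["id"] for r in read_incremental(fake_fetch, state_updated_at=250)]
```

In this sketch the third page is never requested, because the second page already contains a record older than the saved state.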
Pre-merge Checklist
Expand the relevant checklist and delete the others.
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
README.md
bootstrap.md. See description and examples
docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing
/publish command described here
Tests
Unit
Integration
Integration tests will only pass with a specific access token that I don't have access to.
Acceptance
Getting a socket timeout error on TestBasicRead.test_read when running these, not sure why, but I believe they are passing.