Discuss changing default sync mode #15818

Closed
timroes opened this issue Aug 20, 2022 · 4 comments · Fixed by #20126

@timroes
Collaborator

timroes commented Aug 20, 2022

We currently set a default sync mode as discussed in April and documented in #9625 (comment). User feedback since then suggests that what we outlined at the time might not be the ideal solution. I'd therefore like to discuss two potential changes to the default sync mode we select for new connections:

Change 1

We currently prefer "incremental, append_dedupe" if the source has a source-defined cursor. However, we don't check whether it also has a source-defined primary key. This mode can therefore become the default even though users still have to select primary keys for those streams manually. Since our original intent was to make setup as low-friction as possible, I'd suggest we only select this as the default if the source has a source-defined cursor AND a source-defined primary key.

Pros:

  • Less friction during setup: when we select a sync mode, all required fields will already be selected, and we won't force the user to select any fields manually.

Cons:

  • We'd less often hint to the user that a stream could be incremental (and that they'd "just" need to select a primary key to enable it).

My opinion: I think we should only select incremental, append_dedupe if the source defines all required fields, so the user doesn't need to configure anything manually.
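
A minimal sketch of the difference between the current check and the one proposed in Change 1. The `StreamInfo` shape, its field names, and both function names are hypothetical and only meant for illustration; they are not taken from the webapp code, and the sketch ignores whether the destination supports the mode at all.

```typescript
// Hypothetical shape for illustration; not the webapp's actual types.
interface StreamInfo {
  sourceDefinedCursor: boolean;
  // Assumed to be empty when the source defines no primary key.
  sourceDefinedPrimaryKey: string[][];
}

// Current behavior as described above: only the cursor is checked.
function currentPrefersAppendDedupe(stream: StreamInfo): boolean {
  return stream.sourceDefinedCursor;
}

// Proposed Change 1: require a source-defined cursor AND primary key.
function proposedPrefersAppendDedupe(stream: StreamInfo): boolean {
  return stream.sourceDefinedCursor && stream.sourceDefinedPrimaryKey.length > 0;
}

// A stream with a cursor but no primary key: the user would have to pick one.
const cursorOnly: StreamInfo = { sourceDefinedCursor: true, sourceDefinedPrimaryKey: [] };
// A stream where the source defines everything needed.
const cursorAndKey: StreamInfo = { sourceDefinedCursor: true, sourceDefinedPrimaryKey: [["id"]] };
```

With these inputs, `cursorOnly` would get incremental, append_dedupe as the default under the current check but not under the proposed one, while `cursorAndKey` gets it in both cases.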

Change 2

We currently prefer "full_refresh, overwrite" over "incremental, append". Since full_refresh, overwrite should be supported by all connections, "incremental, append" will effectively never be selected. We could prefer it over "full_refresh, overwrite" if the source has a source-defined cursor.

Pros:

  • We'd give users incremental data more easily (e.g. the Google Analytics source would currently fall into this category).

Cons:

  • incremental, append has a chance of creating duplicates (since there is no dedupe). This might be a dangerous default, since it might not be entirely clear to the user that our default setting could leave them with duplicate data.
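
A rough sketch of how Change 2 would alter the preference order between these two modes. The `SyncModePair` type, the `pickDefault` function, and the `applyChange2` flag are hypothetical names for illustration only; the sketch covers just the two modes discussed here, not the full priority list.

```typescript
// Hypothetical names for illustration; only the two modes from Change 2.
type SyncModePair = "full_refresh/overwrite" | "incremental/append";

function pickDefault(
  supported: SyncModePair[],
  sourceDefinedCursor: boolean,
  applyChange2: boolean
): SyncModePair | undefined {
  // Today full_refresh/overwrite always comes first. Under Change 2,
  // incremental/append moves ahead when the source defines a cursor.
  const order: SyncModePair[] =
    applyChange2 && sourceDefinedCursor
      ? ["incremental/append", "full_refresh/overwrite"]
      : ["full_refresh/overwrite", "incremental/append"];

  // Return the first preferred mode that the connection supports.
  for (const mode of order) {
    if (supported.indexOf(mode) !== -1) {
      return mode;
    }
  }
  return undefined;
}

const bothModes: SyncModePair[] = ["full_refresh/overwrite", "incremental/append"];
```

Because full_refresh, overwrite is expected to be supported everywhere, the current order never reaches incremental, append. With the flag on and a source-defined cursor, the same connection would default to incremental, append, which is where the duplicate-data concern comes from.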

@andyjih @sherifnada @nataliekwong Would be happy for your input on that.

@timroes added the discuss, area/frontend, and ui/connection labels Aug 20, 2022
@octavia-squidington-iii
Collaborator

cc @airbytehq/frontend

@nataliekwong
Contributor

I agree with change 1.

I don't agree with change 2, as users expect, as table stakes, that their data is mirrored from the source. If we introduce duplicates without their explicit opt-in, it could cause users to lose trust in the data moved and assume the product doesn't work.

That being said, the downside of choosing Full Refresh | Overwrite as the default is that credit consumption increases, which could cause users to feel that Airbyte is too expensive for them without realizing there is a cheaper alternative if they choose a different sync mode. I don't have an easy solution to that yet, but I think the product has to be trusted before the user reaches a pricing consideration.

@edmundito
Contributor

cc @andyjih

@andyjih
Contributor

andyjih commented Sep 27, 2022

I agree with Natalie. Change 1 sounds good; change 2 might surprise users, since the result doesn't immediately match what's in their source.
