We currently set a default sync mode as we discussed in April, which is documented in #9625 (comment). User feedback since then suggests that what we outlined at that time might not be the ideal solution. I'd therefore like to discuss two potential changes to the default sync mode we select for new connections:
Change 1
We currently prefer "incremental, append_dedupe" if the source has a source-defined cursor. However, we don't check whether it also has a source-defined primary key. This mode can therefore become the default while users still have to select primary keys for those streams manually. Since our original intent was to make the setup as low-friction as possible, I'd suggest that we only select this mode as the default if the source has a source-defined cursor AND a source-defined primary key.
Pros:
Less friction during setup: whenever we select a sync mode by default, all fields it requires are already selected, so we never force the user to select fields manually.
Cons:
We would less often hint to the user that a stream could be incremental (and that they'd "just" need to select a primary key to enable it).
My opinion: we should only select incremental, append_dedupe if all the fields it requires are defined by the source, so the user doesn't need to configure anything manually.
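To make the difference concrete, here is a minimal sketch of the current rule and the rule proposed in Change 1. The `Stream` class and both function names are hypothetical and not taken from the Airbyte codebase. The sketch also assumes the connection supports both modes and ignores any other rules from #9625.

```python
from dataclasses import dataclass

# Hypothetical names for illustration only; not the Airbyte implementation.
INCREMENTAL_DEDUPE = ("incremental", "append_dedupe")
FULL_REFRESH_OVERWRITE = ("full_refresh", "overwrite")


@dataclass(frozen=True)
class Stream:
    """Simplified view of what the source declares for a stream."""
    source_defined_cursor: bool
    source_defined_primary_key: bool


def current_default(stream: Stream) -> tuple:
    # Current behavior as described above: only the cursor is checked.
    if stream.source_defined_cursor:
        return INCREMENTAL_DEDUPE
    return FULL_REFRESH_OVERWRITE


def proposed_default(stream: Stream) -> tuple:
    # Change 1: require both a source-defined cursor and primary key.
    if stream.source_defined_cursor and stream.source_defined_primary_key:
        return INCREMENTAL_DEDUPE
    return FULL_REFRESH_OVERWRITE


# A stream with a cursor but no primary key is where the two rules differ.
cursor_only = Stream(source_defined_cursor=True, source_defined_primary_key=False)
print(current_default(cursor_only))   # user still has to pick a primary key
print(proposed_default(cursor_only))  # works without manual configuration
```

With the proposed rule, the cursor-only stream falls back to full_refresh, overwrite, which is exactly the case covered by the "Cons" point above.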
Change 2
We currently prefer "full_refresh, overwrite" over "incremental, append", and since "full_refresh, overwrite" should be supported by all connections, "incremental, append" will effectively never be selected. We could instead select "incremental, append" over "full_refresh, overwrite" when the source has a source-defined cursor.
Pros:
We'd make it easier for users to get incremental data (e.g. the Google Analytics source would currently fall into this category).
Cons:
incremental, append has a chance of creating duplicates (without dedupe). This might be a dangerous default behavior, since it might not be entirely clear to the user that our default setting might leave them with duplicate data.
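As a rough illustration of the duplicate concern, the toy model below appends every row whose cursor value is newer than the last synced value. All names and the `updated_at` cursor field are assumptions for this sketch, and it is not how Airbyte is implemented. An updated record is emitted again and lands next to its older version unless something dedupes by primary key.

```python
def sync_incremental_append(source_rows, destination, last_cursor):
    """Append rows with a cursor value newer than last_cursor; return the new cursor."""
    new_rows = [row for row in source_rows if row["updated_at"] > last_cursor]
    destination.extend(dict(row) for row in new_rows)
    return max([last_cursor] + [row["updated_at"] for row in new_rows])


def dedupe(rows, primary_key="id", cursor_field="updated_at"):
    """Keep only the latest version of each record, keyed by primary key."""
    latest = {}
    for row in rows:
        key = row[primary_key]
        if key not in latest or row[cursor_field] >= latest[key][cursor_field]:
            latest[key] = row
    return list(latest.values())


destination = []
source = [
    {"id": 1, "name": "a", "updated_at": 1},
    {"id": 2, "name": "b", "updated_at": 2},
]

# First sync copies both records.
cursor = sync_incremental_append(source, destination, last_cursor=0)

# Record 1 changes in the source, so the second sync emits it again.
source[0] = {"id": 1, "name": "a-edited", "updated_at": 3}
cursor = sync_incremental_append(source, destination, cursor)

ids = [row["id"] for row in destination]
deduped = dedupe(destination)
print(len(destination), ids.count(1), len(deduped))
```

After the second sync the destination holds two rows for id 1. The dedupe step collapses them, but only because a primary key is available, which is the dependency Change 1 is about.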
I don't agree with Change 2, as users expect their data to be mirrored from the source as table stakes. If we introduce duplicates without their explicit opt-in, it could cause users to lose trust in the data moved and assume the product doesn't work.
That being said, the downside of choosing Full Refresh | Overwrite as the default is that credit consumption is increased, which could cause users to feel that Airbyte is too expensive for them without realizing there is a cheaper alternative if they choose a different sync mode. I don't have an easy solution to that yet, but I think the product has to be trusted before the user reaches a pricing consideration.
@andyjih @sherifnada @nataliekwong Would be happy for your input on that.