Google Analytics source: expose “isDataGolden” flag #12013
Comments
When using the "incremental" sync mode, isn't this attribute a make-or-break aspect that the connector's logic should use along with its cursor field?
If the data from the report has not yet been declared golden by the Google API, the associated rows should keep replicating during subsequent incremental syncs until the data is flagged "golden" (no further updates required). Exposing the |
@ChristopheDuong just to clarify: would it be enough to extend the schemas and records with a new field within this issue? Or should we go further and make a compound cursor field? Or a compound primary key? Or both? |
|
@davydov-d to echo chris' point, when The reason relates to the definition of that flag. From the Google docs:
If the value is false for a report, this means the result of the report will be updated in the future, and we should therefore resync this date period. |
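One way to act on the quoted definition is to let the cursor advance only across dates whose data is already golden, so that any non-golden date is requested again on the next sync. The sketch below is illustrative only and is not the connector's code: `advance_cursor`, the `ga_date` field name, and the ISO date strings are assumptions made for the example, and a missing `isDataGolden` value is treated as false.

```python
from typing import Dict, Iterable, Mapping, Optional


def advance_cursor(
    current: Optional[str],
    records: Iterable[Mapping[str, object]],
    cursor_field: str = "ga_date",  # hypothetical field name for this sketch
) -> Optional[str]:
    """Move the cursor forward only across dates whose data is golden.

    Dates are assumed to be ISO strings (YYYY-MM-DD), so they sort
    lexicographically in chronological order.
    """
    # A date counts as golden only if every record seen for it is golden.
    golden_by_date: Dict[str, bool] = {}
    for record in records:
        day = str(record[cursor_field])
        golden = bool(record.get("isDataGolden", False))
        golden_by_date[day] = golden_by_date.get(day, True) and golden

    new_cursor = current
    for day in sorted(golden_by_date):
        if not golden_by_date[day]:
            # Stop at the first provisional date so it is synced again later.
            break
        if new_cursor is None or day > new_cursor:
            new_cursor = day
    return new_cursor
```

With this approach the stream state stays a single date, at the cost of re-reading everything after the first provisional date on each sync.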
@sherifnada I got the point, thanks. The problem is |
never mind, that's not a problem since we can have a more complex structure of the stream state |
@ThaliaBarrera do you know how long the |
I've already started implementing the second option and want to add that it's going to be a much more complicated solution, since the Python CDK does not support compound cursor fields, only nested ones, which makes it challenging. At the same time, it looks like syncing data for the 2 previous days does not require any manipulation of the cursor field |
@davydov-d I think 2 days is probably the right lookback window. These docs shared by Thalia indicate data processing time is 24-48 hours. So we should probably go for the following solution:
|
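A two-day lookback window like the one discussed above could be sketched as follows. This is a minimal illustration, not the merged implementation: `sync_start_date`, `LOOKBACK_DAYS`, and the `config_start` parameter are hypothetical names, and the value of 2 comes from the 24-48 hour processing time mentioned in the thread.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: two days covers the 24-48 hour processing time cited above.
LOOKBACK_DAYS = 2


def sync_start_date(cursor: Optional[date], config_start: date, today: date) -> date:
    """Return the first date to request in an incremental sync."""
    if cursor is None:
        # First sync: no saved state yet, start from the configured date.
        return config_start
    lookback_start = today - timedelta(days=LOOKBACK_DAYS)
    # Re-read at least the last two days, but never go before config_start.
    return max(config_start, min(cursor, lookback_start))
```

The cursor field itself stays a plain date here; only the request window is widened, which is why this option avoids a compound cursor.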
* #12013 source GA to Beta: always sync data from two days ago
* #12013 GA to Beta: fix changelog
* #12013 source GA to Beta: rm odd file
* #12013 Source GA to Beta: comment out integration tests
* #12013 expose isDataGolden field, assume missing field equals False
* #12013 expose isDataGOlden flag: reword docs
* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Tell us about the problem you're trying to solve
Google Analytics Reporting API v4 may return provisional or incomplete data, usually when the data is fresh. When this occurs, the response sets the “isDataGolden” flag to false, and the connector logs a warning to the sync log.
A warning in the logs acts as a heads-up, but it doesn't help filter out non-golden data during analysis. Having the flag replicated to the destination would be more helpful.
Describe the solution you’d like
I'd like the “isDataGolden” flag to be replicated to the destination by the GA connector.
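To make the request concrete, here is a rough sketch of how a single Reporting API v4 report could be flattened into records that carry the flag. It assumes the v4 response layout (`columnHeader`, `data.rows`, `data.isDataGolden`); `report_to_records` is a hypothetical helper rather than the connector's actual code, and a missing flag is treated as false.

```python
from typing import Any, Dict, Iterator


def report_to_records(report: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Flatten one report into records, copying isDataGolden onto each row."""
    header = report.get("columnHeader", {})
    dim_names = header.get("dimensions", [])
    metric_names = [
        entry["name"]
        for entry in header.get("metricHeader", {}).get("metricHeaderEntries", [])
    ]

    data = report.get("data", {})
    # The flag is reported once per report; a missing flag is treated as False.
    is_golden = bool(data.get("isDataGolden", False))

    for row in data.get("rows", []):
        record: Dict[str, Any] = dict(zip(dim_names, row.get("dimensions", [])))
        # Only the first date range's metric values are used in this sketch.
        metrics = row.get("metrics") or [{}]
        record.update(zip(metric_names, metrics[0].get("values", [])))
        record["isDataGolden"] = is_golden
        yield record
```

Downstream, provisional rows could then be excluded with a simple filter on the `isDataGolden` column.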
Are you willing to submit a PR?
Yes