[google-ads] Improve logging to identify why syncs are getting stuck #20219

Closed
brianjlai opened this issue Dec 8, 2022 · 2 comments · Fixed by #20755

Comments

@brianjlai
Contributor

Tell us about the problem you're trying to solve

Over the past couple of months, syncs for the google-ads source have intermittently gotten stuck and stopped making progress. A couple of OC issues have been filed in response. During those investigations, the platform team found that the worker pods appeared to be in a healthy state. The connectors team did a cursory pass through the code to see whether there were any places where the sync might be getting stuck, but this didn't yield any obvious results. The immediate mitigation is to restart the sync for customers, but longer term we should get a better sense of what is happening in the connector and why it stops emitting records to the destination.

OC:
974
1148

Describe the solution you’d like

Although it is not an immediate fix for the problem, we should start adding log statements around various parts of the google-ads code to see where things might not behave as we expect. Right now, with no logs visible during a stuck connector sync, there is very little we can do to investigate the root cause. On the latest issue, the streams that were not making progress were ad_group_ads and user_location_report.

A few initial places that might be worth adding some logging of dates or non-PII data:

  • streams.py.IncrementalGoogleAdsStream.read_records() - the infinite while loop until we get an exception seems a little suspicious
  • streams.py.IncrementalGoogleAdsStream.stream_slices() - might be worth seeing what slices we're iterating over
  • streams.py.IncrementalGoogleAdsStream.chunk_date_range() - the logic for how we make slices
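
As a rough illustration of the kind of logging suggested in the list above, here is a minimal sketch. It is not the connector's actual code: `chunk_date_range` and `read_records_logged` below are simplified, hypothetical stand-ins, and the `range_days` parameter, the slice keys, and the `fetch` callable are all assumptions made for the example. Only dates and record counts are logged, so no PII ends up in the output.

```python
import logging
import time
from datetime import date, timedelta
from typing import Any, Callable, Iterable, Iterator, Mapping

logger = logging.getLogger("google_ads_logging_sketch")


def chunk_date_range(start: date, end: date, range_days: int = 15) -> Iterator[Mapping[str, str]]:
    """Hypothetical stand-in: split [start, end] into inclusive windows of at most range_days days."""
    if range_days < 1:
        raise ValueError("range_days must be at least 1")
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=range_days - 1), end)
        stream_slice = {"start_date": cursor.isoformat(), "end_date": chunk_end.isoformat()}
        # Log every generated slice so a stuck sync shows which window it reached.
        logger.info("generated slice %s", stream_slice)
        yield stream_slice
        cursor = chunk_end + timedelta(days=1)


def read_records_logged(
    stream_name: str,
    slices: Iterable[Mapping[str, str]],
    fetch: Callable[[Mapping[str, str]], Iterable[Any]],
) -> Iterator[Any]:
    """Wrap a hypothetical per-slice fetch with start/finish logs and a record count."""
    for stream_slice in slices:
        started = time.monotonic()
        logger.info("stream=%s slice=%s: starting read", stream_name, stream_slice)
        count = 0
        for record in fetch(stream_slice):
            count += 1
            yield record
        elapsed = time.monotonic() - started
        logger.info(
            "stream=%s slice=%s: read %d records in %.1fs",
            stream_name, stream_slice, count, elapsed,
        )
```

With logs like these, a "starting read" line without a matching "read N records" line would point at the slice where the sync stalled.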

Once we have the logs in place, if this comes up again we should be able to either figure out what is causing the connector to get stuck or demonstrate that the connector is functioning as expected and the problem lies elsewhere.

@sherifnada
Contributor

@davydov-d @brianjlai fwiw this sounds like resource starvation. Did we already check for that, i.e. confirm that the container is not running out of CPU/memory?
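
One possible way to get a partial signal from inside the connector process is sketched below, using only the Python standard library. This is an assumption-laden illustration, not an existing Airbyte facility: `resource_snapshot` is a hypothetical helper, the `resource` module is Unix-only, and `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS. It also only sees the process itself, so container-level limits would still need to be checked through platform metrics.

```python
import logging
import resource
import time

logger = logging.getLogger("resource_snapshot_sketch")


def resource_snapshot() -> dict:
    """Hypothetical helper: return and log CPU time and peak memory of the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    snapshot = {
        # Total user + system CPU time consumed by this process, in seconds.
        "cpu_seconds": time.process_time(),
        # Peak resident set size (kilobytes on Linux, bytes on macOS).
        "max_rss": usage.ru_maxrss,
    }
    logger.info("resource snapshot: %s", snapshot)
    return snapshot
```

Logging such a snapshot periodically, for example once per slice, would show whether memory keeps climbing or CPU time stops advancing before a sync stalls.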

@davydov-d
Collaborator

@sherifnada I could not reproduce this issue locally. Is there a way a GL engineer (i.e. me) can perform this check in cloud?
cc @girarda as current OC eng
