Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Closes #5371 - Introduce 'sync' ping, add 'syncUuid' to all sync-related pings #5386

Merged
merged 1 commit into from
Apr 13, 2020

Conversation

grigoryk
Copy link
Contributor

This patch introduce an 'overarching' sync ping, which is meant to contain information
describing a sync overall, vs individual engine runs. Currently it contains global sync
errors.

syncUuid was also added to all sync-related pings, allowing us to tie together in an
analysis all pings that were emitted as part of a single "sync".


Pull Request checklist

  • Quality: This PR builds and passes detekt/ktlint checks (A pre-push hook is recommended)
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Changelog: This PR includes a changelog entry or does not need one
  • Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features

After merge

  • Milestone: Make sure issues closed by this pull request are added to the milestone of the version currently in development.
  • Breaking Changes: If this is a breaking change, please push a draft PR on Reference Browser to address the breaking issues.

@codecov
Copy link

codecov bot commented Dec 20, 2019

Codecov Report

Merging #5386 into master will increase coverage by 0.02%.
The diff coverage is 82.50%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #5386      +/-   ##
============================================
+ Coverage     77.74%   77.77%   +0.02%     
+ Complexity     4623     4416     -207     
============================================
  Files           614      556      -58     
  Lines         22624    21082    -1542     
  Branches       3328     3062     -266     
============================================
- Hits          17589    16396    -1193     
+ Misses         3678     3417     -261     
+ Partials       1357     1269      -88     
Impacted Files Coverage Δ Complexity Δ
...ponents/service/fxa/sync/WorkManagerSyncManager.kt 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...nents/service/sync/logins/SyncableLoginsStorage.kt 0.00% <0.00%> (ø) 0.00 <0.00> (ø)
...components/support/sync/telemetry/SyncTelemetry.kt 87.71% <84.61%> (-3.37%) 41.00 <26.00> (+7.00) ⬇️
...ts/support/ktx/android/content/res/AssetManager.kt
...java/mozilla/components/support/ktx/kotlin/Char.kt
...ponents/support/ktx/android/org/json/JSONObject.kt
.../components/browser/menu/item/BrowserMenuSwitch.kt
...nts/browser/menu/item/BrowserMenuCompoundButton.kt
...ozilla/components/support/ktx/android/view/View.kt
... and 76 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 658f7f4...879b8e3. Read the comment docs.

@grigoryk grigoryk added the work in progress Not ready to land yet. Work in progress (WIP). label Dec 20, 2019
Copy link
Member

@linabutler linabutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks terrific and exactly what I was thinking, thanks so much @grigoryk! The only thing that looks off is the early return in processSyncTelemetry, everything else is great! 🚀

@grigoryk grigoryk force-pushed the issue5371SyncPingUUID branch from 73e17cd to f465730 Compare December 24, 2019 00:24
@grigoryk grigoryk added 🕵️‍♀️ needs review PRs that need to be reviewed and removed work in progress Not ready to land yet. Work in progress (WIP). labels Jan 6, 2020
@csadilek csadilek self-assigned this Jan 6, 2020
Copy link
Contributor

@Dexterp37 Dexterp37 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall, I'd recommend going with what @linacambridge suggests and, if possible, adding test coverage.

@grigoryk grigoryk force-pushed the issue5371SyncPingUUID branch 3 times, most recently from 592cf7a to 5542721 Compare April 2, 2020 01:39
@grigoryk grigoryk requested a review from Dexterp37 April 2, 2020 01:39
@grigoryk
Copy link
Contributor Author

grigoryk commented Apr 2, 2020

@Dexterp37 @linacambridge finally got around to finishing this - addressed your feedback and added tests. Tests ended up being a little tricky to write, but not insurmountable.

@grigoryk
Copy link
Contributor Author

grigoryk commented Apr 2, 2020

CI is failing because bintray seems to be having a bad day:
[task 2020-04-02T01:42:22.078Z] > Could not GET 'https://jcenter.bintray.com/org/jetbrains/kotlinx/kotlinx-coroutines-core/1.1.1/kotlinx-coroutines-core-1.1.1.pom'. Received status code 502 from server: Bad Gateway

@grigoryk
Copy link
Contributor Author

grigoryk commented Apr 2, 2020

Request for data collection review form

All questions are mandatory. You must receive review from a data steward peer on your responses to these questions before shipping new data collection.

  1. What questions will you answer with this data?

About sync telemetry overall:
We would like to measure the performance and correctness of our Rust sync implementation. This includes collecting the time taken to sync each data type (currently history and bookmarks), incoming and outgoing record counts, any errors that occur (reporting sanitized error messages in a string field), and, for bookmarks, tree structure problem counts.

About this change:
An addition of the sync ping and syncUuid allows us to associate together individual data type sync pings which were executed together as part of a single "sync". It also allows us to more accurately observe when and how often data synchronization happens, which lets us judge health of the associated schedulers from which we can infer how well sync is behaving from user's point of view.

  1. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements? Some example responses:

Sync is one of the core features of our products; we need to know that the implementations of it that we're shipping are healthy.

  1. What alternative methods did you consider to answer these questions? Why were they not sufficient?

If we want to fully understand health of the sync implementation we're currently shipping, there really isn't an alternative to this approach. It maps directly to how sync operates. Any alternative that's sufficiently different wouldn't map as directly to the actual implementation (i.e., wouldn't reflect reality as well).

  1. Can current instrumentation answer these questions?
    Sort of; we can see how individual data type syncs are behaving, but we can't quite tell if Sync overall is functioning correctly, since the current telemetry doesn't allow us to tie together pings for different data type syncs which all run together.

  2. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.

Measurement Description Data Collection Category Tracking Bug #
Global sync ping, which includes a global failureReason and a global syncUuid unique to a single Sync Category 1 #5371
history-sync.syncUuid metric, same value as the global syncUuid Category 1 #5371
bookmarks-sync.syncUuid metric, same value as the global syncUuid Category 1 #5371
logins-sync.syncUuid metric, same value as the global syncUuid Category 1 #5371
  1. How long will this data be collected? Choose one of the following:

I want to permanently monitor this data. Grisha Kruglov + application-services team.

  1. What populations will you measure?

All populations for all products which ship relevant a-c components.

  1. If this data collection is default on, what is the opt-out mechanism for users?

Standard Glean SDK mechanism.

  1. Please provide a general description of how you will analyze this data.

Redash/GLAM (when available) dashboards.

  1. Where do you intend to share the results of your analysis?

Internally within sync, application-services, android-components and fenix teams.

  1. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection? If so:

No.

@grigoryk grigoryk requested a review from boek April 2, 2020 02:02
@grigoryk grigoryk changed the title Closes #5371 - Introduce 'sync' ping, add 'syncUuid' to all sync-related pings Closes #5371 - Introduce 'sync' ping, add 'syncUuid' to all sync-related pings Apr 2, 2020
@grigoryk grigoryk added the E5 Estimation points: 5 label Apr 2, 2020
Copy link
Contributor

@Dexterp37 Dexterp37 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me from a telemetry point of view. I left a few comments below for things that are still recommended to be changed. Let me know if anything is unclear.

@liuche
Copy link
Contributor

liuche commented Apr 2, 2020

@grigoryk could you be just a little more specific in the review request form, "How Sync implementations are behaving in the wild." is pretty vague. I think there are good explanations sprinkled across the original issue, and in docs, but this form is meant to be a single place to look quickly and understand what this data collection is doing.

@grigoryk grigoryk force-pushed the issue5371SyncPingUUID branch from 5542721 to 8e0e6a3 Compare April 2, 2020 19:03
@grigoryk grigoryk requested a review from Dexterp37 April 2, 2020 19:04
@grigoryk
Copy link
Contributor Author

grigoryk commented Apr 2, 2020

@Dexterp37 thanks for feedback, addressed your comments.

@grigoryk
Copy link
Contributor Author

grigoryk commented Apr 2, 2020

@liuche I've updated the data review form based on #3092. Let me know if this is clearer, or if you'd like to see more detail.

Copy link
Contributor

@csadilek csadilek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ from a-c side so it can land once other reviews come in :).

@liuche
Copy link
Contributor

liuche commented Apr 3, 2020

@boek has a data review in flight, I was just doing some drive-by commenting :P

notification_emails:
- sync-core@mozilla.com
data_reviews:
- TODO
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder to update the link to the data review response after you have one. Here and in the other yaml files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, waiting for that, hence a TODO :)

@grigoryk grigoryk linked an issue Apr 6, 2020 that may be closed by this pull request
Copy link
Contributor

@boek boek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data Review Form (to be filled by Data Stewards)

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?
    Yes, metrics.yaml and metrics.md

  2. Is there a control mechanism that allows the user to turn the data collection on and off?
    Yes, using the Glean SDK API

  3. If the request is for permanent data collection, is there someone who will monitor the data over time?
    Up to the client.

  4. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
    Cat 1

  5. Is the data collection request for default-on or default-off?
    Up to the client.

  6. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
    A new GUID used to group the separate pings all related to sync data

  7. Is the data collection covered by the existing Firefox privacy notice?
    Yes

  8. Does there need to be a check-in in the future to determine whether to renew the data?
    Up to the client.

  9. Does the data collection use a third-party collection tool?
    no

… all sync-related pings

This patch introduce an 'overarching' sync ping, which is meant to contain information
describing a sync overall, vs individual engine runs. Currently it contains global sync
errors.

syncUuid was also added to all sync-related pings, allowing us to tie together in an
analysis all pings that were emitted as part of a single "sync".
@grigoryk grigoryk force-pushed the issue5371SyncPingUUID branch from 8e0e6a3 to 879b8e3 Compare April 13, 2020 19:42
@grigoryk
Copy link
Contributor Author

bors r=csadilek,Dexterp37,boek

@bors
Copy link

bors bot commented Apr 13, 2020

Build succeeded

@bors bors bot merged commit 48e3b99 into mozilla-mobile:master Apr 13, 2020
@grigoryk grigoryk deleted the issue5371SyncPingUUID branch April 13, 2020 19:58
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
E5 Estimation points: 5 🕵️‍♀️ needs review PRs that need to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add global "sync ping", add UUID to all sync pings
6 participants