This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Closes #10371: Telemetry for sync engines - credit cards, addresses, tabs #10372

Merged: 1 commit, May 28, 2021

grigoryk
Contributor

@grigoryk grigoryk commented May 28, 2021

Fixes #10371

Pull Request checklist

  • Quality: This PR builds and passes detekt/ktlint checks (A pre-push hook is recommended)
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Changelog: This PR includes a changelog entry or does not need one
  • Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features

After merge

  • Milestone: Make sure issues closed by this pull request are added to the milestone of the version currently in development.
  • Breaking Changes: If this is a breaking change, please push a draft PR on Reference Browser to address the breaking issues.

@codecov

codecov bot commented May 28, 2021

Codecov Report

Merging #10372 (5dc0a7c) into master (0474c53) will decrease coverage by 13.25%.
The diff coverage is 5.71%.

@@              Coverage Diff              @@
##             master   #10372       +/-   ##
=============================================
- Coverage     74.36%   61.10%   -13.26%     
+ Complexity     6314      397     -5917     
=============================================
  Files           843       58      -785     
  Lines         32060     2435    -29625     
  Branches       5309      444     -4865     
=============================================
- Hits          23840     1488    -22352     
+ Misses         5495      751     -4744     
+ Partials       2725      196     -2529     
Impacted Files Coverage Δ
...components/support/sync/telemetry/SyncTelemetry.kt 63.47% <5.71%> (-22.86%) ⬇️
...onents/browser/state/reducer/LocaleStateReducer.kt
...a/mozilla/components/lib/crash/db/CrashDatabase.kt
...onents/browser/state/reducer/EngineStateReducer.kt
.../components/support/ktx/android/content/Context.kt
...ponents/feature/downloads/provider/FileProvider.kt
...java/mozilla/components/lib/crash/CrashReporter.kt
...components/feature/awesomebar/AwesomeBarFeature.kt
...ponents/browser/state/reducer/LastAccessReducer.kt
...feature/addons/update/db/UpdateAttemptsDatabase.kt
... and 776 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0474c53...5dc0a7c.

@grigoryk
Contributor Author

Copied from #5294 (comment), as this telemetry is covering the same system. I've highlighted my changes to the original request.

Request for data collection review form

  1. What questions will you answer with this data?

We would like to measure the performance and correctness of our Rust sync implementation. This includes collecting the time taken to sync each data type (currently history, bookmarks, passwords, credit cards, tabs and addresses), incoming and outgoing record counts, any errors that occur (reporting sanitized error messages in a string field), and, for bookmarks, tree structure problem counts.

With the exception of the error string, which does not contain PII, we're submitting timings and counts only.

  1. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

We need to understand how our Sync implementation behaves in the wild. Existing Sync telemetry has been valuable in monitoring overall system health and detecting issues.

  1. What alternative methods did you consider to answer these questions? Why were they not sufficient?

Server-side metrics are not sufficient to understand Sync performance (especially for each step of an engine sync), given that the bulk of the work happens on clients. Validation data can only be collected on the client, since Sync records are encrypted and opaque to the server. The Sync ping for Desktop provides some stats about Desktop, but because all three still use different Sync implementations, those stats can't be extrapolated to Fenix.

  1. Can current instrumentation answer these questions?

No. There is no existing telemetry coverage for these new engines.

  1. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories on the Mozilla wiki.

| Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- |
| **Timings, counts, and failure reasons for credit card syncs** | Interaction data | #10371 |
| **Timings, counts, and failure reasons for tabs syncs** | Interaction data | #10371 |
| **Timings, counts, and failure reasons for addresses syncs** | Interaction data | #10371 |

  1. How long will this data be collected? Choose one of the following:

I want to permanently monitor this data. (Grisha Kruglov, on behalf of the sync team)

  1. What populations will you measure?

All users with sync enabled, in products that use service-firefox-accounts and service-sync-logins. Currently, the main consumer of these components is Fenix.

The data is not correlated to the client_id; instead, we send a hash of the user's Firefox account ID (uid). This does not expose new identifiers, as these are already submitted in the Sync ping on other platforms.

  • Which release channels?

Presumably, all (see individual product owners for details on their sync integration).

  • Which countries?

Presumably, all (see individual product owners for details on their sync integration).

  • Which locales?

Presumably, all (see individual product owners for details on their sync integration).

  • Any other filters? Please describe in detail below.

Presumably, no (see individual product owners for details on their sync integration).

  1. If this data collection is default on, what is the opt-out mechanism for users?

Users can opt out by disabling telemetry or by signing out of Sync.

  1. Please provide a general description of how you will analyze this data.

We will expand existing sync health dashboards to monitor these engines.

  1. Where do you intend to share the results of your analysis?

Within the sync and mobile teams.

  1. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection?

No.

@travis79
Member

Data Review

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, through the metrics definition file and the Glean Dictionary.

  1. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, through the standard telemetry opt-out provided by an app consuming these components, or by signing out of Sync.

  1. If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Grisha Kruglov will monitor on behalf of the Sync team.

  1. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, User Interaction Data

  1. Is the data collection request for default-on or default-off?

Default-on

  1. Does the instrumentation include the addition of any new identifiers?

No new identifiers, and it is noted that this telemetry is not associated with the Glean client_id but instead with the existing hashed FxA identifier.

  1. Is the data collection covered by the existing Firefox privacy notice?

Yes

  1. Does the data collection use a third-party collection tool?

No

Result

data-review+

cc @grigoryk

@grigoryk grigoryk force-pushed the syncTelemetryNew branch from dccfab0 to aa8860d Compare May 28, 2021 17:33
@grigoryk grigoryk force-pushed the syncTelemetryNew branch from aa8860d to 5dc0a7c Compare May 28, 2021 17:33
@grigoryk grigoryk added the 🛬 needs landing PRs that are ready to land label May 28, 2021
@mergify mergify bot merged commit a9ef237 into mozilla-mobile:master May 28, 2021
@grigoryk grigoryk deleted the syncTelemetryNew branch June 3, 2021 16:45
Development

Successfully merging this pull request may close these issues.

Missing sync telemetry for credit cards, addresses, open tabs engines