Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Closes #4556: Passwords sync telemetry support #5294

Merged
merged 5 commits into from
Dec 19, 2019

Conversation

grigoryk
Copy link
Contributor

This adds a new ping to support-telemetry-sync - passwords_sync, which is then used in services-firefox-accounts and in service-sync-logins to emit password sync telemetry whenever this engine is synchronized. New ping is a verbatim copy of the history_sync ping.

samples-sync app was also updated to demonstrate to how synchronize passwords, and how to protect encryption key at rest used for the password storage.


Pull Request checklist

  • Quality: This PR builds and passes detekt/ktlint checks (A pre-push hook is recommended)
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Changelog: This PR includes a changelog entry or does not need one
  • Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features

After merge

  • Milestone: Make sure issues closed by this pull request are added to the milestone of the version currently in development.
  • Breaking Changes: If this is a breaking change, please push a draft PR on Reference Browser to address the breaking issues.

@grigoryk grigoryk force-pushed the issue4556loginsSyncPing branch from b402a88 to aecc833 Compare December 12, 2019 00:43
@codecov
Copy link

codecov bot commented Dec 12, 2019

Codecov Report

Merging #5294 into master will increase coverage by <.01%.
The diff coverage is 85.29%.

Impacted file tree graph

@@             Coverage Diff             @@
##             master   #5294      +/-   ##
===========================================
+ Coverage      80.3%   80.3%   +<.01%     
+ Complexity     4191    4184       -7     
===========================================
  Files           542     545       +3     
  Lines         19117   19136      +19     
  Branches       2760    2760              
===========================================
+ Hits          15352   15368      +16     
- Misses         2607    2616       +9     
+ Partials       1158    1152       -6
Impacted Files Coverage Δ Complexity Δ
...ponents/service/fxa/sync/WorkManagerSyncManager.kt 0% <0%> (ø) 0 <0> (ø) ⬇️
...components/support/sync/telemetry/SyncTelemetry.kt 91.08% <96.66%> (+2.35%) 34 <10> (+10) ⬆️
...s/browser/storage/memory/InMemoryHistoryStorage.kt
...mozilla/components/support/locale/LocaleManager.kt 97.61% <0%> (ø) 11% <0%> (?)
...va/mozilla/components/support/locale/Extensions.kt 75% <0%> (ø) 0% <0%> (?)
...nts/support/locale/LocaleAwareAppCompatActivity.kt 0% <0%> (ø) 0% <0%> (?)
...omponents/support/locale/LocaleAwareApplication.kt 0% <0%> (ø) 0% <0%> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 54e790c...5d3f7f4. Read the comment docs.

@grigoryk
Copy link
Contributor Author

grigoryk commented Dec 12, 2019

Copied from #3092 (comment), as this telemetry is covering the same system. I've highlighted my changes to the original request.

Request for data collection review form

  1. What questions will you answer with this data?

We would like to measure the performance and correctness of our Rust sync implementation. This includes collecting the time taken to sync each data type (currently history and bookmarks and passwords), incoming and outgoing record counts, any errors that occur (reporting sanitized error messages in a string field), and, for bookmarks, tree structure problem counts.

With the exception of the error string, which does not contain PII, we're submitting timings and counts only.

  1. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

We need to understand how our new Sync implementation behaves in the wild. The Sync ping in Firefox Desktop and Firefox for iOS already exists, and has been extremely valuable in identifying and diagnosing Sync issues.

This pull request collects the same information as the Sync ping, but ports its structure to Glean.

  1. What alternative methods did you consider to answer these questions? Why were they not sufficient?

Server-side metrics are not sufficient to understand Sync performance (especially for each step of an engine sync), given that the bulk of the work happens on clients. Validation data can only be collected on the client, since Sync records are encrypted and opaque to the server. The Sync ping for Desktop and iOS provides some stats about Desktop, but, since all three still use different Sync implementations, can't be extrapolated to Fenix.

  1. Can current instrumentation answer these questions?

Android Components consumers, including Fenix, do not currently report any Sync telemetry.
Android Components consumers, such as Fenix, currently include Sync telemetry for history and bookmarks. There is currently no similar telemetry for passwords.

  1. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories on the Mozilla wiki.
Measurement Description Data Collection Category Tracking Bug #
**Timings, counts, and failure reasons for passwords syncs** Interaction data #4556
  1. How long will this data be collected? Choose one of the following:

I want to permanently monitor this data. (Grisha Kruglov, on behalf of the sync team, e.g. Lina Cambridge et al)

  1. What populations will you measure?

All users with sync enabled, in products that use service-firefox-accounts and service-sync-logins. Currently, these products include Fenix, Lockwise and Firefox Reality
All Sync users with Sync enabled.

The data is not correlated to the client_id; instead, we send a hash of the user's Firefox account ID (uid). This does not expose new identifiers, as these are already submitted in the Sync ping on other platforms.

  • Which release channels?

Presumably, all (see individual product owners for details on their sync integration).

  • Which countries?

Presumably, all (see individual product owners for details on their sync integration).

  • Which locales?

Presumably, all (see individual product owners for details on their sync integration).

  • Any other filters? Please describe in detail below.

Presumably, no (see individual product owners for details on their sync integration).

  1. If this data collection is default on, what is the opt-out mechanism for users?

Users can opt-out by disabling telemetry, or signing out of Sync.

  1. Please provide a general description of how you will analyze this data.

We will create queries in re:dash to monitor the engines, adding to our existing sync engine error/success (https://sql.telemetry.mozilla.org/dashboard/sync-leif-status-dashboard-wip) dashboards and engine error analysis notebook for Desktop (https://gist.github.com/mhammond/66684669e1478d65bd60446cf150c244).

  1. Where do you intend to share the results of your analysis?

See above.

  1. Is there a third-party tool (i.e. not Telemetry) that you are proposing to use for this data collection?

No.

@grigoryk grigoryk force-pushed the issue4556loginsSyncPing branch from aecc833 to 346b509 Compare December 12, 2019 01:02
@grigoryk grigoryk changed the title Passwords sync telemetry support Closes #4556: Passwords sync telemetry support Dec 12, 2019
@grigoryk grigoryk added the 🕵️‍♀️ needs review PRs that need to be reviewed label Dec 12, 2019
Copy link
Contributor

@rocketsroger rocketsroger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@grigoryk grigoryk force-pushed the issue4556loginsSyncPing branch from 346b509 to 7a63bca Compare December 12, 2019 20:48
@grigoryk
Copy link
Contributor Author

@Dexterp37 I've addressed your comment regarding naming. This is now blocked on mozilla/glean_parser#146, as otherwise it won't compile since glean-parser currently doesn't consider passwords-sync to be a valid name. Do you know when parser support is expected to be updated in a-c?

@grigoryk
Copy link
Contributor Author

Glean parser update will land in #5301

Copy link
Contributor

@liuche liuche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data-review+

Data Review Form (to be filled by Data Stewards)

  1. Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?

Yes, ping is documented in metrics.md, and the possible ping values for Password sync failures, timestamps, and related metadata are listed there.

  1. Is there a control mechanism that allows the user to turn the data collection on and off?

Yes, downstream consumers (such as Fenix) can control this data collection with the data controls provided by glean.

  1. If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, this includes automated tests for the pings being sent, and will be monitored by @grigoryk Grisha Kruglov

  1. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Type 2, sync behavior

  1. Is the data collection request for default-on or default-off?

Whatever consumer sets

  1. Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?

Hashed FxA id

  1. Is the data collection covered by the existing Firefox privacy notice?

Yes

  1. Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)**

This is permanent collection, but includes automated tests

  1. Does the data collection use a third-party collection tool? If yes, escalate to legal.

No

@grigoryk
Copy link
Contributor Author

Thanks, @liuche. Just waiting for #5301 to land now.

components/support/sync-telemetry/pings.yaml Outdated Show resolved Hide resolved
// so we check for the boolean flag that indicates if this happened or not.
// There's a complete mismatch between what Glean supports and what we need
// it to do here. They don't support "nested metrics" and so we resort to these hacks.
// There's a complete mismatch between what Glean supports and what we need it to do here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit unfair, to be honest.

The sync team, the Glean teams and DS designed and agreed on the solution to send "one ping per engine sync". This is not, IMHO, a shortcoming of the SDK: this is doing what we all agreed on.

We can revisit the decision, of course.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For folks following along, we discussed this, and other feedback we had from the integration, in a Glean // Sync team meeting.

We also agreed on some outcomes to make this better, including:

  1. Tag each history_sync, bookmarks_sync, and logins_sync ping with a "sync UUID", that we can use to join these datasets. This is Add a per-sync UUID for telemetry mozilla/application-services#2381.
  2. Consider adding a top-level syncs ping, also tagged with the same UUID, so that we can report top-level errors (this will address the "reporting global errors multiple times" problem that this comment mentions) and denormalized stats (thanks @rfk for this wonderful suggestion after!)
  3. Investigate ways to reuse groups of metrics, to avoid the duplication that we see here. For example, instantiating a "template" engine ping for bookmarks, history, and so on, or a way to inherit between categories.
  4. Eventually, push metrics collection down into a-s using the Rust API, and remove these converters from a-c entirely. This would fix many of the clunky ergonomics around capturing metrics in a-s, then passing them up to a-c and unpacking them here.

I think when we agreed on the one-ping-per-engine approach before, we didn't know enough about how the integration would work, and settled on it as a short-term solution for getting telemetry unblocked. We were also coming to this with the mindset of "we've already done the Sync ping integration work on Desktop and iOS, why can't we bring it over," and, from that perspective, the lack of nested structs made things hard. That mindset was driven by time pressures of getting metrics in before the Fenix launch.

Now that we have more breathing room, we can improve this, and address all the points in this comment! 🚀

Copy link
Contributor

@linabutler linabutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @grigoryk! I have a preference for naming it logins-sync, not passwords-sync, but I'll trust your judgement if you prefer the latter! 🚢:shipit:🚀

// so we check for the boolean flag that indicates if this happened or not.
// There's a complete mismatch between what Glean supports and what we need
// it to do here. They don't support "nested metrics" and so we resort to these hacks.
// There's a complete mismatch between what Glean supports and what we need it to do here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For folks following along, we discussed this, and other feedback we had from the integration, in a Glean // Sync team meeting.

We also agreed on some outcomes to make this better, including:

  1. Tag each history_sync, bookmarks_sync, and logins_sync ping with a "sync UUID", that we can use to join these datasets. This is Add a per-sync UUID for telemetry mozilla/application-services#2381.
  2. Consider adding a top-level syncs ping, also tagged with the same UUID, so that we can report top-level errors (this will address the "reporting global errors multiple times" problem that this comment mentions) and denormalized stats (thanks @rfk for this wonderful suggestion after!)
  3. Investigate ways to reuse groups of metrics, to avoid the duplication that we see here. For example, instantiating a "template" engine ping for bookmarks, history, and so on, or a way to inherit between categories.
  4. Eventually, push metrics collection down into a-s using the Rust API, and remove these converters from a-c entirely. This would fix many of the clunky ergonomics around capturing metrics in a-s, then passing them up to a-c and unpacking them here.

I think when we agreed on the one-ping-per-engine approach before, we didn't know enough about how the integration would work, and settled on it as a short-term solution for getting telemetry unblocked. We were also coming to this with the mindset of "we've already done the Sync ping integration work on Desktop and iOS, why can't we bring it over," and, from that perspective, the lack of nested structs made things hard. That mindset was driven by time pressures of getting metrics in before the Fenix launch.

Now that we have more breathing room, we can improve this, and address all the points in this comment! 🚀

components/support/sync-telemetry/metrics.yaml Outdated Show resolved Hide resolved
components/support/sync-telemetry/metrics.yaml Outdated Show resolved Hide resolved
@grigoryk grigoryk force-pushed the issue4556loginsSyncPing branch from 7a63bca to 9791bea Compare December 19, 2019 20:11
@grigoryk grigoryk force-pushed the issue4556loginsSyncPing branch from 9791bea to 5d3f7f4 Compare December 19, 2019 20:13
@grigoryk
Copy link
Contributor Author

grigoryk commented Dec 19, 2019

Thanks for the reviews all, and thank you @linacambridge for the excellent summary of our discussions. @Dexterp37 my apologies for that comment - it was written from a somewhat myopic point of view, which didn't take into account the larger picture.

@grigoryk
Copy link
Contributor Author

bors r=rocketsroger,liuche

bors bot pushed a commit that referenced this pull request Dec 19, 2019
5285: Closes #5284: Adds progress bar to download notification r=pocmo a=sblatz



5294: Closes #4556: Passwords sync telemetry support r=rocketsroger,liuche a=grigoryk

This adds a new ping to `support-telemetry-sync` - passwords_sync, which is then used in `services-firefox-accounts` and in `service-sync-logins` to emit password sync telemetry whenever this engine is synchronized. New ping is a verbatim copy of the history_sync ping.

`samples-sync` app was also updated to demonstrate to how synchronize passwords, and how to protect encryption key at rest used for the password storage.



5360: Closes #5315 - Create a Top Sites storage component r=pocmo a=gabrielluong



Co-authored-by: Sawyer Blatz <sdblatz@gmail.com>
Co-authored-by: Grisha Kruglov <gkruglov@mozilla.com>
Co-authored-by: Gabriel Luong <gabriel.luong@gmail.com>
@bors
Copy link

bors bot commented Dec 19, 2019

This PR was included in a batch that timed out, it will be automatically retried

bors bot pushed a commit that referenced this pull request Dec 19, 2019
5294: Closes #4556: Passwords sync telemetry support r=rocketsroger,liuche a=grigoryk

This adds a new ping to `support-telemetry-sync` - passwords_sync, which is then used in `services-firefox-accounts` and in `service-sync-logins` to emit password sync telemetry whenever this engine is synchronized. New ping is a verbatim copy of the history_sync ping.

`samples-sync` app was also updated to demonstrate to how synchronize passwords, and how to protect encryption key at rest used for the password storage.



5360: Closes #5315 - Create a Top Sites storage component r=pocmo a=gabrielluong



Co-authored-by: Grisha Kruglov <gkruglov@mozilla.com>
Co-authored-by: Gabriel Luong <gabriel.luong@gmail.com>
@bors
Copy link

bors bot commented Dec 19, 2019

Build succeeded

  • complete-push

@bors bors bot merged commit 5d3f7f4 into mozilla-mobile:master Dec 19, 2019
@Dexterp37
Copy link
Contributor

@Dexterp37 my apologies for that comment - it was written from a somewhat myopic point of view, which didn't take into account the larger picture.

No problem! I'm glad we're all aligned now :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
🕵️‍♀️ needs review PRs that need to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants