Closes #4556: Passwords sync telemetry support #5294
Force-pushed from b402a88 to aecc833.
Codecov Report
Coverage Diff

| | master | #5294 | +/- |
|---|---|---|---|
| Coverage | 80.3% | 80.3% | +<.01% |
| Complexity | 4191 | 4184 | -7 |
| Files | 542 | 545 | +3 |
| Lines | 19117 | 19136 | +19 |
| Branches | 2760 | 2760 | |
| Hits | 15352 | 15368 | +16 |
| Misses | 2607 | 2616 | +9 |
| Partials | 1158 | 1152 | -6 |
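As a quick sanity check on the report above, the sketch below recomputes the coverage percentages from the hit, miss, and partial counts. It assumes coverage is computed as hits divided by the sum of hits, misses, and partials; the `coverage_pct` helper is hypothetical and only illustrates the arithmetic.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Coverage as a percentage, assuming lines = hits + misses + partials."""
    lines = hits + misses + partials
    return 100.0 * hits / lines


# Counts taken from the report: master first, then this PR.
base = coverage_pct(hits=15352, misses=2607, partials=1158)
head = coverage_pct(hits=15368, misses=2616, partials=1152)

# Difference in percentage points between the PR and master.
delta = head - base
print(f"master: {base:.2f}%  PR: {head:.2f}%  delta: {delta:+.4f}")
```

Under that assumption the three counts add up to the reported line totals, and the change stays below 0.01 percentage points, which matches the `+<.01%` shown in the report.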
Copied from #3092 (comment), as this telemetry is covering the same system. I've highlighted my changes to the original request.
Request for data collection review form
We would like to measure the performance and correctness of our Rust sync implementation. This includes collecting the time taken to sync each data type (currently history, bookmarks, and passwords), incoming and outgoing record counts, any errors that occur (reporting sanitized error messages in a With the exception of the error string, which does not contain PII, we're submitting timings and counts only.
We need to understand how our new Sync implementation behaves in the wild. The Sync ping in Firefox Desktop and Firefox for iOS already exists, and has been extremely valuable in identifying and diagnosing Sync issues. This pull request collects the same information as the Sync ping, but ports its structure to Glean.
Server-side metrics are not sufficient to understand Sync performance (especially for each step of an engine sync), given that the bulk of the work happens on clients. Validation data can only be collected on the client, since Sync records are encrypted and opaque to the server. The Sync ping for Desktop and iOS provides some stats about Desktop, but, since all three still use different Sync implementations, can't be extrapolated to Fenix.
I want to permanently monitor this data. (Grisha Kruglov, on behalf of the sync team, e.g. Lina Cambridge et al)
All users with sync enabled, in products that use service-firefox-accounts and service-sync-logins. Currently, these products include Fenix, Lockwise, and Firefox Reality. The data is not correlated to the
Presumably, all (see individual product owners for details on their sync integration).
Presumably, all (see individual product owners for details on their sync integration).
Presumably, all (see individual product owners for details on their sync integration).
Presumably, no (see individual product owners for details on their sync integration).
Users can opt-out by disabling telemetry, or signing out of Sync.
We will create queries in re:dash to monitor the engines, adding to our existing sync engine error/success (https://sql.telemetry.mozilla.org/dashboard/sync-leif-status-dashboard-wip) dashboards and engine error analysis notebook for Desktop (https://gist.github.com/mhammond/66684669e1478d65bd60446cf150c244).
See above.
No.
Force-pushed from aecc833 to 346b509.
🚢
Force-pushed from 346b509 to 7a63bca.
@Dexterp37 I've addressed your comment regarding naming. This is now blocked on mozilla/glean_parser#146, as otherwise it won't compile since glean-parser currently doesn't consider
Glean parser update will land in #5301
data-review+
Data Review Form (to be filled by Data Stewards)
- Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?
Yes, ping is documented in metrics.md, and the possible ping values for Password sync failures, timestamps, and related metadata are listed there.
- Is there a control mechanism that allows the user to turn the data collection on and off?
Yes, downstream consumers (such as Fenix) can control this data collection with the data controls provided by glean.
- If the request is for permanent data collection, is there someone who will monitor the data over time?
Yes, this includes automated tests for the pings being sent, and will be monitored by @grigoryk Grisha Kruglov
- Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Type 2, sync behavior
- Is the data collection request for default-on or default-off?
Whatever consumer sets
- Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
Hashed FxA id
- Is the data collection covered by the existing Firefox privacy notice?
Yes
- Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)
This is permanent collection, but includes automated tests
- Does the data collection use a third-party collection tool? If yes, escalate to legal.
No
// so we check for the boolean flag that indicates if this happened or not.
// There's a complete mismatch between what Glean supports and what we need
// it to do here. They don't support "nested metrics" and so we resort to these hacks.
// There's a complete mismatch between what Glean supports and what we need it to do here.
This is a bit unfair, to be honest.
The sync team, the Glean teams and DS designed and agreed on the solution to send "one ping per engine sync". This is not, IMHO, a shortcoming of the SDK: this is doing what we all agreed on.
We can revisit the decision, of course.
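To make the trade-off in this thread concrete, here is a minimal Python sketch of what "one ping per engine sync" implies for a nested sync record. The field names (`failure_reason`, `engines`, `took_ms`, `incoming`, `outgoing`) and the `flatten_sync` helper are hypothetical and are not taken from the Glean SDK or from this PR. The only point is that, without nested metrics, a sync-level failure has to be copied into every per-engine ping.

```python
from typing import Any, Dict, List


def flatten_sync(sync: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one nested sync record into one flat ping per engine.

    Hypothetical shapes: `sync` holds an optional sync-level
    `failure_reason` and a list of per-engine dicts.
    """
    pings = []
    for engine in sync["engines"]:
        pings.append({
            "ping_name": f"{engine['name']}_sync",
            "took_ms": engine["took_ms"],
            "incoming": engine["incoming"],
            "outgoing": engine["outgoing"],
            # Engine-level failure, if any.
            "engine_failure": engine.get("failure_reason"),
            # A flat ping has nowhere else to put the sync-level failure,
            # so it is repeated in every engine ping.
            "global_failure": sync.get("failure_reason"),
        })
    return pings


example_sync = {
    "failure_reason": "auth error",
    "engines": [
        {"name": "history", "took_ms": 120, "incoming": 5, "outgoing": 1},
        {"name": "bookmarks", "took_ms": 80, "incoming": 0, "outgoing": 2},
        {"name": "passwords", "took_ms": 40, "incoming": 3, "outgoing": 0},
    ],
}

flat_pings = flatten_sync(example_sync)
```

Each of the three resulting pings carries the same `global_failure` value, so anyone counting failures across pings would count that one failed sync three times.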
For folks following along, we discussed this, and other feedback we had from the integration, in a Glean // Sync team meeting.
We also agreed on some outcomes to make this better, including:
- Tag each `history_sync`, `bookmarks_sync`, and `logins_sync` ping with a "sync UUID", that we can use to join these datasets. This is tracked as "Add a per-sync UUID for telemetry" (mozilla/application-services#2381).
- Consider adding a top-level `syncs` ping, also tagged with the same UUID, so that we can report top-level errors (this will address the "reporting global errors multiple times" problem that this comment mentions) and denormalized stats (thanks @rfk for this wonderful suggestion after!)
- Investigate ways to reuse groups of metrics, to avoid the duplication that we see here. For example, instantiating a "template" engine ping for bookmarks, history, and so on, or a way to inherit between categories.
- Eventually, push metrics collection down into a-s using the Rust API, and remove these converters from a-c entirely. This would fix many of the clunky ergonomics around capturing metrics in a-s, then passing them up to a-c and unpacking them here.
I think when we agreed on the one-ping-per-engine approach before, we didn't know enough about how the integration would work, and settled on it as a short-term solution for getting telemetry unblocked. We were also coming to this with the mindset of "we've already done the Sync ping integration work on Desktop and iOS, why can't we bring it over," and, from that perspective, the lack of nested structs made things hard. That mindset was driven by time pressures of getting metrics in before the Fenix launch.
Now that we have more breathing room, we can improve this, and address all the points in this comment! 🚀
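The first two outcomes in the list above can be sketched roughly as follows. This is an illustration only: `tag_pings`, `join_by_sync_uuid`, and the `sync_uuid` and `failure_reason` fields are hypothetical names, not the API proposed in mozilla/application-services#2381. It shows a shared UUID stamped on every engine ping from one sync, a top-level `syncs` ping that carries the sync-level error exactly once, and an analysis-side join that regroups the pings.

```python
import uuid
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple

Ping = Dict[str, Any]


def tag_pings(engine_pings: List[Ping],
              failure_reason: Optional[str] = None) -> Tuple[Ping, List[Ping]]:
    """Stamp one shared sync UUID on every engine ping and build a top-level ping."""
    sync_uuid = str(uuid.uuid4())
    tagged = [dict(ping, sync_uuid=sync_uuid) for ping in engine_pings]
    # The sync-level error lives only here, not in each engine ping.
    top_level = {
        "ping_name": "syncs",
        "sync_uuid": sync_uuid,
        "failure_reason": failure_reason,
    }
    return top_level, tagged


def join_by_sync_uuid(top_level_pings: List[Ping],
                      engine_pings: List[Ping]) -> List[Ping]:
    """Regroup engine pings under their top-level ping, as a query might."""
    by_uuid = defaultdict(list)
    for ping in engine_pings:
        by_uuid[ping["sync_uuid"]].append(ping)
    return [dict(top, engines=by_uuid.get(top["sync_uuid"], []))
            for top in top_level_pings]


# Two separate syncs, one of which failed at the top level.
top_a, engines_a = tag_pings(
    [{"ping_name": "history_sync", "took_ms": 120},
     {"ping_name": "logins_sync", "took_ms": 40}],
    failure_reason="auth error",
)
top_b, engines_b = tag_pings([{"ping_name": "bookmarks_sync", "took_ms": 80}])

joined = join_by_sync_uuid([top_a, top_b], engines_a + engines_b)
```

With this shape the failure is reported once per sync, and the per-engine pings can still be analyzed on their own or joined back to their sync.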
Thanks @grigoryk! I have a preference for naming it `logins-sync`, not `passwords-sync`, but I'll trust your judgement if you prefer the latter! 🚢:shipit:🚀
Force-pushed from 7a63bca to 9791bea.
It is a verbatim copy of the HistoryPing.
Force-pushed from 9791bea to 5d3f7f4.
Thanks for the reviews all, and thank you @linacambridge for the excellent summary of our discussions. @Dexterp37 my apologies for that comment - it was written from a somewhat myopic point of view, which didn't take into account the larger picture.
bors r=rocketsroger,liuche
5285: Closes #5284: Adds progress bar to download notification r=pocmo a=sblatz
5294: Closes #4556: Passwords sync telemetry support r=rocketsroger,liuche a=grigoryk
This adds a new ping to `support-telemetry-sync` - passwords_sync, which is then used in `services-firefox-accounts` and in `service-sync-logins` to emit password sync telemetry whenever this engine is synchronized. New ping is a verbatim copy of the history_sync ping.
`samples-sync` app was also updated to demonstrate how to synchronize passwords, and how to protect the encryption key at rest used for the password storage.
5360: Closes #5315 - Create a Top Sites storage component r=pocmo a=gabrielluong
Co-authored-by: Sawyer Blatz <sdblatz@gmail.com>
Co-authored-by: Grisha Kruglov <gkruglov@mozilla.com>
Co-authored-by: Gabriel Luong <gabriel.luong@gmail.com>
This PR was included in a batch that timed out, it will be automatically retried
5294: Closes #4556: Passwords sync telemetry support r=rocketsroger,liuche a=grigoryk
This adds a new ping to `support-telemetry-sync` - passwords_sync, which is then used in `services-firefox-accounts` and in `service-sync-logins` to emit password sync telemetry whenever this engine is synchronized. New ping is a verbatim copy of the history_sync ping.
`samples-sync` app was also updated to demonstrate how to synchronize passwords, and how to protect the encryption key at rest used for the password storage.
5360: Closes #5315 - Create a Top Sites storage component r=pocmo a=gabrielluong
Co-authored-by: Grisha Kruglov <gkruglov@mozilla.com>
Co-authored-by: Gabriel Luong <gabriel.luong@gmail.com>
Build succeeded
No problem! I'm glad we're all aligned now :-)
This adds a new ping to `support-telemetry-sync` - passwords_sync, which is then used in `services-firefox-accounts` and in `service-sync-logins` to emit password sync telemetry whenever this engine is synchronized. New ping is a verbatim copy of the history_sync ping.
`samples-sync` app was also updated to demonstrate how to synchronize passwords, and how to protect the encryption key at rest used for the password storage.
Pull Request checklist
After merge