Scaling issues with non-automated verification process #48

Open
krgovind opened this issue Aug 10, 2021 · 2 comments
Comments

@krgovind
Collaborator

Opening this issue to capture feedback from TAG member, @rhiaro:


Great to see expansion on this, but I'm still concerned about how this would work in practice. I acknowledge that it's (rightfully) still very open and subject to evolution. But I'm imagining either:

* there is one list that all UAs have come to agreement on policy on, and they all pull from the same list. I see the practicality of this, but it's dependent on UAs agreeing policy (not a guarantee) and for something with such a broad scope (any site on the web can play) feels far too much centralisation.
* sites have to submit their set for approval to *every* UA, which doesn't seem realistic. Even with just mainstream UAs right now, assuming they all implement FPS, I can see site admins submitting requests to MS, Apple, Mozilla and Google, but not bothering with Opera, Samsung, let alone Vivaldi, Brave, Tor... And what about mobile browsers? Whether or not requests need to be submitted to eg. Firefox mobile separately from Firefox desktop may not be obvious to site admins.
* the middle ground: some UAs agree policy and share a list, other UAs might fork the list, others might use the list but take liberties with how they implement it (eg. customise it). Besides fragmentation issues, this becomes confusing for site owners with regards to knowing where to submit their sets for verification.

Basically I'm just struggling to imagine how the verification process to determine whether sites are legitimately in the same set can reasonably scale to the whole web with this level of centralisation, and without [automation](https://github.com/privacycg/first-party-sets/issues/43).

_Originally posted by @rhiaro in https://github.com/privacycg/first-party-sets/pull/45#discussion_r656032925_
@krgovind
Collaborator Author

krgovind commented Oct 1, 2021

Hi @rhiaro - I'd like to bring your attention to this expanded proposal for the UA policy, which we published since your feedback on PR #45; I hope it answers your questions.

  • there is one list that all UAs have come to agreement on policy on, and they all pull from the same list. I see the practicality of this, but it's dependent on UAs agreeing policy (not a guarantee) and for something with such a broad scope (any site on the web can play) feels far too much centralisation.

Our proposal is indeed that all UAs come to agreement on the policy and pull from the same list (see Responsibilities of the User Agent for more). The policy is inspired by prior art in the ecosystem, such as the definition of "party" in the DNT specification, which was published as a W3C Candidate Recommendation in 2016; so we are hopeful that we can at least come to rough consensus on the principles, and then work through smaller, more specific objections as they come up.

You also pointed out concerns about centralization and scalability.

Regarding the centralization concern, I was curious whether you think the challenges here are substantially different from those of the lists used elsewhere in the web platform, such as the Public Suffix List or the HSTS Preload list? Or perhaps the feedback is based on issues you've seen with those lists? The "Responsibilities of the User Agent" section that I cited above intends to address some of the existing challenges with those lists.

Regarding the scalability concern, we hope that our expanded UA policy proposal alleviates this. Specifically, if you look at the section on the responsibilities of the enforcement entity, we are proposing that the entity doesn't have to check every single request manually, but only performs random spot checks. Technical consistency checks will indeed be performed on each request, however; and to the extent possible, we would like to automate as many checks as possible.

If I understand correctly, I believe that the TAG's preference is for a technical-only mechanism. In fact, that is exactly where we started with the original version of this proposal. However, we eventually introduced the UA policy to address concerns around the potential for abuse, which other browsers pointed out to us in #6 and #7. A technical-only mechanism was also contrasted with the analogous Disconnect Entities list that is currently used by Firefox and Edge in their default Tracking Protection modes. This list is curated based on a policy that is documented here, and you will see some similarities in how tracking is defined (emphasis mine): "the collection of data regarding a particular user's or device's activity across multiple websites or applications that aren't owned by the data collector, and the retention, use or sharing of that data."

We hope to strike a balance between scalability and abuse-resistance by having acceptances primarily based on self-attestations and technical checks, along with supplemental accountability measures such as a publicly auditable log, random spot checks, and a mechanism for users and civil society to report potentially invalid or policy-violating sets. We think that the public self-attestations will play an important role in deterring abuse because, as footnote #1 in this section points out, "[Public] Misrepresentations about an entity's ownership/control of a site that lead to the collection of user data outside of the First Party Sets policy would be enforceable in the same way that misrepresentations or misleading statements in privacy policies are."

dmarti added a commit to dmarti/first-party-sets that referenced this issue Jan 13, 2022
Add IEE role in surveys of users to check that they understand
common identity.

(It would be impractical to leave this to the browser and site
author, especially in cases where the browser and site author
have a business relationship that would be influenced by FPS
validity or invalidity.)

Refs WICG#43 WICG#48 WICG#64 WICG#76
@krgovind
Collaborator Author

@rhiaro - I think my previous answer is due for an update. We significantly updated this proposal based on ecosystem feedback (summary in #92).

  • Regarding the concern about whether different UAs can come to an agreement on sharing the same set: note that the update now requires the use of either requestStorageAccess or requestStorageAccessForOrigin.
    • The former is already shipping in a few major browsers, and relies on user-agent-defined heuristics/allowlists/other logic to determine how the browser mediates this request for access to unpartitioned cookies. We propose that the FPS list be incorporated into this user-agent-defined logic. We anticipate user agents that don't support FPS will fall back on their own set of heuristics/allowlists.
    • requestStorageAccessForOrigin allows for similar user-agent-defined logic, and we are currently incubating that proposal within the PrivacyCG, where we hope to align with non-FPS-supporting browsers on how to avoid prompt spam where First-Party Sets are not used (privacycg/requestStorageAccessFor#2).
    • Note that we intend to make the list available for use by any user agents that intend to implement FPS; so user agents may also choose to use FPS to augment their user-agent-defined logic, such as improving user prompts by displaying FPS-declared information.
  • Regarding scaling to the web, we have further eliminated the need for spot checks by completely automating the process of validating sets. These detailed Submission Guidelines explain the process of submitting a set. We rely on a combination of technical checks (e.g., verifying administrative control over sites by looking for .well-known files, checking that a given list of sites are ccTLD variants of each other by comparing against the known list of ccTLDs on the Public Suffix List, placing a numeric limit of 3 domains on the associated subset, etc.).
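
The automated technical checks described in the bullet above could be sketched roughly as follows. This is a minimal, hypothetical illustration, not the actual submission-validation code: the constant names, the tiny stand-in ccTLD set, and the limit of 3 associated domains are assumptions drawn only from the description above, and a real validator would additionally fetch each site's .well-known resource over HTTPS to confirm administrative control.

```python
# Hypothetical sketch of automated set-validation checks (not the real
# Submission Guidelines code). KNOWN_CCTLDS stands in for the full ccTLD
# list derived from the Public Suffix List.

MAX_ASSOCIATED_DOMAINS = 3          # numeric limit on the associated subset
KNOWN_CCTLDS = {"de", "fr", "jp", "uk"}  # illustrative subset only


def is_cctld_variant(domain_a: str, domain_b: str) -> bool:
    """True if the domains share the same leading label(s) and differ
    only in their TLD, where at least one TLD is a known ccTLD."""
    label_a, _, tld_a = domain_a.rpartition(".")
    label_b, _, tld_b = domain_b.rpartition(".")
    if label_a != label_b:
        return False
    return tld_a in KNOWN_CCTLDS or tld_b in KNOWN_CCTLDS


def validate_set(primary: str, associated: list[str]) -> list[str]:
    """Run the purely technical checks and return a list of errors.
    (A real validator would also verify .well-known files here.)"""
    errors = []
    if len(associated) > MAX_ASSOCIATED_DOMAINS:
        errors.append(
            f"associated subset exceeds limit of {MAX_ASSOCIATED_DOMAINS}"
        )
    return errors
```

For example, `is_cctld_variant("example.com", "example.de")` would pass, while a set declaring four associated domains would be rejected by the numeric limit without any human review.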

Please let us know if you see any unresolved issues with the latest version of the proposal.
