Feedback from Mozilla #158

annevk · 2019-05-14T14:35:19Z

We'd like to support reporting beyond CSP's report-uri in Firefox, but are uncomfortable with the design of the Reporting API as-is. Our proposal is that for v1 of the feature its functionality is strictly equivalent to that of CSP's report-uri. More specifically:

Remove the storage functionality and tie reporting policies to documents and workers directly. The lifetime of any reporting URL will then match that of the document or worker, equivalent to CSP. This avoids introducing new tracking vectors and adding significant complexity to v1 of this feature. Our suggestion is that Origin Policy/Manifest is flushed out further for any and all origin-wide policies, provided the tracking issues can be solved.
Remove the failover functionality. This isn't needed for a v1 of the API and goes beyond what we offer for equivalent APIs elsewhere.
Switch from JSON to Structured Headers for HTTP while we are making breaking changes. There appears to be quite broad support for making Structured Headers a thing, whereas JSON headers do not have that support.

This would enable Firefox to ship this functionality and start generating reports for various existing and new features (e.g., we've gotten feedback that reporting for Cross-Origin-Opener-Policy is rather essential).

igrigorik · 2019-05-14T15:17:34Z

That's great news, glad you guys are investigating adding support! Would love to work together to address any questions or issues.

To start, a bit of context.. The intent and goal of Reporting is to provide shared infrastructure to different report generators. As of today, that already includes Network Error Logging, Feature Policy, CSP3, Web Package, and there is probably a few more that I'm forgetting.

With that in mind, functionality such as ability to register a report policy is a must have requirement for NEL and ditto for failover: these requirements were informed and motivated by long history of production experience with Chrome's "domain reliability" precursor. Further, I'll note that there are currently some features defined by NEL (e.g. sampling rate) that we're hearing developers asking us to move to Reporting for use cases like CSP as well — "we don't need every single violation report".

Which is to say, based on already existing integrations, I don't think we can simply drop all of that functionality on the floor. That said, perhaps we can define a path where Firefox is able to incrementally adopt a subset of said functionality? E.g. as a start, enable CSP reporting via Reporting.

annevk · 2019-05-14T15:41:39Z

From those, it seems that NEL has very domain-specific requirements. It also seems it has some kind of same-site requirement that the others do not share if I'm reading things correctly. It's not immediately clear to me how a shared abstraction makes sense if document policies work for the others?

And yeah, part of trying to move this along is unblocking whatwg/html#4622 and our Feature Policy work (though given the scope of recently discussed changes there the exact integration with the Reporting API might well change) which could both use this.

johnwilander · 2019-06-14T19:47:20Z

We (Apple) would like to express our support for removing the Storage part. For such a functionality to be feasible in browsers with modern privacy protections, partitioning of the storage would have to be integrated which might defeat its intended purpose.

dcreager · 2019-06-26T15:24:29Z

I'd be in favor of letting Origin Policy take over the responsibility of maintaining the Reporting and NEL policies for an origin, using caching headers on the Origin Policy to take the place of our max_age fields. Though until it's ready, I'd want there to still be some language that allows Chrome's current behavior.

In the meantime, I think the current spec includes enough flexibility to let Mozilla and Apple implement something less persistent. For instance, the Garbage collection section currently allows UAs to drop endpoints and reports after deadlines that can be arbitrarily small. Chrome's current implementation keeps its cache purely in memory, which means that Reporting and NEL policies don't persist across browser restarts. (Though there's ongoing work to change that.)

Would you all find it acceptable to add clarifying language that (a) a user agent is free to drop an endpoint or report from the cache at any time (making it more explicitly okay to implement "not really caches"), and (b) we want to eventually migrate the storage responsibilities over to Origin Policy or something similar?

(Note that there would still need to be some kind of cache for the reports themselves, since those are batched up and sent at some point in the future, and not as a blocking or immediate part of servicing the request that generates the reports.)

clelland · 2019-07-05T18:13:19Z

I agree that in practice there's a need for an out-of-document cache of some sort, especially to enable things like NEL and crash reporting. The spec doesn't need to enforce that though.

I think that framing the reporting mechanism itself as best-effort, non-guaranteed delivery helps here. If agents are permitted to drop reports for any reason -- including not sending reports that may impact user privacy -- then implementers should have leeway to experiment in that space.

Are the concerns around privacy any broader than what has been spelled out in the spec? (Specifically in 11.1 and 11.5)

Could we call out in a spec note that the requirements on the reporting cache specifically allow for a per-document or per-origin cache, that it may persist only in RAM, or may be destroyed along with the document, and that its only purpose is to enable batching of reports as the user agent sees fit, and to increase reliability by allowing failed deliveries to be retried?

annevk · 2019-07-15T13:39:59Z

I think it's problematic if we have a shared feature that works wildly different across user agents. Given the existing market dynamics that will result in a set of bug reports against Firefox and Safari, basically (at best).

Also, when I look at the complete feature set and what we'd like the difference is quite stark. The way I view it is that a document should be able to declare zero or more "report endpoints", each with a "name" and "url".

I'd much rather support this via a new header that can use the structured value syntax (as likely recommended by the HTTP WG) than something custom.

igrigorik · 2019-07-24T03:54:40Z

@annevk do you have any suggestions for how we could accomplish what you've described while still enabling NEL + crash reporting to exist, without having to rebuild their own version of Reporting?

annevk · 2019-08-08T15:10:57Z

I'm not sure why crashes could not be per-document. We have been thinking about OOM reporting at least and we don't think we need any kind of origin-wide store for that. We can use the header that came with the response that created the agent cluster that OOM'd.

For NEL I'm not sure as it requires addressing the tracking issue somehow (perhaps restrict to top-level + cookie lifetime), but I don't see why having separate registration syntax is prohibitive. It can reuse all of the other infrastructure as far as I can tell.

igrigorik · 2019-08-14T00:04:28Z

Re, crashes: yeah, I think you're right. I'm trying to remember why I originally included that in the set, but the reason escapes me at the moment.

For NEL I'm not sure as it requires addressing the tracking issue somehow (perhaps restrict to top-level + cookie lifetime).

Hmm, interesting.

Re, cooking lifetime: there might be an interesting parallel here to where we landed for CH based on discussions at IETF 109 (see httpwg/http-extensions#878), where we opted to drop explicit lifetime in favor of implicit opt-in registration and persistence. Perhaps there is a similar (same?) argument to be made here? /cc @yoavweiss @martinthomson

Re, top-level: can you elaborate on this one a bit more?

I don't see why having separate registration syntax is prohibitive. It can reuse all of the other infrastructure as far as I can tell.

Well, the point of Reporting was to eliminate the need for different registration syntax between features, so there's that. However, I think the larger issue here is related to failover, retries, etc. In your original comment you noted that it's not necessary for v1 of API, but that is functionality that (at least when NEL was designed and as informed by "domain reliability" implementation experience in Chrome) is necessary for NEL.. I think that's the one we need to unpack and come to consensus on.

martinthomson · 2019-08-14T01:00:36Z

I've personally come to the conclusion that lifetimes on persistence no longer need to be signaled by sites. Any expiration that the site needs to maintain can be opaque to the browser. Where there are space pressures (like cookies), the browser needs discretion on how to drop state, and for that expiration times don't really help.

annevk · 2019-08-21T14:27:18Z

To be clear, I don't really know how to solve NEL, it seems about as hard as Origin Policy and we haven't landed on anything concrete there thus far.

I'd love to unblock generalized per-document reporting (i.e., beyond CSP) and the general reporting format though as there are many features that could make use of that in a privacy-preserving manner.

clelland · 2019-08-21T14:51:39Z

I certainly don't see a good way to accomplish the goals of NEL without going to something like Origin Policy (which would at least make all of the potential problems explicit ones)

If we were to use this spec to define a common infrastructure for more ephemeral reporting, but leave the algorithms open for use by NEL, would that unblock you?

(I'd like to keep the option open for batching and retrying out-of-band report delivery, but that's a different issue than the tracking concerns around NEL)

annevk · 2019-08-21T15:25:27Z

Well, not really, see #158 (comment). But also, the nature of batching and out-of-band delivery can definitely have privacy implications that CSP's model does not have.

dcreager · 2019-08-21T15:32:59Z

But also, the nature of batching and out-of-band delivery can definitely have privacy implications that CSP's model does not have.

Can you elaborate on this part? The report batching respects origin boundaries — each report upload will only ever contain reports about a single origin. (It has to, to be able to participate in the CORS preflight check.) And Chrome's implementation currently throws away all undelivered reports whenever the network settings change, for instance, so that the delay between report generation and delivery doesn't leak any information. There's a note about that here, but we could strengthen that language to make it a requirement instead of a recommendation.

annevk · 2019-08-21T15:35:38Z

If you batch A nested in top-level B with top-level A you have some kind of tracking vector. If you deliver them sometime after the users closes all As you tell A something about when the user is using the browser.

clelland · 2019-08-28T19:51:34Z

Can we at least batch per-document? If a single page view triggers several deprecation reports, for instance, it seems only positive for the browser to wait a minute, and then deliver them in a single HTTP request (gaining advantages from compression and fewer TCP round-trips) rather than delivering each in a single request as they occur.

Slightly better, and hopefully still unproblematic would be batching reports from the same origin within an agent cluster -- if they can script each other, or even postMessage, then they can already infer any information about the user that a batch report delivery could provide.

annevk · 2019-08-29T09:22:40Z

I think that's probably fine, with the latest possible delivery time matching Fetch's keepalive.

clelland · 2019-09-18T00:31:43Z

As a concrete proposal for moving forward on this, we discussed the following steps at TPAC
(see rough slides here):

Remove persistent storage of reporting configuration from this spec
Restrict batching to just reports from same-origin documents within the same agent cluster
Remove max_age and include_subdomains from Report-To configuration
Use a different configuration for Network Error Logging (probably Origin Policy, but that's an issue for NEL, not Reporting)
Attempt to switch to SH (see Switch reporting to Structured Headers #177)

jpchase · 2019-09-19T17:55:51Z

As a concrete proposal for moving forward on this, we discussed the following steps at TPAC
(see rough slides here):

Remove persistent storage of reporting configuration from this spec

Restrict batching to just reports from same-origin documents within the same agent cluster

Remove max_age and include_subdomains from Report-To configuration

Use a different configuration for Network Error Logging (probably Origin Policy, but that's an issue for NEL, not Reporting)

I thought the suggestion was to pull out persistent storage of configuration into a separate doc for incubation, and later decide whether to merge back into Reporting (from the TPAC notes). This sounds like it's making persistent storage a thing only for the NEL spec?

clelland · 2019-09-20T02:53:47Z

No, this is definitely not making persistence NEL-only. The building blocks for persistent reporting are likely to go into a couple of different places:

Reporting cache (not associated with any document), max_age, and any other delivery considerations for out-of-document reports, such as more advanced batching and failover, privacy consideration for persistence, etc. will all move into a separate document in this repo (as you describe)
The concept of persistent origin-wide configuration (separate from reporting specifically) is the domain of origin policy.
NEL's configuration in particular will be described in NEL, using concepts from Reporting and from Origin Policy.
Other persistent reporting specs may similarly want to describe their own configuration mechanisms in their own spec, using the same building blocks that NEL uses.

This change removes the concepts which are only relevant to persistent out-of-band reporting specs, such as Network Error Logging, and explicitly defines reporting mechansims for documents, where the generated reports are queued with the document itself, and thereby inherit its lifetime. Generic reporting concepts are extracted into another section so that they can be used by other specs. Addresses issues raised in #158.

clelland · 2020-01-10T18:22:24Z

I've split the reporting spec into two parts (#191); take a look at the pull request and feel free to comment there (or here) about the structure. Roughly, Reporting covers the core concepts, ReportingObserver, and what I've labeled "document-centered reporting", which was the original request in #158 (comment).

The other spec is currently "Network Reporting", and extends Reporting by adding back the reporting cache, and defining different cache and delivery algorithms. To keep them separate, configuration for these reports is handled exclusively through Origin Policy.

annevk · 2020-01-13T09:52:58Z

From #158 (comment) I see 1 and 3 in that PR (and I guess 4, did not look at that document). I don't really see batching described or 5. And wasn't failover also going to be an extension of sorts?

clelland · 2020-01-13T15:17:48Z

5 was on a separate branch; I've published that as https://w3c.github.io/reporting/structured-headers/ (based on the branch at https://github.com/w3c/reporting/tree/structured-headers). I should probably merge those together, but didn't want to conflate all of the issues.

You're right about failover -- I didn't pull that out yet. I'll work on that today, to see what the end result looks like.

Restricted batching, I believe, is implied in this version by the fact that reports are owned by documents, and not pooled together across documents.

annevk · 2020-01-13T16:01:50Z

Ah okay, but I thought we had agreed that a limited form of batching was agreeable? But yeah, maybe it's best to build that on top of this as defining that might be somewhat involved.

clelland · 2020-01-13T16:45:14Z

I raised the question of removing failover / load balancing completely in #196; the most extreme version of that would seem to remove the need for everything other than a single URL to define an endpoint, but I want to get some feedback on whether that's going to far for existing use cases.

I'd still like to be able to batch across same-origin documents, in the same agent cluster, for sure. It seems like a net efficiency win, without any real issues. Does that feel like the sort of thing that can be added on top?

annevk · 2020-01-13T16:55:25Z

Same origin within an agent cluster, though maybe even same origin + same top-level origin within a user agent.

clelland · 2020-01-14T15:30:22Z

Double-keying like that could work as well

* Remove persistent non-document reporting from spec. This change removes the concepts which are only relevant to persistent out-of-band reporting specs, such as Network Error Logging, and explicitly defines reporting mechanisms for documents, where the generated reports are queued with the document itself, and thereby inherit its lifetime. Generic reporting concepts are extracted into another section so that they can be used by other specs. This removes endpoint weight and priority from the basic reporting spec, and creates a 'network reporting endpoint' concept for network reporting. Endpoint groups are similarly moved to network reporting, as there is no longer any need to group endpoints otherwise. Retries and failover are removed from basic reporting. Addresses issues raised in #158.

clelland · 2020-03-04T20:55:12Z

Structured headers is now in the main spec text as well, as of 7da9f80.

I think that addresses all of the concerns raised in this issue. I'll look at adding some text re: batching within an agent cluster in a separate issue.

annevk · 2020-03-06T12:41:44Z

Thanks for all your work here @clelland! Much appreciated! And indeed, let's close this and use more specific issues for further work.

annevk mentioned this issue May 14, 2019

Reporting API mozilla/standards-positions#104

Closed

pes10k mentioned this issue Jul 18, 2019

Privacy concerns regarding Reporting API #169

Closed

clelland mentioned this issue Sep 15, 2019

Switch reporting to Structured Headers #177

Closed

clelland mentioned this issue Dec 18, 2019

Remove persistent non-document reporting from spec. #191

Merged

annevk closed this as completed Mar 6, 2020

samuelweiler mentioned this issue Dec 3, 2020

WebPerf charter review w3c/strategy#237

Closed

6 tasks

clelland mentioned this issue Dec 8, 2020

Reporting w3ctag/design-reviews#585

Closed

1 task

clelland mentioned this issue Feb 12, 2021

Batch reports from related documents? #226

Closed

clelland mentioned this issue Dec 4, 2023

Distribute NEL policies with DNS w3c/network-error-logging#173

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feedback from Mozilla #158

Feedback from Mozilla #158

annevk commented May 14, 2019

igrigorik commented May 14, 2019

annevk commented May 14, 2019

johnwilander commented Jun 14, 2019

dcreager commented Jun 26, 2019 •

edited

Loading

clelland commented Jul 5, 2019

annevk commented Jul 15, 2019

igrigorik commented Jul 24, 2019

annevk commented Aug 8, 2019

igrigorik commented Aug 14, 2019

martinthomson commented Aug 14, 2019

annevk commented Aug 21, 2019 •

edited

Loading

clelland commented Aug 21, 2019

annevk commented Aug 21, 2019

dcreager commented Aug 21, 2019

annevk commented Aug 21, 2019

clelland commented Aug 28, 2019

annevk commented Aug 29, 2019

clelland commented Sep 18, 2019

jpchase commented Sep 19, 2019 •

edited

Loading

clelland commented Sep 20, 2019

clelland commented Jan 10, 2020

annevk commented Jan 13, 2020

clelland commented Jan 13, 2020

annevk commented Jan 13, 2020

clelland commented Jan 13, 2020

annevk commented Jan 13, 2020

clelland commented Jan 14, 2020

clelland commented Mar 4, 2020

annevk commented Mar 6, 2020

Feedback from Mozilla #158

Feedback from Mozilla #158

Comments

annevk commented May 14, 2019

igrigorik commented May 14, 2019

annevk commented May 14, 2019

johnwilander commented Jun 14, 2019

dcreager commented Jun 26, 2019 • edited Loading

clelland commented Jul 5, 2019

annevk commented Jul 15, 2019

igrigorik commented Jul 24, 2019

annevk commented Aug 8, 2019

igrigorik commented Aug 14, 2019

martinthomson commented Aug 14, 2019

annevk commented Aug 21, 2019 • edited Loading

clelland commented Aug 21, 2019

annevk commented Aug 21, 2019

dcreager commented Aug 21, 2019

annevk commented Aug 21, 2019

clelland commented Aug 28, 2019

annevk commented Aug 29, 2019

clelland commented Sep 18, 2019

jpchase commented Sep 19, 2019 • edited Loading

clelland commented Sep 20, 2019

clelland commented Jan 10, 2020

annevk commented Jan 13, 2020

clelland commented Jan 13, 2020

annevk commented Jan 13, 2020

clelland commented Jan 13, 2020

annevk commented Jan 13, 2020

clelland commented Jan 14, 2020

clelland commented Mar 4, 2020

annevk commented Mar 6, 2020

dcreager commented Jun 26, 2019 •

edited

Loading

annevk commented Aug 21, 2019 •

edited

Loading

jpchase commented Sep 19, 2019 •

edited

Loading