-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feedback from Mozilla #158
Comments
That's great news, glad you guys are investigating adding support! Would love to work together to address any questions or issues. To start, a bit of context.. The intent and goal of Reporting is to provide shared infrastructure to different report generators. As of today, that already includes Network Error Logging, Feature Policy, CSP3, Web Package, and there is probably a few more that I'm forgetting. With that in mind, functionality such as ability to register a report policy is a must have requirement for NEL and ditto for failover: these requirements were informed and motivated by long history of production experience with Chrome's "domain reliability" precursor. Further, I'll note that there are currently some features defined by NEL (e.g. sampling rate) that we're hearing developers asking us to move to Reporting for use cases like CSP as well — "we don't need every single violation report". Which is to say, based on already existing integrations, I don't think we can simply drop all of that functionality on the floor. That said, perhaps we can define a path where Firefox is able to incrementally adopt a subset of said functionality? E.g. as a start, enable CSP reporting via Reporting. |
From those, it seems that NEL has very domain-specific requirements. It also seems it has some kind of same-site requirement that the others do not share if I'm reading things correctly. It's not immediately clear to me how a shared abstraction makes sense if document policies work for the others? And yeah, part of trying to move this along is unblocking whatwg/html#4622 and our Feature Policy work (though given the scope of recently discussed changes there the exact integration with the Reporting API might well change) which could both use this. |
We (Apple) would like to express our support for removing the Storage part. For such a functionality to be feasible in browsers with modern privacy protections, partitioning of the storage would have to be integrated which might defeat its intended purpose. |
I'd be in favor of letting Origin Policy take over the responsibility of maintaining the Reporting and NEL policies for an origin, using caching headers on the Origin Policy to take the place of our In the meantime, I think the current spec includes enough flexibility to let Mozilla and Apple implement something less persistent. For instance, the Garbage collection section currently allows UAs to drop endpoints and reports after deadlines that can be arbitrarily small. Chrome's current implementation keeps its cache purely in memory, which means that Reporting and NEL policies don't persist across browser restarts. (Though there's ongoing work to change that.) Would you all find it acceptable to add clarifying language that (a) a user agent is free to drop an endpoint or report from the cache at any time (making it more explicitly okay to implement "not really caches"), and (b) we want to eventually migrate the storage responsibilities over to Origin Policy or something similar? (Note that there would still need to be some kind of cache for the reports themselves, since those are batched up and sent at some point in the future, and not as a blocking or immediate part of servicing the request that generates the reports.) |
I agree that in practice there's a need for an out-of-document cache of some sort, especially to enable things like NEL and crash reporting. The spec doesn't need to enforce that though. I think that framing the reporting mechanism itself as best-effort, non-guaranteed delivery helps here. If agents are permitted to drop reports for any reason -- including not sending reports that may impact user privacy -- then implementers should have leeway to experiment in that space. Are the concerns around privacy any broader than what has been spelled out in the spec? (Specifically in 11.1 and 11.5) Could we call out in a spec note that the requirements on the reporting cache specifically allow for a per-document or per-origin cache, that it may persist only in RAM, or may be destroyed along with the document, and that its only purpose is to enable batching of reports as the user agent sees fit, and to increase reliability by allowing failed deliveries to be retried? |
I think it's problematic if we have a shared feature that works wildly different across user agents. Given the existing market dynamics that will result in a set of bug reports against Firefox and Safari, basically (at best). Also, when I look at the complete feature set and what we'd like the difference is quite stark. The way I view it is that a document should be able to declare zero or more "report endpoints", each with a "name" and "url". I'd much rather support this via a new header that can use the structured value syntax (as likely recommended by the HTTP WG) than something custom. |
@annevk do you have any suggestions for how we could accomplish what you've described while still enabling NEL + crash reporting to exist, without having to rebuild their own version of Reporting? |
I'm not sure why crashes could not be per-document. We have been thinking about OOM reporting at least and we don't think we need any kind of origin-wide store for that. We can use the header that came with the response that created the agent cluster that OOM'd. For NEL I'm not sure as it requires addressing the tracking issue somehow (perhaps restrict to top-level + cookie lifetime), but I don't see why having separate registration syntax is prohibitive. It can reuse all of the other infrastructure as far as I can tell. |
Re, crashes: yeah, I think you're right. I'm trying to remember why I originally included that in the set, but the reason escapes me at the moment.
Hmm, interesting. Re, cooking lifetime: there might be an interesting parallel here to where we landed for CH based on discussions at IETF 109 (see httpwg/http-extensions#878), where we opted to drop explicit lifetime in favor of implicit opt-in registration and persistence. Perhaps there is a similar (same?) argument to be made here? /cc @yoavweiss @martinthomson Re, top-level: can you elaborate on this one a bit more?
Well, the point of Reporting was to eliminate the need for different registration syntax between features, so there's that. However, I think the larger issue here is related to failover, retries, etc. In your original comment you noted that it's not necessary for v1 of API, but that is functionality that (at least when NEL was designed and as informed by "domain reliability" implementation experience in Chrome) is necessary for NEL.. I think that's the one we need to unpack and come to consensus on. |
I've personally come to the conclusion that lifetimes on persistence no longer need to be signaled by sites. Any expiration that the site needs to maintain can be opaque to the browser. Where there are space pressures (like cookies), the browser needs discretion on how to drop state, and for that expiration times don't really help. |
To be clear, I don't really know how to solve NEL, it seems about as hard as Origin Policy and we haven't landed on anything concrete there thus far. I'd love to unblock generalized per-document reporting (i.e., beyond CSP) and the general reporting format though as there are many features that could make use of that in a privacy-preserving manner. |
I certainly don't see a good way to accomplish the goals of NEL without going to something like Origin Policy (which would at least make all of the potential problems explicit ones) If we were to use this spec to define a common infrastructure for more ephemeral reporting, but leave the algorithms open for use by NEL, would that unblock you? (I'd like to keep the option open for batching and retrying out-of-band report delivery, but that's a different issue than the tracking concerns around NEL) |
Well, not really, see #158 (comment). But also, the nature of batching and out-of-band delivery can definitely have privacy implications that CSP's model does not have. |
Can you elaborate on this part? The report batching respects origin boundaries — each report upload will only ever contain reports about a single origin. (It has to, to be able to participate in the CORS preflight check.) And Chrome's implementation currently throws away all undelivered reports whenever the network settings change, for instance, so that the delay between report generation and delivery doesn't leak any information. There's a note about that here, but we could strengthen that language to make it a requirement instead of a recommendation. |
If you batch A nested in top-level B with top-level A you have some kind of tracking vector. If you deliver them sometime after the users closes all As you tell A something about when the user is using the browser. |
Can we at least batch per-document? If a single page view triggers several deprecation reports, for instance, it seems only positive for the browser to wait a minute, and then deliver them in a single HTTP request (gaining advantages from compression and fewer TCP round-trips) rather than delivering each in a single request as they occur. Slightly better, and hopefully still unproblematic would be batching reports from the same origin within an agent cluster -- if they can script each other, or even postMessage, then they can already infer any information about the user that a batch report delivery could provide. |
I think that's probably fine, with the latest possible delivery time matching Fetch's keepalive. |
As a concrete proposal for moving forward on this, we discussed the following steps at TPAC
|
I thought the suggestion was to pull out persistent storage of configuration into a separate doc for incubation, and later decide whether to merge back into Reporting (from the TPAC notes). This sounds like it's making persistent storage a thing only for the NEL spec? |
No, this is definitely not making persistence NEL-only. The building blocks for persistent reporting are likely to go into a couple of different places:
|
This change removes the concepts which are only relevant to persistent out-of-band reporting specs, such as Network Error Logging, and explicitly defines reporting mechansims for documents, where the generated reports are queued with the document itself, and thereby inherit its lifetime. Generic reporting concepts are extracted into another section so that they can be used by other specs. Addresses issues raised in #158.
I've split the reporting spec into two parts (#191); take a look at the pull request and feel free to comment there (or here) about the structure. Roughly, Reporting covers the core concepts, ReportingObserver, and what I've labeled "document-centered reporting", which was the original request in #158 (comment). The other spec is currently "Network Reporting", and extends Reporting by adding back the reporting cache, and defining different cache and delivery algorithms. To keep them separate, configuration for these reports is handled exclusively through Origin Policy. |
From #158 (comment) I see 1 and 3 in that PR (and I guess 4, did not look at that document). I don't really see batching described or 5. And wasn't failover also going to be an extension of sorts? |
5 was on a separate branch; I've published that as https://w3c.github.io/reporting/structured-headers/ (based on the branch at https://github.com/w3c/reporting/tree/structured-headers). I should probably merge those together, but didn't want to conflate all of the issues. You're right about failover -- I didn't pull that out yet. I'll work on that today, to see what the end result looks like. Restricted batching, I believe, is implied in this version by the fact that reports are owned by documents, and not pooled together across documents. |
Ah okay, but I thought we had agreed that a limited form of batching was agreeable? But yeah, maybe it's best to build that on top of this as defining that might be somewhat involved. |
I raised the question of removing failover / load balancing completely in #196; the most extreme version of that would seem to remove the need for everything other than a single URL to define an endpoint, but I want to get some feedback on whether that's going to far for existing use cases. I'd still like to be able to batch across same-origin documents, in the same agent cluster, for sure. It seems like a net efficiency win, without any real issues. Does that feel like the sort of thing that can be added on top? |
Same origin within an agent cluster, though maybe even same origin + same top-level origin within a user agent. |
Double-keying like that could work as well |
* Remove persistent non-document reporting from spec. This change removes the concepts which are only relevant to persistent out-of-band reporting specs, such as Network Error Logging, and explicitly defines reporting mechanisms for documents, where the generated reports are queued with the document itself, and thereby inherit its lifetime. Generic reporting concepts are extracted into another section so that they can be used by other specs. This removes endpoint weight and priority from the basic reporting spec, and creates a 'network reporting endpoint' concept for network reporting. Endpoint groups are similarly moved to network reporting, as there is no longer any need to group endpoints otherwise. Retries and failover are removed from basic reporting. Addresses issues raised in #158.
Structured headers is now in the main spec text as well, as of 7da9f80. I think that addresses all of the concerns raised in this issue. I'll look at adding some text re: batching within an agent cluster in a separate issue. |
Thanks for all your work here @clelland! Much appreciated! And indeed, let's close this and use more specific issues for further work. |
We'd like to support reporting beyond CSP's report-uri in Firefox, but are uncomfortable with the design of the Reporting API as-is. Our proposal is that for v1 of the feature its functionality is strictly equivalent to that of CSP's report-uri. More specifically:
This would enable Firefox to ship this functionality and start generating reports for various existing and new features (e.g., we've gotten feedback that reporting for Cross-Origin-Opener-Policy is rather essential).
The text was updated successfully, but these errors were encountered: