-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rate limiting #45
Comments
I'd be tentatively in favor of this, but I'd prefer a float (e.g. 0.0 is no
reports, 1.0 is all reports, 0.001 is 1 in 1000 reports).
…On Fri, Aug 4, 2017, 10:14 Dan ***@***.***> wrote:
Endpoint should be able to set rate limiting for reporting, e.g. by
setting a percentage 0-100. This would reduce number of duplicated reports
and would reduce a risk of DDoSing an endpoint when users of high traffic
site start sending reports.
My proposal:
Before sending a report browser will draw a random number from 1 to 100 if
the number is lower or equal than rate limit setting (0-100) it'll send a
report.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<https://github.com/WICG/reporting/issues/45>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAYojyMeALFJ_MbryncNDR4Im56Q5qPJks5sUycbgaJpZM4OtxSS>
.
|
@mikewest what is the upper end for number of reports that are triggered in CSP deployments youve seen? A lot of the conversation @RByers and I had around filtering and limiting are based on the idea that we are expecting low double digit number of reports. As a result, thinking about rate limiting may be premature. At least for now. Since deprecation, intervention, and crash reports are unlikely to trigger more than a handful on a page, the only thing I am worried about here is CSP reporting - thoughts? |
@ScottHelme or @arturjanc's team may have public numbers? I vaguely recall complaints inside Google when folks were first starting to roll out CSP that some apps were seeing reports in the same order of magnitude as their page views and had to start sampling manually (by leaving off I think this seems like a reasonable thing to add in general, but I don't feel like it needs to be part of a v1. |
+1. creating new |
As the recipient of billions of reports per month on https://report-uri.com I'm always interested in ways users can control the flow rate of reports but I have my reservations about a mechanism built into the report API like this. As @mikewest mentioned the best way to currently do this, and the way I recommend, is to inject the The 'Backoff' header was one way I'd have liked the reporting endpoint to be able to control this. It could be returned alongside a 429 for dynamic control of reporting volumes but isn't supported. Another way to do this would be to catch the Overall I'm not opposed to the idea suggested here or to the overall idea of a mechanism to rate limit reports, but unless it ends up widely supported by all browser it won't be very effective. That's why omitting the |
NEL includes a sampling rate like is discussed here (as of this patch). Two sampling rates, in fact, so that you can separately limit the reports about failures and successes. This mechanism has worked really well in NEL's predecessor, which is why we wanted to make sure it was in the NEL spec from the beginning, and a required part of a conforming implementation. In the WebPerf WG call today, someone asked whether NEL's sampling rate should be moved into the Reporting API, so other Reporting-dependent specs wouldn't have to reinvent the wheel. |
Opened up a related issue on the NEL side (nel#71). The fact that NEL lets you provide two sampling rates (one for successes, one for failures) would complicate adding sampling rates to Reporting. My preference would be to keep Reporting simple, and have a single sampling rate for each endpoint group. That means on the NEL side, instead of providing separate sampling rates for successes vs failures, you'd specify different Reporting endpoint groups for each. That's the simplest separation of concerns, though it does mean that you'd have to duplicate a lot of information across those two endpoint groups. My initial hunch is that that's still the right tradeoff, but I'd love to hear other opinions. |
@juliatuttle expressed concern about the size of the Right now, it sounds like there are three options on the table:
Does this sound like an accurate summary? Are there any strong opinions for or against any of the options? |
I believe option 2 is the best, as various endpoints may have different tolerances for rate limiting. |
Option 3 introduces strong coupling between Reporting and upstream specs: you have to know the name of the report type and relevant parameters that apply to it. In case of NEL you'll now need to provide NEL specific configuration to both the NEL and Report-To headers — this is confusing and something I'd love to avoid. Option 2: is based on the premise that sampling rate should be a first-class feature in Reporting. I don't have operational experience in this regard, but this does seem like a reasonable feature in light of experiences that Mike highlighted above, and since this is our chance to spec the behavior we expect from browsers.. might be worth it. On the other hand, I can also see a desire to have different sampling rates for each type of report, which hints to me that this should actually be an option associated to the report generator, not the downstream reporting endpoint? Specifying different endpoints to control sampling seems like a backwards way to go about it? For example, for NEL we can expose config options in the header and have NEL logic apply sampling before it calls out to Reporting API. Ditto for other API's; if CSP wants sampling, expose a CSP specific config option for it? |
I might have hand-waved too much in my description of option 3. I was thinking of it purely as a syntactic refactoring of option 2, so that you don't have to repeat the content of the URL list so many times. It's not that NEL and friends would decide which keys are allowed in It's the same with option 2: if you wanted separate rates, you'd have to create two endpoint groups. All of that said, I think I like option 1 the most the more I think of it. We'd add a fair bit of complexity to Reporting to factor out something that's not really that complex: configured rate is a double between 0 and 1, and for each report, roll a die to decide whether to report it or not. (The only wrinkle is that if NEL has the sampling rate, and doesn't send reports to Reporting if the die roll fails, then the ReportingObserver would never have a chance to see those reports.) |
The counter-argument to this being that if you're using a ReportingObserver, and want to see every report, then set your sampling rate to 1. And if needed, roll another die in your observer (or implement some more complex rate limiting) if you still want to limit what's uploaded to the collector. |
If different endpoints have different limits on what load they can handle, I think weights (proposed in #39) are a better approach. Those are per-endpoint. For rate limiting, the same sampling rate would apply to all of the endpoints in the group (for both option 2 and option 3). |
It's also not entirely unreasonable to say that ReportingObserver always sees all reports that are eligible for JS delivery, and are not subject to sampling. For example, we can add a flag to "queue the report step" that indicates whether it should be uploaded, and that can be used to implement sampling. On the other hand, we also need an inverse flag for whether report should be made visible to ReportingObserver, so there's some symmetry here. WDYT? |
I think it could be useful to have either option 1 (each spec defines a way to sample) or option 2 ( We like that NEL already has sampling built in. It seems to make sense for NEL to allow you to specify the sampling rate in its header directly during "registration", since you can't later apply a sampling rate at the time of the report (as NEL means the visitor can't reach your site). Should the other specs look towards implementing option 1? CSP has the potential for a large amount of reports-per-page and flooding (versus deprecations/interventions/crashes). Feature Policy Reporting also has some interesting things on the horizon with unoptimized-image related policies that could be noisy as well. Taking CSP as an example, from a usability point of view, I think it's easier for the website (or CDN) to include a For Feature Policy Reporting, not having a sampling rate is a little awkward as well, since you're either applying Feature Policies via the |
@nicjansma I like the idea of sampling per group (i.e option 2 you outlined above) as it allows us to abstract this common functionality across all the upstream report generators + enables site owners to group "noisy" generators into same policy buckets. |
Rate-limiting seems like a pretty cross-cutting concern, that building it into Reporting makes sense. As long as it doesn't stop other specs (such as NEL) from extending it if necessary, I'd support adding the syntax for option 2 here. Centralizing it also means that if we want to implement any further limits -- maybe a hard cap on total reports sent to an endpoint per page load -- that we would have an obvious place to make that change too. (I suspect that the sampling rates in NEL will just become a multiplicative factor in the final rate, if we go this route, but you could take advantage of that to say something like As an aside, Feature policy should probably adopt something like the |
I've come around to option 2, as well, since the comments I wrote way back upthread. I think there's a small tweak that we can make that makes it easy to support NEL's use case with over-complicating the other downstream specs:
So the easy case is "add Thoughts? |
I like the idea of making it a default, and allowing it to be overridden, as long as it's the developer overriding it, and not some less obvious spec behaviour. So, the default value would be 1.0, I'm presuming. And NEL would override that, rather than multiplying it. Would that parameter affect step 7 (notify reporting observers) or just step 5 (append to the report cache)? |
Endpoint should be able to set rate limiting for reporting, e.g. by setting a percentage 0-100. This would reduce number of duplicated reports and would reduce a risk of DDoSing an endpoint when users of high traffic site start sending reports.
My proposal:
Before sending a report browser will draw a random number from 1 to 100 if the number is lower or equal than rate limit setting (0-100) it'll send a report.
The text was updated successfully, but these errors were encountered: