-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase the maximum number of dynamic rules to 30,000 #319
Comments
As noted in the meeting (https://github.com/w3c/webextensions/blob/main/_minutes/2022-12-08-wecg.md), this would also be a good time to separate the rule limits for session and dynamic rules. A potential downside of separating the quotas for session and dynamic rules is that extensions could try to maximize the number of rules by reading and installing session rules at startup. In my opinion, if there is a concern on the registering session rules at startup, then the more sensible thing to do is to have lower limits for session rules, than to continue maintaining a shared limit for session + dynamic rules. E.g. continue limiting session rules to 5k, and expanding dynamic session rules to 30k. |
@oliverdunk hi Oliver, we're waiting for some feedback from your team on this one, any chance you had an internal discussion? |
Hi Andrey 👋 Nothing specific to share yet, but we have been discussing this and I'm personally very keen to address it. If you're up for gathering some more data, one thing I would be interested in is the breakdown of the types of rules you'd be dynamically adding (how many are block rules, how many are redirects to extension URLs, how many are external redirects, how many are modifications etc). That would be really useful in figuring out what sort of usage to expect if we were to increase this limit :) |
Got it, we'll update the stats then, thank you! |
@oliverdunk hi Oliver, we collected needed data with categories. Filters were grouped by the same 3 groups: popular, recommended and others - they are represented on three pictures, one picture for one group of filters. Added network rules divided into 6 categories:
The top chart shows all categories of rules, the bottom chart shows everything except All these categories were extracted from DNR RuleActionType and from our process of conversion. You can see that block rules take up about 98-99% of the changes. |
@105th I forget if I said already but thank you so much for this! We get a lot from all of the data you share and really appreciate it - will try to have some follow-up on this for the next meeting :) |
We've been discussing this request for some time on the Chrome team and have put together a new proposal: Proposal: Safe rules in declarativeNetRequest In short, the idea is to establish a smaller subset of rule types which are subject to an increased dynamic rule limit. There is a lot more detail in the document, so please do take a look. Importantly, there is no change to what you can do with rules in declarativeNetRequest or the existing limits - this is purely additive. Feedback is very much welcome - in particular, the "safe rules" naming is something we'd like to discuss (no rule is definitively safe, just as all rules have valid use cases). We also want to make sure this addresses the most important use cases and fits in to the bigger picture of how the declarativeNetRequest API is used (for example, I know supporting differential updates is a key goal that may require more than one moving piece to fully support). Happy to discuss here for now although we could consider opening a new issue if that would work better. |
@oliverdunk Your proposal document currently proposes a higher limit for "block", "allow" and "allowAllRequests". Can it be just "allow" and "block", without "allowAllRequests"? "allowAllRequests" is expensive in its design and it is not clear whether the request here actually needs it. |
Assuming there aren't any use cases we're missing, I don't have any strong feelings there. I added it for completeness but "block" and "allow" are definitely the important ones. |
Just remove the limit on the number of rules. There is of course no need to have a limit and an arbitrary magic number is obviously always wrong since there is no way to prove that the choice is optimal or correct. The limit is suspected to have been added by Google for anti-user purposes (enriching their ad business at the expense of the user by making ad blockers less effective), or was possibly added due to sheer incompetence. If you want to prevent DoS by high memory usage, then account the rules as memory usage by the extension, and act on the memory usage of the whole extensions. Obviously this must not be an arbitrary limit, but instead some way of informing the user that the extension might be using too much memory and might be causing performance or crash issues. (in case it's not obvious, with a proper implementation (i.e. a trie for simple patterns or NFA for regexes) performance does not depend asymptotically on the number of rules, but only on the size of the text to match, excluding the construction of the data structure which is irrelevant since it can be done in the background before the extension update is finalized and saved to disk) |
First of all, it's awesome that you agree to increase the dynamic limit, thank you! Generally, the proposal covers what was requested pretty well. The concept of "unsafe rules" seems interesting to me, maybe it can be used further in the future (optional permission?). Just one question: why |
Thanks for taking a look at this! I don't think there's any specific reason to exclude The main reason to focus on |
|
To give some context on I wrote up some notes on the MV3 implementation for this feature, in case it is of interest: https://sammacbeth.eu/mv3-smarter-encryption |
Has any consideration been given to letting the user choose limits for each extension? It seems like limiting the maximum rule count for safety reasons doesn't make any sense. All it would take is one malicious rule to do damage. Maybe the user can take on the accepted (theoretical) risk of large dynamic lists by choosing to enable and/or set a max value. This would allow users who want to retain maximum functionality to eliminate the limits entirely, while the concern google et.al. have of legal risk for "approved" extensions goes away. Alternatively just slap a warning on the extension page that says the filters are dynamic and may result in unexpected behavior. We see this today for things like video games where a single player campaign is rated, but the online play is not because its out of control of the developer. TBH this whole thing seems pretty silly since the entire concept of a web browser is that you're going to view totally dynamic, unknown content every time you load a page. I mean right now you could go to a search engine and get an ad that leads to a scam. Or maybe you load a page and a tracker builds a profile of you that it sells to advertisers/scammers. Or maybe there's a vulnerability in the browser and a malicious website uses it to gain elevated access to your computer. These are all risks we accept with the web and I don't think anyone from w3c or any of the browser vendors would ever consider tackling those issues in the form of a pre-approved browser whitelist. So why are we doing that here with MV3? Let the user decide what risk they're willing to take. |
While the idea of allowing users to set their own limits for each extension is interesting, it introduces complexities, especially in terms of performance. The increase of the content blocker extension rule limit in WebKit to 150,000 was a balance between functionality and memory usage — see the linked commit for the juicy details. (That WebKit limit is also separate from Safari's dNR limits, for other technical reasons.) Allowing users to choose their own limits could lead to performance issues in a way that might not be immediately obvious to them. This is crucial, considering the diverse range of devices and operating systems browsers support. Performance degradation due to high rule counts might be hard for users to diagnose, leading to a suboptimal browsing experience. |
As a ublock origin user I have never had an issue with performance related to its dynamic lists. Are these performance issues a new problem resulting from the design adopted for MV3? If so perhaps that design should be reconsidered or abandoned. If this is not an issue of the MV3 design and the same issue exists in MV2 then why have users not noticed it yet? Is it maybe because the performance hit of the blocker's lists is nothing compared to the performance hit of running the unblocked javascript of ads? I think most users would accept the insignificant performance hit of unlimited dynamic lists if it eliminates the extremely noticeable and unbearable performance cost of unblocked ads. It seems like it would be better to let the user make that choice. If the issue is low end devices not handling the lists, then maybe those devices should have limits set based on available resources. Setting a static limit for all extensions across all hardware to support the lowest common denominator seems like not just a bad choice but also unnecessary. Browsers could automatically size lists to resources. And even if there is an issue with low end devices we run into the same tradeoff: blocking the javascript of the ads is almost always going to be a vast performance improvement. Effective ad blockers are even more critical for low end devices as any ads that sneak through will be detrimental to the user's experience. |
The keyword here being could. Surely some diagnostics to alert the user if an extension is using a substantial amount of memory is a rather nice trade-off? Looking at those numbers, I really don't see any issue other than for potentially low-end mobile devices. Whereas high-end devices will barely notice a few hundreds of MB being used. Why resort to this kind of collective punishment only to satisfy low-end devices? I wholly agree with the other statements that some of the content that could otherwise be blocked but wouldn't, because of these imposed limits could have a far bigger performance impact. While I like to try and keep things on-topic and professional, this rather obviously isn't decided on with the end-user in interest. That's but a façade, what's truly in interest here is Google and that alone. I mean, W3C's own guidelines champion that the user should be able to render the content the way they want - including assistive extensions and blocking of unwanted content. Why deny the user the ability to tweak these values themselves, if it means they could do just that? The fact the limit was set to a shockingly low value of 5,000 just further cements that Google has no end-user interest in this. Even if increased to 30,000, that's laughably low when like you demonstrated - WebKit allows for 150,000. I hate to sound a bit pessimistic, but I feel there's going to be some real Streisand effect fallout from all of this if it goes through the way Google wants it to. |
FYI Firefox hasn't bumped the limit yet, it is tracked as part of https://bugzilla.mozilla.org/show_bug.cgi?id=1803370. |
This has been fixed in WebKit, not shipping in Safari yet. |
I would like to add something VERY important to me here: I'm redoing my extension because we all need to migrate to v3, otherwise V2 will disappear from chrome webstore extensions and others too. I'm disgusted, angry and extremely nervous, because I learn that Google has set a limit of 5000/30000? my extension blocks md5-encrypted urls txt file hosted on a server, and as I've already explained, I have a lot of them, and this database is updated according to my needs, and the needs of those who use my extension. Is there a quick solution? my extension fetch a file from an url, that file is encrypted md5 urls, 1m of them. CHORME.MV3.TEST.25.VID_20240924_044404.mp4No message error, nothing in the console... |
In the case of ad blockers, dynamic rules can serve several purposes:
Differential updates
The limit of 5,000 is simply not enough for this purpose. The problem is pretty much the same as with the current “disableRules” limit. Here are some graphs that show how many rules are being added to popular filter lists that are used by AdGuard users.
Raw data
On average about 2600+ new rules are added every week to the most popular filter lists. Note, that this does not include changes to regional lists (many users have 1-3 regional lists). Considering the release cycle of 3-6 weeks, we need to increase the dynamic rules limit to at least 20,000 just to make it possible to use it for differential updates.
Custom filter lists
Ad blockers provide users with the possibility to add custom filter lists, the ones that are not included into the list of static rulesets. This is crucial for the ecosystem, that’s how new filter lists appear and become popular and eventually get included into the static rulesets. Is 5,000 enough for this purpose?
We analyzed public issues from AdGuard users from the AdguardFilters repo to see what sorts of custom lists are used and how many rules are there on average. It appears that about 5% of the reports come from the users that have custom lists. And every one in four of those users uses more than the maximum limit of 5,000 rules.
Custom user-made rules
Unfortunately, there’s no open stats that we can share on that. Although, from our experience, users rarely have over several hundreds of user-made rules.
Summary
Considering all the cases that are listed above we suppose that the adequate limit of dynamic rules is 30,000.
The text was updated successfully, but these errors were encountered: