Prerequisites
I checked to make sure that this issue has not already been filed.
Problem description
iOS 17 has a very nasty bug which is still not fixed in the latest update: https://discussions.apple.com/thread/255240753
This bug effectively limits the size of a content blocker JSON file to 10MB.
AdGuard filters recently stopped fitting into this limit, and the main reason is the way domain.* wildcards are handled when the rules are converted. Currently, each wildcard is simply expanded into a list built from the 200 most popular TLDs.
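Here's a simple rule of this kind. After conversion, its single wildcard becomes a list with one entry per TLD:
Original:
docviewer.yandex.*##.js-doc-html > div[class^=\"pages_\"] > div[class*=\" \"]:empty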
Proposed solution
I suggest transforming domain.* rules when the filters are compiled in the FiltersRegistry. We should replace each domain.* wildcard with a shorter list of domains that are actually alive and active, which should shrink the content blocker considerably.
It should be done for these three platforms:
iOS (obvious)
Desktop Safari (it does not suffer from the same bug, but I think it's still a good idea)
MV3 (it does not support domain.* modifiers, so what I propose may be useful there as well).
Brief explanation
Extract domain wildcards from the filter list, i.e. extract domain.*.
Use the most popular TLD list to construct multiple domain names, i.e. domain.com, domain.cn, etc.
Check which of these domains are alive.
Replace the original wildcard with the resulting list of alive domains. Don't forget to handle the case when the rule ends up with no domains at all.
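The steps above can be sketched in a few lines. This is only an illustration under assumptions: the function names are hypothetical, and the liveness check is passed in as a plain callable (`is_alive`) instead of calling any real service.

```python
from typing import Callable, Iterable, List


def expand_wildcard(wildcard: str, tlds: Iterable[str]) -> List[str]:
    """Turn "domain.*" into candidate names such as "domain.com", "domain.cn"."""
    if not wildcard.endswith(".*"):
        raise ValueError(f"not a domain wildcard: {wildcard!r}")
    base = wildcard[:-2]  # drop the trailing ".*"
    return [f"{base}.{tld}" for tld in tlds]


def resolve_wildcard(
    wildcard: str,
    tlds: Iterable[str],
    is_alive: Callable[[str], bool],
) -> List[str]:
    """Keep only the candidates that the (injected) liveness check accepts."""
    return [name for name in expand_wildcard(wildcard, tlds) if is_alive(name)]


# Usage with a stubbed liveness check (hypothetical data).
alive = {"example.com", "example.co.uk"}
print(resolve_wildcard("example.*", ["com", "cn", "co.uk"], alive.__contains__))
# prints ['example.com', 'example.co.uk']
```

An empty result is the case called out in the last step: the caller has to decide what happens to a rule whose wildcard resolves to nothing.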
However, I suggest splitting this process into two parts to make it more manageable and less error-prone.
What TLDs to use
I suggest keeping a list of popular TLDs as a separate file that can be modified by filters maintainers in the future.
For starters, we can use the same 200 TLDs that are used in SafariConverterLib.
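As an illustration of what a maintainer-editable TLD file could look like, here is a small parser sketch. The file format (one TLD per line, `#` for comments, optional leading dot) is an assumption made for this example, not something the registry defines today.

```python
from typing import List


def parse_tld_list(text: str) -> List[str]:
    """Parse a TLD list: one entry per line, '#' starts a comment.

    Entries are lower-cased, a leading dot is dropped, and duplicates are
    removed while keeping the original order.
    """
    tlds: List[str] = []
    seen = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip().lower().lstrip(".")
        if entry and entry not in seen:
            seen.add(entry)
            tlds.append(entry)
    return tlds


# Usage with inline text standing in for the contents of the file.
sample = "com\n# regional TLDs\n.CN\nco.uk  # multi-label suffixes are allowed\ncom\n"
print(parse_tld_list(sample))
# prints ['com', 'cn', 'co.uk']
```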
Part 1: Domains map
Let's build a dictionary that we will use later in the compilation process.
Input: all compiled filters in the filters registry.
Input: a list of the most popular TLDs.
Output: a JSON file that maps every domain wildcard found in the filters to a list of actual domains.
{
"google.*": [ "google.com", "google.com.uk" ]
}
We need to do this:
Go through every filter in FiltersRegistry
Extract all domain wildcards from it (use AGTree for working with filters there)
Compose a list of domain names from the wildcard and the list of the most popular TLDs.
Check which of them are alive. For this check I suggest using the same approach as in DeadDomainsLinter, i.e. the urlfilter service.
Save the resulting map to a file.
Important: a wildcard may turn out to have no alive domains after checking. Such wildcards need to be removed from the compiled list, so print a warning to the console and add them to the map with an empty array, i.e. "example.*": []
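A rough sketch of Part 1 could look like the code below. Several things here are assumptions: the regular expressions are a crude stand-in for parsing rules with AGTree and only cover cosmetic-rule domain lists and the `$domain=` modifier, the function names are hypothetical, and the liveness check is injected as a callable instead of calling the urlfilter service.

```python
import json
import re
import sys
from typing import Callable, Dict, Iterable, List

# Crude stand-ins for a real rule parser (assumption, not AGTree).
COSMETIC_MARKER = re.compile(r"#@?[$?%]?#")
DOMAIN_MODIFIER = re.compile(r"[$,]domain=([^,]+)")


def extract_wildcards(rule: str) -> List[str]:
    """Return the "domain.*" entries found in a rule's domain list."""
    rule = rule.strip()
    if not rule or rule.startswith("!"):  # empty line or comment
        return []
    marker = COSMETIC_MARKER.search(rule)
    if marker:
        domains = rule[: marker.start()].split(",")
    else:
        modifier = DOMAIN_MODIFIER.search(rule)
        domains = modifier.group(1).split("|") if modifier else []
    cleaned = (d.strip().lstrip("~") for d in domains)
    return [d for d in cleaned if d.endswith(".*")]


def build_domains_map(
    rules: Iterable[str],
    tlds: Iterable[str],
    is_alive: Callable[[str], bool],
) -> Dict[str, List[str]]:
    """Map every wildcard to its alive domains; dead wildcards map to []."""
    tld_list = list(tlds)
    domains_map: Dict[str, List[str]] = {}
    for rule in rules:
        for wildcard in extract_wildcards(rule):
            if wildcard in domains_map:
                continue  # already resolved for an earlier rule
            base = wildcard[:-2]
            alive = [f"{base}.{tld}" for tld in tld_list if is_alive(f"{base}.{tld}")]
            if not alive:
                print(f"warning: no alive domains for {wildcard}", file=sys.stderr)
            domains_map[wildcard] = alive
    return domains_map


# Usage with hypothetical rules and a stubbed liveness check.
rules = [
    "docviewer.yandex.*##.ad",
    "||ads.example^$script,domain=google.*|~mail.google.*",
    "! a comment",
]
known_alive = {"docviewer.yandex.ru", "google.com"}
result = build_domains_map(rules, ["com", "ru"], known_alive.__contains__)
print(json.dumps(result, indent=2, sort_keys=True))
```

Keeping dead wildcards in the map with an empty list, instead of omitting them, lets Part 2 tell "checked and dead" apart from "never checked".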
Part 2: Filters post-processing
Go through the platform filters.
For every domain wildcard, check whether the file prepared in Part 1 contains a mapping for it.
If we do, replace the domain wildcard with the list of domains that we got from the mapping file.
A rule may become redundant after these changes, in which case it needs to be removed. Relevant parts of code in DDL: cosmetic, network.
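For cosmetic rules, the post-processing step could be sketched as follows. This is a simplified illustration, not the DDL logic: the marker regex is an assumed approximation of cosmetic rule syntax, the function names are hypothetical, and network rules with `$domain=` would need similar but separate handling.

```python
import re
from typing import Dict, List, Optional

# Approximate cosmetic rule markers such as ##, #@#, #?#, #$# (assumption).
COSMETIC_MARKER = re.compile(r"#@?[$?%]?#")


def expand_domains(domains: List[str], domains_map: Dict[str, List[str]]) -> List[str]:
    """Replace mapped wildcards with their alive domains, keeping '~' negation."""
    result: List[str] = []
    for domain in domains:
        negated = domain.startswith("~")
        name = domain[1:] if negated else domain
        if name.endswith(".*") and name in domains_map:
            replacements = domains_map[name]
        else:
            replacements = [name]  # plain domain or unmapped wildcard: keep as-is
        for replacement in replacements:
            entry = "~" + replacement if negated else replacement
            if entry not in result:
                result.append(entry)
    return result


def rewrite_cosmetic_rule(rule: str, domains_map: Dict[str, List[str]]) -> Optional[str]:
    """Return the rewritten rule, or None if the rule became redundant."""
    marker = COSMETIC_MARKER.search(rule)
    if rule.startswith("!") or marker is None or marker.start() == 0:
        return rule  # comment, non-cosmetic rule, or generic rule: untouched
    original = [d.strip() for d in rule[: marker.start()].split(",") if d.strip()]
    expanded = expand_domains(original, domains_map)
    had_permitted = any(not d.startswith("~") for d in original)
    has_permitted = any(not d.startswith("~") for d in expanded)
    if had_permitted and not has_permitted:
        # Every permitted domain is dead; keeping the rule would make it generic.
        return None
    return ",".join(expanded) + rule[marker.start():]


# Usage with a hypothetical mapping produced by Part 1.
mapping = {"docviewer.yandex.*": ["docviewer.yandex.ru", "docviewer.yandex.com"], "dead.*": []}
print(rewrite_cosmetic_rule("docviewer.yandex.*##.ad", mapping))
print(rewrite_cosmetic_rule("dead.*##.ad", mapping))
```

The check on permitted domains matters because a domain-specific rule that loses all of its domains would otherwise silently turn into a generic rule that applies everywhere.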
Additional information
In order to verify the result, use the command-line version of SafariConverterLib to run the conversion and compare the output size.
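Once both conversions have been run, the size comparison itself can be scripted. The sketch below only compares two files that already exist; it does not invoke SafariConverterLib, the function name is hypothetical, and treating the 10MB limit as 10 * 1024 * 1024 bytes is an assumption.

```python
import os
from typing import Dict, Union

# Assumption: the 10MB limit is counted in binary megabytes.
LIMIT_BYTES = 10 * 1024 * 1024


def compare_sizes(before_path: str, after_path: str) -> Dict[str, Union[int, float, bool]]:
    """Compare two content blocker JSON files by size on disk."""
    before = os.path.getsize(before_path)
    after = os.path.getsize(after_path)
    saved_percent = (before - after) / before * 100 if before else 0.0
    return {
        "before": before,
        "after": after,
        "saved_percent": round(saved_percent, 2),
        "fits_limit": after <= LIMIT_BYTES,
    }
```

Calling `compare_sizes` with the paths of the old and new converter outputs returns both sizes, the percentage saved, and whether the new file fits under the assumed limit.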