Make regex and glob pattern matching stricter #130
Hi, Thanks for the contribution. I'll review it when I have time. Two comments I'd like to make beforehand are about
I think you're misunderstanding the feature. Normally, a pattern matches the whole URL e.g
That's how regexes work. Without the anchors, they match the whole string.
const regex = host.substr(1);
let regex = host.substr(1);
if (matchDomainOnly) {
  // This might generate double ^^ characters, but that works anyway
  regex = "^" + regex + "$";
}
As mentioned in the comments earlier, this is an unnecessary change based on the assumption that regexes should match from the beginning of the string.
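For readers following along, here is a minimal sketch of what the diff hunk above does. `buildHostRegex` is a hypothetical name used only for illustration, and the assumption that the leading `@` marks a regex pattern is taken from the PR description; this is not the project's actual function.

```javascript
// Hypothetical helper mirroring the diff hunk; not the project's real code.
function buildHostRegex(host, matchDomainOnly) {
  // Drop the leading marker character (assumed to be "@" for regex patterns).
  let regex = host.slice(1);
  if (matchDomainOnly) {
    // May produce "^^" or "$$" if the user already anchored the pattern;
    // JavaScript accepts repeated anchors, so the result is still valid.
    regex = '^' + regex + '$';
  }
  return new RegExp(regex);
}

// Unanchored: the pattern is found anywhere inside the string.
console.log(buildHostRegex('@example.com', false).test('evil.example.com.evil.com'));
// Anchored: the pattern has to cover the whole string.
console.log(buildHostRegex('@example.com', true).test('evil.example.com.evil.com'));
```

Whether anchoring should happen automatically is exactly the point under discussion; the sketch only shows the behavioural difference.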
.replace(/\*/g, '.*')
.replace(/\?/g, '.?'))
.test(toMatch);
// Because the string is regex escaped, you must match \* to instead of *
There's either a word too many here or one missing.
let regex = host.substr(1);
if (matchDomainOnly) {
  // This might generate double ^^ characters, but that works anyway
  regex = "^" + regex + "$";
This is a linting error. Please run `npm lint`.
let prefix = matchDomainOnly ? 'should not' : 'should';
let description = `${prefix} match url with pattern only in path`;
let description = `${pattern} ${prefix} match ${evilUrl}`;
it(description, () => {
  expect(
    utils.matchesSavedMap(
      'https://google.com/?q=duckduckgo',
      evilUrl,
      matchDomainOnly, {
        host: simplePattern,
        host: pattern,
      })
  ).toBe(!matchDomainOnly);
This section's purpose is to make sure that `matchDomain=true` really only uses the domain for matching, and not a URL path that contains the domain. When `matchDomain=false`, the URL should match because the URL path contains the domain.
It looks like you repurposed the section to use `evilUrl`, which makes sense when you put the domain in the URL's path. However, it becomes a lot less clear in the other cases.
I'd recommend explicitly adding an `it` for your test and reworking this part to add the domain of `expectedUrl` to the URL path.
The main problem is that it is impossible to match the entire domain with a glob. Is that something you want to enable? I agree it's not required to make regexes behave this way, but I was trying to find any way to use globs like I expected to.
You might need to explain how you got to that conclusion 🤔
https://developer.mozilla.org/en-US/docs/Web/API/URL/hostname
`const domain = url.hostname`
`url.hostname = domain`
This is what you mean when you say "domain", right?
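As a rough illustration of the hostname question above, the sketch below uses the standard `URL` API to show what a pattern would be tested against in each mode. `matchTarget` is a hypothetical helper name, and the assumption that the non-domain mode tests the full URL string comes from the review discussion rather than from the project's code.

```javascript
// Hypothetical helper: picks the string a pattern would be tested against.
function matchTarget(urlString, matchDomainOnly) {
  const url = new URL(urlString);
  // "Domain" is taken to mean url.hostname, as suggested above.
  return matchDomainOnly ? url.hostname : url.href;
}

const sample = 'https://google.com/?q=duckduckgo';
const pattern = /duckduckgo/;

// Full URL: the query string contains the pattern text.
console.log(pattern.test(matchTarget(sample, false)));
// Hostname only: "google.com" does not contain it.
console.log(pattern.test(matchTarget(sample, true)));
```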
Yes, I'm not disagreeing with your solution for globs.
Any update on this? It only needs to be merged. A year has already passed, yet no update :/
@benyaminl I think the maintainer has been busy. I opened an issue to discuss support. Are you able to help?
@mikedlr I've already moved to Tab Groups, as it provides all the functions I need and more than Containerise, so I will pass this time.
I was trying to use glob patterns and ran into one bug and one quirk that made them impossible to use correctly:

- Regex metacharacters (such as `.`) were not escaped in globs (which use regexes for parsing internally):
  - `!*.example.com` matched `evilexample.com`
  - `!e.a.p.e.com` matched `example.com`
- Patterns do not match the entire domain, even with "match domain only" enabled:
  - `@example.com` matched `evil.example.com.evil.com` (user fixable with regex patterns: `@^example.com$` behaves as expected, but I don't want this to be necessary)
  - `!example.com` matched `evil.example.com.evil.com` (no anchors in glob patterns, not fixable by the user)

I believe it is more intuitive for patterns to always match the entire string when "match domain only" is enabled.
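The cases listed above can be reproduced with a small sketch. All three function names are hypothetical and written only for illustration; the naive conversion mirrors the `.replace(/\*/g, '.*')` and `.replace(/\?/g, '.?')` calls visible in the diff, and the patterns are shown without their leading `!` marker.

```javascript
// Escape every regex metacharacter so it is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Naive conversion: only * and ? are translated, so "." remains a wildcard
// and nothing forces the match to cover the whole string.
function naiveGlobToRegex(glob) {
  return new RegExp(glob.replace(/\*/g, '.*').replace(/\?/g, '.?'));
}

// Stricter conversion: escape first, then translate the escaped \* and \?,
// and anchor both ends.
function strictGlobToRegex(glob) {
  const body = escapeRegex(glob)
    .replace(/\\\*/g, '.*')
    .replace(/\\\?/g, '.?');
  return new RegExp('^' + body + '$');
}

console.log(naiveGlobToRegex('*.example.com').test('evilexample.com'));
console.log(strictGlobToRegex('*.example.com').test('evilexample.com'));
console.log(naiveGlobToRegex('example.com').test('evil.example.com.evil.com'));
console.log(strictGlobToRegex('example.com').test('evil.example.com.evil.com'));
```

Because the string is escaped before the wildcards are translated, the second step has to look for the escaped forms `\*` and `\?` rather than the bare characters.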
The first commit fixes the regex metacharacter bug and the second adds anchors when matching patterns while "match domain only" is enabled. The final commit updates the tests (I'm not 100% sure the logic is correct with the matchDomainOnly/should/should not testing).
The "match domain only" change is a breaking change. You can drop that commit if you want, but please document the current behavior. The glob example in the README matches many more domains than just the specified domain.
A weaker change might be to add anchors only when matching glob patterns, not regexes. But note that this is a breaking change too: users might've been using `example.com` to match subdomains or `www.example.com`.

If you want the patterns to behave like prefixes it's not sufficient to only add a right `$` anchor, because `example\.com$` still matches `evilexample.com`.
I'm not sure what the best solution here is.
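To make the anchoring trade-off concrete, here is a small comparison of three regexes. The third one, which accepts the bare domain or any subdomain, is only one possible option and an assumption of this sketch, not something proposed in the commits.

```javascript
// Right anchor only: any string ending in "example.com" passes.
const rightAnchored = /example\.com$/;

// Both anchors: only the exact domain passes, which breaks subdomain use.
const fullyAnchored = /^example\.com$/;

// Hypothetical middle ground: the bare domain, or any subdomain
// separated by a literal dot.
const domainOrSubdomain = /^(?:.*\.)?example\.com$/;

for (const host of ['example.com', 'www.example.com', 'evilexample.com']) {
  console.log(
    host,
    rightAnchored.test(host),
    fullyAnchored.test(host),
    domainOrSubdomain.test(host)
  );
}
```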