-
-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Fix ReDoS vulnerability CVE-2021-35065 #49
Conversation
This is the performance test code from the reporter.
|
And the followings are the results on my machine. Before:
After:
|
Let's add a test! cc @RunDevelopment @Trott |
2e53333
to
da6f8af
Compare
da6f8af
to
723eed9
Compare
@phated I've added a test case for this! |
Oof, looks like I introduced this problem in 32f6d52. Sorry about that. This fix looks like an improvement to me, and the test results certainly seem to bear that out. But I don't know offhand if it is a partial fix or a complete fix. I'll probably go re-educate myself (yet again) on ReDoS but in the meantime, I wouldn't mind pulling in someone who actually knows about this stuff in detail, if he doesn't mind me tapping him on the shoulder for something that's probably sufficiently mundane that I should be able to figure it out myself. /ping @davisjam Does this change look like a complete fix for the ReDoS here or only a partial one? |
index.js
Outdated
@@ -6,7 +6,7 @@ var isWin32 = require('os').platform() === 'win32'; | |||
|
|||
var slash = '/'; | |||
var backslash = /\\/g; | |||
var enclosure = /[{[].*\/.*[}\]]$/; | |||
var enclosure = /[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this suffice (and be a fair bit easier to read as well)?
var enclosure = /[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/; | |
var enclosure = /[{[][^/]*\/.*[}\]]$/; |
If not, we should probably add a test case that shows it not working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Trott The direct solution of this issue is the way as you suggested. But the reporter suggested this solution and we agree with it because we can't accept newlines in a filepath.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the decision is "we won't accept newlines in a path", then OK. But if the belief is "newlines cannot legitimately appear in a path", then I do not believe that is correct. See https://superuser.com/a/129532/81159. Or try this (which is basically the same thing as in that linked answer):
$ touch 'foo
quote> bar'
$ ls -lbd foo*
-rw-r--r-- 1 trott staff 0 Jun 24 07:04 foo\nbar
$
(The quote>
thing is zsh console output. Here's basically the same thing in Bourne shell instead.)
sh-3.2$ touch 'foo
> bar'
sh-3.2$ ls foo*
foo?bar
sh-3.2$ ls -ldb foo*
-rw-r--r-- 1 trott staff 0 Jun 24 07:08 foo\nbar
sh-3.2$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a onetime file system tester, I'm with @Trott here: newlines are valid path elements in many file systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This regex is much improved in its performance. However. I did not check the context in which it is used, but if one of the "match a substring within this string" APIs is used then the performance can still be problematic.
Try for example:
let match = '{'.repeat(10*10*10*10*5);
console.log('matching');
var res = /[[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/.exec(match);
console.log('done matching')
console.log(res)
(This is slow in a Node v12 REPL I found on the web).
The slow performance is because:
- The "match anywhere" will basically try starting from each of
n
open-braces - There is a reachable quantifier that also matches a
{
(the first *'d group). - So the regex engine will match
O(n)
of them before rejecting each time -- that makes forO(n^2)
behavior.
So, food for thought:
- How long are the strings that this regex processes? (In the example above, I'm using a string with 50K characters in it. For shorter strings, the performance is "not great", but perhaps not catastrophic).
2.Are there any additional constraints on the input we could add in? (e.g. enforcing a maximum length, or additional properties that the regex can check) I think there is a possibly-illegible refactoring that would do the trick, but it would be nicer if there is a more maintainable option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the project again and it seems like the regex /(^|[^\\])([{[]|\([^)]+$)/
has the same problem for all inputs "(".repeat(n)
.
Whatever solution is chosen for /[[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/
should also be applied to /(^|[^\\])([{[]|\([^)]+$)/
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine that ideally, this wouldn't be doing regex matching at all but would instead use some module that does proper parsing for globs. Although I guess there really isn't a single globbing standard to adhere to. Anyway, if no one's looked yet, it might be worth looking at what's out there (to make sure they just don't do regexp matching under the hood).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to use a linear-time regex engine. For example, you could add re2
as a (somewhat heavy, alas) dependency. It supports all of the features used in these regexes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Please @ me if you want further opinions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RunDevelopment I think that regex has no problem.
% cat test.js
var performance = require('perf_hooks').performance;
var globby = /(^|[^\\])([{[]|\([^)]+$)/;
var st = performance.now();
globby.test("{" + "/".repeat(500000));
var ed = performance.now();
console.log(ed - st);
% node test.js
0.7820440530776978
%
OK, brushed up a bit and this looks like a complete fix although this is based on a few minutes of reading and I really wouldn't mind a more knowledgable eye on this. |
I am examining. Edit: After coffee. |
test/index.test.js
Outdated
@@ -224,6 +226,26 @@ describe('glob2base test patterns', function () { | |||
|
|||
done(); | |||
}); | |||
|
|||
it('should not increase calc. time exponentially by \'/\' count [CVE-2021-35065]', function (done) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment says "exponential". This regex will exhibit polynomial growth, not exponential growth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I'll modify it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this case and add a new case to check absolute time.
test/index.test.js
Outdated
|
||
[50000, 500000].forEach(function(n) { | ||
var result1 = measure(n); | ||
expect(result1 / result0).toBeLessThan(0.9); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly you should also check for a "reasonable" absolute matching time, instead of just checking for a low-enough relative increase. For example, you could also assert that the 500K string finishes in under 1 second.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I'll add that assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this case and add a new case to check absolute time.
@sttk I believe we should support newlines as are allowed within paths. I wasn't aware they could exist in a path! |
@phated I was just about to ask you it. I'm also considering about |
Mmm, -- |
I've modified the codes.
|
It might be worth considering getting rid of the regular expression altogether and replacing it with rudimentary parsing code like this: var lastChar = str.slice(-1);
var enclosureStart;
if (lastChar === '}') {
enclosureStart = '{';
}
if (lastChar === ']') {
enclosureStart = '[';
}
if (enclosureStart && str.includes(enclosureStart) &&
str.indexOf(slash) > str.indexOf(enclosureStart)) {
str += slash
} This code has some shortcomings, but (without thinking about it too hard) I don't think it has any that aren't already present in the regular expression. It even has some modest improvements, like ensuring that brace styles match. (The regexp will match on |
I rather liked the use of the performance API to check the performance difference, but sure, this works too. There is at least one more ReDoS lurking somewhere still, even with this change. I'll email you the string that causes it. |
@sttk I think this new RegExp is incorrect. It doesn't account for escaped brackets in a string, such as |
I'm wondering if we need to take @Trott's recommendation in #49 (comment) to match the start/end brackets correctly 🤔 |
I think there is no definitive solution for ReDoS and the best way is not to use RegExp for repeatable subexpression. |
@phated About escaped brackets, the original and new RegExp of
|
We received a report of ReDoS vulnerability and a suggestion to fix this.
I've sent this pr based on the report.