Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix ReDoS vulnerability CVE-2021-35065 #49

Merged
merged 4 commits into from
Jul 20, 2021

Conversation

sttk
Copy link
Contributor

@sttk sttk commented Jun 24, 2021

We received a report of ReDoS vulnerability and a suggestion to fix this.
I've sent this pr based on the report.

@sttk
Copy link
Contributor Author

sttk commented Jun 24, 2021

This is the performance test code from the reporter.

const { performance } = require("perf_hooks")
const globParent = require("glob-parent")

const measure = fn => {
  let s = performance.now();
  fn();
  return performance.now() - s;
};

for (const n of [5_000, 10_000, 20_000, 40_000, 80_000, 160_000]) {
  const input = "{" + "/".repeat(n);
  const time = measure(() => globParent(input));

  // if the runtime is O(n^2), then this ration will be roughly constant
  const ration = time / (n * n);

  console.log(`${n}:\t${time}ms\t${ration.toExponential(3)}`);
}

@sttk
Copy link
Contributor Author

sttk commented Jun 24, 2021

And the followings are the results on my machine.

Before:

% node -v
v14.15.0
% npm -v
6.14.8
% node perf_test.js 
5000: 32.4567289352417ms  1.298e-6
10000:  103.04731607437134ms  1.030e-6
20000:  382.97598814964294ms  9.574e-7
40000:  1525.007022857666ms 9.531e-7
80000:  6095.382986068726ms 9.524e-7
160000: 24153.12740111351ms 9.435e-7
%

After:

% node -v
v14.15.0
% npm -v
6.14.8
% node perf_test.js 
5000: 0.7545621395111084ms  3.018e-8
10000:  5.009187936782837ms 5.009e-8
20000:  2.1224639415740967ms  5.306e-9
40000:  0.5203690528869629ms  3.252e-10
80000:  1.0246939659118652ms  1.601e-10
160000: 2.1865968704223633ms  8.541e-11

@phated
Copy link
Member

phated commented Jun 24, 2021

Let's add a test! cc @RunDevelopment @Trott

@sttk
Copy link
Contributor Author

sttk commented Jun 24, 2021

@phated I've added a test case for this!

@Trott
Copy link
Contributor

Trott commented Jun 24, 2021

Oof, looks like I introduced this problem in 32f6d52. Sorry about that.

This fix looks like an improvement to me, and the test results certainly seem to bear that out. But I don't know offhand if it is a partial fix or a complete fix. I'll probably go re-educate myself (yet again) on ReDoS but in the meantime, I wouldn't mind pulling in someone who actually knows about this stuff in detail, if he doesn't mind me tapping him on the shoulder for something that's probably sufficiently mundane that I should be able to figure it out myself. /ping @davisjam Does this change look like a complete fix for the ReDoS here or only a partial one?

index.js Outdated
@@ -6,7 +6,7 @@ var isWin32 = require('os').platform() === 'win32';

var slash = '/';
var backslash = /\\/g;
var enclosure = /[{[].*\/.*[}\]]$/;
var enclosure = /[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this suffice (and be a fair bit easier to read as well)?

Suggested change
var enclosure = /[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/;
var enclosure = /[{[][^/]*\/.*[}\]]$/;

If not, we should probably add a test case that shows it not working.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Trott The direct solution of this issue is the way as you suggested. But the reporter suggested this solution and we agree with it because we can't accept newlines in a filepath.

Copy link
Contributor

@Trott Trott Jun 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the decision is "we won't accept newlines in a path", then OK. But if the belief is "newlines cannot legitimately appear in a path", then I do not believe that is correct. See https://superuser.com/a/129532/81159. Or try this (which is basically the same thing as in that linked answer):

$ touch 'foo
quote> bar'
$ ls -lbd foo*
-rw-r--r--  1 trott  staff  0 Jun 24 07:04 foo\nbar
$ 

(The quote> thing is zsh console output. Here's basically the same thing in Bourne shell instead.)

sh-3.2$ touch 'foo
> bar'
sh-3.2$ ls foo*
foo?bar
sh-3.2$ ls -ldb foo*
-rw-r--r--  1 trott  staff  0 Jun 24 07:08 foo\nbar
sh-3.2$

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a onetime file system tester, I'm with @Trott here: newlines are valid path elements in many file systems.

Copy link

@davisjam davisjam Jun 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regex is much improved in its performance. However. I did not check the context in which it is used, but if one of the "match a substring within this string" APIs is used then the performance can still be problematic.

Try for example:

let match = '{'.repeat(10*10*10*10*5);

console.log('matching');
var res = /[[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/.exec(match);
console.log('done matching')
console.log(res)

(This is slow in a Node v12 REPL I found on the web).

The slow performance is because:

  1. The "match anywhere" will basically try starting from each of n open-braces
  2. There is a reachable quantifier that also matches a { (the first *'d group).
  3. So the regex engine will match O(n) of them before rejecting each time -- that makes for O(n^2) behavior.

So, food for thought:

  1. How long are the strings that this regex processes? (In the example above, I'm using a string with 50K characters in it. For shorter strings, the performance is "not great", but perhaps not catastrophic).
    2.Are there any additional constraints on the input we could add in? (e.g. enforcing a maximum length, or additional properties that the regex can check) I think there is a possibly-illegible refactoring that would do the trick, but it would be nicer if there is a more maintainable option.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the project again and it seems like the regex /(^|[^\\])([{[]|\([^)]+$)/ has the same problem for all inputs "(".repeat(n).

Whatever solution is chosen for /[[{[][^/\r\n\u2028\u2029]*\/.*[}\]]$/ should also be applied to /(^|[^\\])([{[]|\([^)]+$)/.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine that ideally, this wouldn't be doing regex matching at all but would instead use some module that does proper parsing for globs. Although I guess there really isn't a single globbing standard to adhere to. Anyway, if no one's looked yet, it might be worth looking at what's out there (to make sure they just don't do regexp matching under the hood).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option is to use a linear-time regex engine. For example, you could add re2 as a (somewhat heavy, alas) dependency. It supports all of the features used in these regexes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Please @ me if you want further opinions)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RunDevelopment I think that regex has no problem.

% cat test.js
var performance = require('perf_hooks').performance;
var globby = /(^|[^\\])([{[]|\([^)]+$)/;
var st = performance.now();
globby.test("{" + "/".repeat(500000));
var ed = performance.now();                                                     
console.log(ed - st);
% node test.js
0.7820440530776978
%

@Trott
Copy link
Contributor

Trott commented Jun 24, 2021

OK, brushed up a bit and this looks like a complete fix although this is based on a few minutes of reading and I really wouldn't mind a more knowledgable eye on this.

@davisjam
Copy link

davisjam commented Jun 24, 2021

I am examining.

Edit: After coffee.

@@ -224,6 +226,26 @@ describe('glob2base test patterns', function () {

done();
});

it('should not increase calc. time exponentially by \'/\' count [CVE-2021-35065]', function (done) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment says "exponential". This regex will exhibit polynomial growth, not exponential growth.

Copy link
Contributor Author

@sttk sttk Jun 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I'll modify it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this case and add a new case to check absolute time.


[50000, 500000].forEach(function(n) {
var result1 = measure(n);
expect(result1 / result0).toBeLessThan(0.9);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly you should also check for a "reasonable" absolute matching time, instead of just checking for a low-enough relative increase. For example, you could also assert that the 500K string finishes in under 1 second.

Copy link
Contributor Author

@sttk sttk Jun 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I'll add that assertion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this case and add a new case to check absolute time.

@davisjam
Copy link

I left comments @Trott @sttk

@phated
Copy link
Member

phated commented Jun 26, 2021

@sttk I believe we should support newlines as are allowed within paths. I wasn't aware they could exist in a path!

@sttk
Copy link
Contributor Author

sttk commented Jun 26, 2021

@phated I was just about to ask you it. I'm also considering about the issue from @RunDevelopment above. '{'.repeat(n) and '['.repeat(n).

@sttk
Copy link
Contributor Author

sttk commented Jun 26, 2021

Mmm, /[{[][^/]*\/.*[}\]]$/.test('{'.repeat(500_000)) also takes a lot of time...

--
This is already mentioned in this comment.

@sttk
Copy link
Contributor Author

sttk commented Jun 26, 2021

I've modified the codes.

  • Modify the target regex to accept newlines and to avoid ReDoS by '{'.repeat(n) too.
  • Remove a test case to check relative time.
  • Add new test cases to check absolute time for '{' + '/'.repeat(n), '{'.repeat(n), and '('.repeat(n).

@Trott
Copy link
Contributor

Trott commented Jun 26, 2021

It might be worth considering getting rid of the regular expression altogether and replacing it with rudimentary parsing code like this:

 var lastChar = str.slice(-1);
 var enclosureStart;
 if (lastChar === '}') {
   enclosureStart = '{';
 }

 if (lastChar === ']') {
   enclosureStart = '[';
 }

 if (enclosureStart && str.includes(enclosureStart) &&
     str.indexOf(slash) > str.indexOf(enclosureStart)) {
       str += slash
 }

This code has some shortcomings, but (without thinking about it too hard) I don't think it has any that aren't already present in the regular expression. It even has some modest improvements, like ensuring that brace styles match. (The regexp will match on [/} for example, but this will not.)

@Trott
Copy link
Contributor

Trott commented Jun 26, 2021

I rather liked the use of the performance API to check the performance difference, but sure, this works too.

There is at least one more ReDoS lurking somewhere still, even with this change. I'll email you the string that causes it.

@phated
Copy link
Member

phated commented Jun 29, 2021

@sttk I think this new RegExp is incorrect. It doesn't account for escaped brackets in a string, such as "{\{abc\/}"

@phated
Copy link
Member

phated commented Jun 29, 2021

I'm wondering if we need to take @Trott's recommendation in #49 (comment) to match the start/end brackets correctly 🤔

@sttk
Copy link
Contributor Author

sttk commented Jul 3, 2021

I think there is no definitive solution for ReDoS and the best way is not to use RegExp for repeatable subexpression.
I agree to use @Trott's recommendation above.

@sttk
Copy link
Contributor Author

sttk commented Jul 3, 2021

@phated About escaped brackets, the original and new RegExp of enclosure return same result for your example.
Please tell me the exact specification for enclosure.

$ node
> var enclosureOrig = /[{[].*\/.*[}\]]$/;
undefined
> enclosureOrig.test("{\{abc\/}")
true
> var enclosureNew = /[{[][^/{[]*\/.*[}\]]$/;
undefined
> enclosureNew.test("{\{abc\/}")
true
>

@sttk
Copy link
Contributor Author

sttk commented Jul 3, 2021

I've modified the codes with reference to @Trott's code above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants