Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update regex to prevent ReDOS and add scanner #777

Closed
wants to merge 5 commits into from

Conversation

tjenkinson
Copy link

@tjenkinson tjenkinson commented Apr 13, 2024

Hey I noticed some regex in this tool and thought I'd scan it with my tool to see if there were any potential issues and it found some. So this PR fixes the vulnerable patterns and adds a check that runs on CI using eslint-plugin-redos-detector.

I initially thought on opening this as a security ticket, but given the main use case is running executables anyway and not random code it doesn't really feel like a vulnerability. Still makes sense to address though?

  • Tests pass
  • Appropriate changes to README are included in PR
  • Types updated

Comment on lines +195 to +196
const jsCodeBlock = /^(```+|~~~+)(?:js|javascript)$/
const shCodeBlock = /^(```+|~~~+)(?:sh|bash)$/
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought it made sense to use a none capturing group here given we aren't using the value

@@ -192,9 +192,9 @@ function transformMarkdown(buf: Buffer) {
let state = 'root'
let codeBlockEnd = ''
let prevLineIsEmpty = true
const jsCodeBlock = /^(```+|~~~+)(js|javascript)$/
const shCodeBlock = /^(```+|~~~+)(sh|bash)$/
const otherCodeBlock = /^(```+|~~~+)(.*)$/
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that . won't match \n and a few other characters without the s flag.

Try running

/^(```+|~~~+)(.*)$/.test('```' + '`'.repeat(1000000) + '\n')

in the console and it will lock up.

So the solution could be add the s flag, but actually given we weren't using the last group I think this makes more sense

@@ -90,9 +90,8 @@ const builtins = new Set([
'zlib',
])

const nameRe =
/^(?<name>(@[a-z0-9-~][a-z0-9-._~]*\/)?[a-z0-9-~][a-z0-9-._~]*)\/?.*$/i
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my tool doesn't support named groups (yet), so I switched it to a standard one, and then the issue was the .* at the end overlapping with the rest.

The simplest solution looks like removing that and the end anchor.

@@ -90,9 +90,8 @@ const builtins = new Set([
'zlib',
])

const nameRe =
/^(?<name>(@[a-z0-9-~][a-z0-9-._~]*\/)?[a-z0-9-~][a-z0-9-._~]*)\/?.*$/i
const versionRe = /^@(?<version>[~^]?(v?[\dx*]+([-.][\d*a-z-]+)*))/i
Copy link
Author

@tjenkinson tjenkinson Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix here makes the regex less precise, but I don't think that's a problem? I think it should be good enough to catch false positives?

The issue was in the second part.

Alternatively maybe we should use https://github.com/npm/node-semver to check if it's a valid version, but that would mean adding a dependency and not sure if it would be worth it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in the .gitignore but looks like it was accidentally checked in

@tjenkinson tjenkinson marked this pull request as ready for review April 13, 2024 10:03
@antongolub
Copy link
Collaborator

antongolub commented Apr 13, 2024

Not sure about the profit. I mean, the highlighted case is hypothetically possible, but what about the practice? For example, semver-regex should provide redos detection, it may accept an uncontrollable input, etc. But zx literally invokes bash commands, so regex performance is much less of a problem than $`rm -rf *` :-)

@tjenkinson
Copy link
Author

Not sure about the profit. I mean, the highlighted case is hypothetically possible, but what about practice? For example, semver-regex should provide redos detection, it may accept an uncontrollable input, etc. But zx literally invokes bash commands, so regex performance is much less of a problem than $`rm -rf *` :-)

yeh, agree, like I mentioned above it's not really a vulnerability here given the purpose of the project anyway. But I think a pattern that's not susceptible to a ReDOS and therefore has slightly better performance is still better than one that does 🤷.

This also simplifies the patterns a bit

"parser": "@typescript-eslint/parser",
"parserOptions": { "ecmaVersion": "latest" },
"rules": {
"redos-detector/no-unsafe-regex": ["error"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the advantages of this plugin compared, for example, with eslint-plugin-redos?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They take different approaches. With mine the goal was to never report a pattern as safe when it's not, so it may have false positives

Not sure if they have the same goal. With a fuzzing approach I think there is the potential to miss things but looks like sometimes it also uses an automation which maybe can't miss stuff

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for what it's worth, looks like that one doesn't handle back references as well

/(a)\1*$/

is a false positive with that one

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tjenkinson I'm the author of eslint-plugin-redos, and, I found this comment luckily ;)

/(a)\1*$/ seems a vulnerable regex to me and my detector also reports it as vulnerable, so what is it problem?
(If you feel that discussion here is not beneficial, please email to me or open a new issue to makenowjust-labs/recheck instead.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @makenowjust !

/(a)\1*$/ is essentially /aa*$/ which is not vulnerable?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see maybe it is without the start anchor 🤔

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. /(a)\1*$/ is essentially /^.*aa*$/s, so it is vulnerable.

@tjenkinson
Copy link
Author

This run also failed but the job passed overall 🤔

https://github.com/google/zx/actions/runs/8672721202/job/23788065046

# tests 78
# pass 73
# fail 5

@tjenkinson
Copy link
Author

Might need to handle the unhandled rejection or add the cli flag for older node

https://nodejs.org/api/process.html#process_event_unhandledrejection

@antonmedv
Copy link
Collaborator

I don’t see why this is needed at this point. We are running bash scripts provided by user. So the code we run should already be trusted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants