Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: detect trojan source attack #95

Merged

Conversation

simone-sanfratello
Copy link
Contributor

@simone-sanfratello simone-sanfratello commented Nov 9, 2022

Following standard/standard#1742

This PR aims to add the trojan source attack detection to the plugin security, using the plugin originally authored by @lirantal https://github.com/lirantal/anti-trojan-source/

Co-authored-by: Luciano Mammino lucianomammino@gmail.com

@simone-sanfratello simone-sanfratello marked this pull request as ready for review November 9, 2022 09:22
@lmammino
Copy link
Contributor

Hello, any update on this one?

cc: @nlf

@nzakas
Copy link
Contributor

nzakas commented Nov 16, 2022

Sorry, just swamped right now. I need to dig back through the issue in the ESLint repo to see how this compares to what we discussed here: eslint/eslint#15240 (comment)

Copy link
Contributor

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together. I left a bunch of comments.

I also noticed that this only will catch bidi characters in tokens or comments, but they can also occur in the white space of a file, and this rule won't catch those. Can you update the rule to catch those?

// Requirements
//-----------------------------------------------------------------------------

const { hasTrojanSource } = require('anti-trojan-source');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like all this package does is search for particular character codes defined here:
https://github.com/lirantal/anti-trojan-source/blob/main/src/constants.js

Can we just copy those codes over and avoid an extra dependency (which also includes a CLI)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nzakas WDYT about adding this package to devDependencies, so we can keep track of changes more easily (we'll dee them when reviewing new versions) and update our code with new characters if they added them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for context, we might have to update that particular function based on the outcome of the outstanding conversation point below

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichaelDeBoey that doesn't seem necessary. Bidi characters already exist and are known, so I'm not sure why we'd need to keep track of updates anywhere else.

docs/anti-trojan-source.md Outdated Show resolved Hide resolved
rules/detect-trojan-source.js Outdated Show resolved Hide resolved
rules/detect-trojan-source.js Outdated Show resolved Hide resolved
rules/detect-trojan-source.js Outdated Show resolved Hide resolved
rules/detect-trojan-source.js Outdated Show resolved Hide resolved
rules/detect-trojan-source.js Outdated Show resolved Hide resolved
node.tokens.forEach((tokenObject) => {
if (tokenObject.value && hasTrojanSource({ sourceText: tokenObject.value })) {
context.report({
node: node,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to correctly report the location of the character, we need to manually pass in the loc, otherwise every report will have a location of line 1, column 0. See https://eslint.org/docs/latest/developer-guide/working-with-rules#contextreport.

Here's an example:
https://github.com/eslint/eslint/blob/main/lib/rules/no-tabs.js#L59-L72

You should also adds checks into the tests to verify that the location is correct.

Because these characters are invisible, having the correct location is really important.

Copy link
Contributor

@lmammino lmammino Dec 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We spent a bit of time working on this one and we realised that the original detect-trojan-source by @lirantal does not give you granularity on where exactly every single malicious unicode character is. On the contrary it just gives you a boolen telling if the given code token or comment block contains at least one of such characters.

Because of this, the location reporting is a bit vague and it just gives users a reference about the location of the affected token. This can be useful, but also misleading (say for example there's a large code block spanning multiple lines).

We changed the code slightly to report the location of the token (and updated tests accordingly).

We could improve this further by changing the hasTrojanSource function to return a list of malicious unicode characters and their offset in the token.

This would make the scan a bit more expensive though, so we are not too sure whether it is a worthwhile approach or not.

There's a third option that might be a decent tradeoff between accuracy and performance. Rather than returning a boolean hasTrojanSource could return the offset of the first occurrence of a malicious unicode character. This way, we will still have one single error per token, but we could at least point the user to the first occurrence of the hidden character. Once the user fixes that, it should be prompter to the next character (if there's more than one per token).

@nzakas what do you think? Do you have any preference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that third option which is sorta fail fast as a good balance for quickly running through the code, if it's costly. Also, would it make any sense to push these updates to the detect-trojan-source package, or is that at all not used here anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to do a PR with any relevant change to detect-trojan-source back to the original repo (as of now we did not change any code coming from there, just copied verbatim the hasTrojanSource and pasted it into the codebase here).

Thanks a lot for the suggestion of option 3. For now my favourite too :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmammino I think you may be overthinking this a bit. In general, people don't like rules that gradually reveal errors -- they want to see them all up front. If it were me, I would expect warnings about every instance of a bidi character in the source code so I'd know how many issues we are talking about.

A good example to follow is the ESLint core rule no-tabs, which does essentially the same thing, only it's looking for tab characters instead of bidi characters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solved using a similar approach to no-tabs

rules/detect-trojan-source.js Outdated Show resolved Hide resolved
@nzakas
Copy link
Contributor

nzakas commented Dec 6, 2022

Just checking back to see if you intend to continue working on this?

@simone-sanfratello
Copy link
Contributor Author

Just checking back to see if you intend to continue working on this?

Yes, we are working on that - in our spare time

@lmammino
Copy link
Contributor

lmammino commented Dec 7, 2022

Just checking back to see if you intend to continue working on this?

Yes, thanks a lot for all the suggestions. We have allocated some time this week to go through them, so hopefully, we'll have an updated PR by the end of the week.

@lmammino
Copy link
Contributor

lmammino commented Dec 9, 2022

Hey @nzakas we took some time today to apply all the changes you suggested.

First of all, thank you very much for taking the time to do such an in-depth review. I certainly learned a lot thanks to it.

Secondly, we believe this is in a good state now. There's only a single discussion point that might be worth reconsidering and it's related to reporting the unicode character location.

Let us know what you think about that one so we can move this PR forward.

Thanks again

@nzakas
Copy link
Contributor

nzakas commented Dec 13, 2022

Thanks. I’ll take a look sometime this week.

Copy link
Contributor

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for your work on this. I left a note inline, but also pulling out here: I think it would be better to model this rule after no-tabs in the ESLint core. It is basically the same algorithm only it's looking for tab characters instead of bidi characters.

At a higher level, I think a better name for the rule is detect-bidi-characters, because there may a legitimate reason to use these characters that isn't necessarily a trojan source attack. I also think "trojan source" is a bit of an opaque term that requires reading all of the documentation to understand vs. "bidi characters", which is instantly recognizable.

rules/detect-trojan-source.js Outdated Show resolved Hide resolved
@simone-sanfratello
Copy link
Contributor Author

Thanks again for your work on this. I left a note inline, but also pulling out here: I think it would be better to model this rule after no-tabs in the ESLint core. It is basically the same algorithm only it's looking for tab characters instead of bidi characters.

At a higher level, I think a better name for the rule is detect-bidi-characters, because there may a legitimate reason to use these characters that isn't necessarily a trojan source attack. I also think "trojan source" is a bit of an opaque term that requires reading all of the documentation to understand vs. "bidi characters", which is instantly recognizable.

Thanks @nzakas for the feedback, we've implemented in that way.
Please take a look at tests, they show what we want to achieve.

We're going to complete the PR with the other changes (name, jsdoc) once we agree on the solution.

@lmammino
Copy link
Contributor

Hello @nzakas, we believe we applied all the requested changes, including renaming the rule to detect-bidi-characters. Please let us know if you think this is good to go or if it needs more work.

Thanks for all the awesome suggestions so far!

Copy link
Contributor

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great now. Thanks for being so open to feedback.

I just wanted to double check that you don’t have any other cleanup work you wanted to do before merging?

@nzakas
Copy link
Contributor

nzakas commented Dec 17, 2022

Please double check the lint errors.

@lmammino
Copy link
Contributor

Thanks @nzakas, we believe this is good to go.

PS: just fixed the markdown linting issues as well

@nzakas
Copy link
Contributor

nzakas commented Dec 30, 2022

Sorry, was away for the holidays. The linting CI is still broken. Can you take a look>

Copy link
Contributor

@lirantal lirantal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added suggestions to comply with CI linting issues

docs/rules/detect-bidi-characters.md Outdated Show resolved Hide resolved
docs/rules/detect-bidi-characters.md Outdated Show resolved Hide resolved
docs/rules/detect-bidi-characters.md Outdated Show resolved Hide resolved
docs/rules/detect-bidi-characters.md Outdated Show resolved Hide resolved
docs/rules/detect-bidi-characters.md Outdated Show resolved Hide resolved
docs/rules/detect-bidi-characters.md Outdated Show resolved Hide resolved
Co-authored-by: Liran Tal <liran.tal@gmail.com>
Copy link
Contributor

@nzakas nzakas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, this is good to go. Thanks so much for your hard work on this.

@nzakas nzakas merged commit 4294d29 into eslint-community:main Jan 2, 2023
@lmammino lmammino deleted the feat/anti-trojan-charset branch January 3, 2023 08:01
@lmammino
Copy link
Contributor

lmammino commented Jan 3, 2023

Thank you @nzakas, @simone-sanfratello and @lirantal 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants