-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
-r Recursive option #129
-r Recursive option #129
Conversation
pyt/__main__.py
Outdated
excluded_files = args.excluded_paths | ||
test = discover_files(directory_path, excluded_files) | ||
|
||
print(test) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to see if it works
Hi @omergunal, thanks for the soup! :D Can you merge master into this branch and then resolve the merge conflicts? 👍 |
No problem :) , ok i will do it |
in usage.py "filepath" is must required. should we do optional? because if we use "-r" we do not need it |
Can you also add parser.add_argument(
'targets', metavar='targets', type=str, nargs='*',
help='source file(s) or directory(s) to be tested'
) |
Let's do this https://github.com/PyCQA/bandit/blob/master/bandit/cli/main.py#L153-L160 and then remove |
I realize I'm totally changing my mind from what I said before, "This will enable a user to just give -r /path/to/files instead of -f file one at a time." but this seems cleaner. |
i.e. |
You mean, we will delete "-f" option and use "-r" for both single file scan and directory scan.And if user will not use any parameter, it will scan one file. Is that correct? |
pyt/usage.py
Outdated
action='store', | ||
default='', | ||
help='Separate files with commas' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
De-dent )
pyt/usage.py
Outdated
@@ -91,7 +91,18 @@ def _add_optional_group(parser): | |||
action='store_true', | |||
help='do not skip lines with # nosec comments' | |||
) | |||
|
|||
optional_group.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make it
parser.add_argument(
'-r', '--recursive', dest='recursive',
action='store_true', help='find and process files in subdirectories'
)
pyt/__main__.py
Outdated
@@ -30,6 +30,19 @@ | |||
) | |||
|
|||
|
|||
def discover_files(directory_path, excluded_files): | |||
file_list = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We mostly use list()
everywhere in assignments in the codebase, just for consistency.
pyt/__main__.py
Outdated
if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list: | ||
file_list.append(fullpath) | ||
|
||
return(file_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: just for consistency, you can do return included_files
(and I guess rename files
to included_files
)
re:"You mean, we will delete Delete def discover_files(targets, excluded_files, recursion):
included_files = list()
excluded_list = excluded_files.split(",")
for target in targets:
if target.endswith('.py'):
included_files.append(target)
else:
for root, dirs, files in os.walk(target):
for f in files:
fullpath = os.path.join(root, f)
if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list:
included_files.append(fullpath)
if not recursion:
break
return included_files |
(just updated the code, should be better now.) |
it seems good about returning "included_list". also "-x " parameter is available. |
pyt/__main__.py
Outdated
@@ -39,6 +57,14 @@ def main(command_line_args=sys.argv[1:]): | |||
elif args.trim_reassigned_in: | |||
ui_mode = UImode.TRIM | |||
|
|||
|
|||
|
|||
targets = args.targets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be more DRY if you did
files = discover_files(
args.targets,
args.excluded_paths,
args.recursive
)
pyt/usage.py
Outdated
default='', | ||
help='Separate files with commas' | ||
) | ||
optional_group.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess targets will be part of _add_required_group
b/c it's replacing -f files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there, looking good :D
pyt/__main__.py
Outdated
directory = os.path.dirname(path) | ||
project_modules = get_modules(directory) | ||
local_modules = get_directory_modules(directory) | ||
for path in files: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this for
loop, you can have a vulnerabilities = list()
, and then do
vulnerabilities.append(find_vulnerabilities(
cfg_list,
ui_mode,
args.blackbox_mapping_file,
args.trigger_word_file,
nosec_lines
))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it find vulnerabilities in all the files, or just the last file (i.e. last iteration of the loop)? If I'm reading it write I think it might do e.g. vulnerabilites = kitmap_vulns
...then finally vulnerabilities = a.py_vulns
, and only report the last list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(As an aside, it seems strange it's not printing out the vulnerability info, and just seems to print the object.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it taking last item on the list. have you idea for fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix is having a list outside of the for loop and adding the vulnerabilities of each file to it. The code from my first comment should do it, although now that I think about it it's probably extend and not append.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^Ahh, this was it! 👍 Just change append
to extend
and it'll all work! :D
pyt/__main__.py
Outdated
) | ||
|
||
if args.baseline: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can de-dent this, if args.baseline:
as only one call to get_vulnerabilities_not_in_baseline
, with the vulnerabilities of every file will work.
pyt/__main__.py
Outdated
) | ||
vulnerabilities = list() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, That's odd, I'll checkout/test your code later today to try and find the issue. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really weird, I'll look more in-depth on Monday 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked out your branch and it was append
vs. extend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super close, just the de-dent and append vs. extend and I think that's mostly it :)
pyt/__main__.py
Outdated
|
||
if args.baseline: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can keep this, but just de-dent it so that we only trim once.
So I looked at the tests that were failing and the You can do e.g. class MainTest(BaseTestCase):
+ @mock.patch('pyt.__main__.discover_files')
@mock.patch('pyt.__main__.parse_args')
@mock.patch('pyt.__main__.find_vulnerabilities')
@mock.patch('pyt.__main__.text')
- def test_text_output(self, mock_text, mock_find_vulnerabilities, mock_parse_args):
+ def test_text_output(self, mock_text, mock_find_vulnerabilities, mock_parse_args, mock_discover_files):
mock_find_vulnerabilities.return_value = 'stuff'
example_file = 'examples/vulnerable_code/inter_command_injection.py'
output_file = 'mocked_outfile'
+ mock_discover_files.return_value = [example_file]
mock_parse_args.return_value = mock.Mock(
autospec=True,
- filepath=example_file,
project_root=None,
baseline=None,
json=None, and the same for the other tests. This makes it so that in the tests, we don't really ever call |
) | ||
initialize_constraint_table(cfg_list) | ||
analyse(cfg_list) | ||
vulnerabilities.extend(find_vulnerabilities( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look good to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there are no vulnerability in a.py b.py and c.py but it printing from xss.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't figure out the bug yet, gonna look more tomorrow 😁 This is harder than expected to track down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, i fixed it. its about vulnerabilities = list()
location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write tests for discover_files
when you get a chance? 👍
pyt/__main__.py
Outdated
included_files.append(fullpath) | ||
else: | ||
if target not in excluded_list: | ||
included_files.append(targets[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if targets is a list of files, e.g. python -m pyt examples/vulnerable_code/command_injection.py examples/vulnerable_code/XSS.py
, then discover_files
will return the first file N times. (Where N is the len of targets.)
pyt/__main__.py
Outdated
|
||
for target in targets: | ||
if os.path.isdir(target): | ||
if recursive: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So having if recursive:
here it will make it so that if you don't have -r
then you won't search directories.
You can change it to:
def discover_files(targets, excluded_files, recursive=False):
included_files = list()
excluded_list = excluded_files.split(",")
for target in targets:
if os.path.isdir(target):
for root, dirs, files in os.walk(target):
for f in files:
fullpath = os.path.join(root, f)
if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list:
included_files.append(fullpath)
if not recursive:
break
else:
if target not in excluded_list:
included_files.append(target)
return included_files
pyt/__main__.py
Outdated
args.recursive | ||
) | ||
for path in files: | ||
vulnerabilities = list() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the bug I found yesterday, or more accurately the thing I don't understand 😕 , is that find_vulnerabilities
returns the vulnerabilities for all the files previously analyzed, as if find_vulnerabilities
knows all the vulnerabilities found for the other files we've looked at, how does it know this? 😱
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So as far as the PR, you can change it to what you had originally, i.e. vulnerabilities = find_vulnerabilities(..)
, sorry I misunderstood the code, I'll still look into the reason why the code does this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, figured it out, so I knew constraint_table
etc. were global variables that keep state, and that they could be the culprit if they were used weirdly, however it is due to FrameworkAdaptor https://github.com/python-security/pyt/blob/master/pyt/web_frameworks/framework_adaptor.py#L88 adding all the past CFGs to the list :) I'll add a comment to __main__.py
about it after we merge this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there :)
pyt/__main__.py
Outdated
if os.path.isdir(target): | ||
for root, dirs, files in os.walk(target): | ||
for f in files: | ||
if not recursive: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's important for the if not recursive:
to be after the
fullpath = os.path.join(root, f)
if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list:
included_files.append(fullpath)
this is so that we only iterate through the for f in files:
once. i.e. just one-level of depth and not recursively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, i did it
pyt/__main__.py
Outdated
vulnerabilities, | ||
args.baseline | ||
) | ||
vulnerabilities = get_vulnerabilities_not_in_baseline( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for de-denting the if args.baseline:
You can de-dent the vulnerabilities = get_vulnerabilities_not_in_baseline(
too, 👍
pyt/__main__.py
Outdated
|
||
for target in targets: | ||
if os.path.isdir(target): | ||
for root, dirs, files in os.walk(target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line, for root, dirs, files in os.walk(target):
is indented one more level than it has to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i will try to do better for returning "included_files"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just make the tests pass and I'll merge 👍 (Feel free to write tests for discover_files
if you'd like to though.)
pyt/__main__.py
Outdated
excluded_list = excluded_files.split(",") | ||
for target in targets: | ||
if os.path.isdir(target): | ||
for root, dirs, files in os.walk(target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You can de-dent from line 38
to line 44
pyt/__main__.py
Outdated
|
||
if args.baseline: | ||
vulnerabilities = get_vulnerabilities_not_in_baseline( | ||
vulnerabilities, | ||
args.baseline | ||
) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You can delete this newline
and its done. i will write test for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So happy to merge 🎊 🎈 🎉 🎂
…ll versions, add travis commands to tox so this does not happen again
Issue: #127
There is a few steps for completing this PR. Now we can get all ".py" files in directory and exclude some files with "-x" option.