Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Convert Bash scripts to Python #147 #338

Merged
merged 48 commits into from
Jul 12, 2018

Conversation

TheCodez
Copy link
Contributor

@TheCodez TheCodez commented Jun 11, 2018

I started the process of converting the Bash Scripts found in scripts to bash(#147)

All files compile without a problem, but there could be some runtime problems that I haven't found or tested yet.

The scripts should behave like their bash counterpart, as they are more or less a 1 to 1 translation that started off by using the Bash2Py tool and then fixing errors and applying some Python concepts

@TheCodez
Copy link
Contributor Author

What's the right way to run python 3 code? Running the install-share script with python3 seems to only work on Linux.

@silverbacknet
Copy link

Both are python.exe on Windows, and with a default install, that will get you the last one installed. The standard windows way is "py -2" or "py -3" (a shortcut which resides in the windows folder) when people haven't manually renamed it.

@silverbacknet
Copy link

silverbacknet commented Jun 16, 2018

Still not quite working, but a lot closer. Fixed most of the bugs, now I need to dig in and see why the backend isn't working right for me. The .log is actually a .diff, but you can't upload those here. If I keep working on this I'll just clone and submit a pull request.

@silverbacknet
Copy link

silverbacknet commented Jun 16, 2018

Reuploading to remove a debugging line I left in.
retdec-py2.log
(reupload again with minor changes)

@TheCodez
Copy link
Contributor Author

TheCodez commented Jun 16, 2018

@silverbacknet Thanks for the help. Some of your fixes I had already fixed locally. But I will integrate the rest of your changes as soon as possible.

@silverbacknet
Copy link

That's always the tough part about having a public repo. But I'm stoked it's getting there, I'll be so happy when MSYS isn't needed anymore!

Small fixes and cleanup
Early out if an error occurs
@TheCodez
Copy link
Contributor Author

So for some reason the generated json config file is a lot shorter than the one generated by the bash script. As a result, the backend.ll file is a lot shorter too, which should be the reason why llvmir2hll doesn't do anything.
I really don't know whats the cause for this, because all command calls seem correct.

@TheCodez TheCodez changed the title [WIP] Convert most Bash scripts to Python #147 [WIP] Convert Bash scripts to Python #147 Jun 17, 2018
Copy link
Contributor

@Maroc-OS Maroc-OS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you forgot about adding ".py" files.


for file in os.listdir(path):
file_name = os.path.basename(file)
if file_name.startswith('retdec-tests-') and not file.endswith('.sh'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you forgot about adding ".py" files
if file_name.startswith('retdec-tests-') and not file.endswith('.sh') and not file.endswith('.py'):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the help however I'm not quite sure if the code I wrote is even equivalent to what the original bash code intended to do:
find "$1" -name "retdec-tests-*" -type f $EXECUTABLE_FLAG | grep -v '\.sh$' | sort

Sadly there are also some other problems that prevent retdec_test_runner from working, because I'm currently running the executables incorrectly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes because you added python files to the folder we should skip them when searching for tests.

@Maroc-OS
Copy link
Contributor

Maroc-OS commented Jul 3, 2018

i have used this and seems okay to me, i did not compiled the tests but tested manually putting some test files on macOS and it runs.

@TheCodez
Copy link
Contributor Author

Current status and things that need to be done:

  • update CIs to use the scripts, needs Python upgrade to atleast 3.5

  • retdec_archive_decompiler.py needs testing

  • retdec_decompiler.py worked in all my test cases, might need more testing. In some cases better than the bash script? (Windown 10: retdec-decompiler.sh - strange behavior #350)

  • retdec_unpacker.py finished. Is used successfully in decompiler script

  • retdec_fileinfo.py works for simple files. Haven't tested with archives

  • retdec_tests_runner.py doesn't really work, needs some work to be able to test in CI

  • cmake/install-share.py is finished and working.

  • yara_patterns/tools/compile-yara.py should work? Haven't tested.

  • type_extractor/gen_cstdlib_and_linux_jsons.py is far from being finished.

  • type_extractor/gen_windows_and_windrivers_jsons.py also needs help to finish.

@PeterMatula
Copy link
Collaborator

First of all, thanks, great work. 👍

I'm just looking into this and trying to run regression tests. The latest run (I hacked retdec-regression-tests-framework to use python implementation):
FAIL (failed 732 out of 5103 tests)
which is not too bad. However I don't think we test all the command line options and it will be very tricky to make sure every single option behave like it used to. But no matter, we can solve these problems later if somebody comes across them and creates an issue.

607 of those fails are related to retdec_fileinfo.py. The original script forwarded all the unknown arguments to retdec-fileinfo which it wraps. The new script does not. I'm currently rewriting it.

One question related to that: How/why did the shell=True get into this line: subprocess.call([config.FILEINFO, *fileinfo_params], shell=True)? It seems to me it is not working, and is not actually needed. Grep shows this happens on several places.

More important thing I'm thinking about is how to best collaborate on this. Other repositories (retdec-regression-tests-framework, retdec-idaplugin, ???) need to be modified. Should I (can I?) commit to this pull request, or should I merge this into a dedicated branch (e.g. bash-to-python) and continue the work on my own? I guess it depends if you (others) would like to continue working on this, or I should take over and finish it as soon as possible. It is hard to say how much work is left. It looks promising to me, but who knows what we come accross.

@TheCodez
Copy link
Contributor Author

The original script forwarded all the unknown arguments to retdec-fileinfo which it wraps.

I noticed that in the beginning, but somehow lost track of it and implemented it without it.

How/why did the shell=True get into this line: subprocess.call([config.FILEINFO, *fileinfo_params], shell=True)

That's still a relict from using Bash2Py. It uses the parameter for every external programm call. So I left it like this, as it seemed to work for me. The decompiler script uses the CmdRunner class, so this problem only exists in some of the other scripts.

Should I (can I?) commit to this pull request, or should I merge this into a dedicated branch (e.g. bash-> to-python) and continue the work on my own? I guess it depends if you (others) would like to
continue working on this, or I should take over and finish it as soon as possible.

I'm fine with both. The latter however seems like the better option, as I don't have that much time the next few weeks. Also I think what's still left converting needs to be done by someone who has a better understanding of bash. If I (or others) would like to do some changes, we can still apply a pull request to the dedicated branch.

@PeterMatula PeterMatula changed the base branch from master to bash-to-python July 12, 2018 14:34
@PeterMatula PeterMatula merged commit 4de4f49 into avast:bash-to-python Jul 12, 2018
@PeterMatula
Copy link
Collaborator

PeterMatula commented Jul 12, 2018

If anyone would like to play with this, I did the following:

  • Merged this into branch bash-to-python in the retdec repository.
  • Created branch bash-to-python in the retdec-regression-tests-framework repository. Tests in this branch are using Python implementation - cca. 500 tests are failing at the moment.
  • Created branch bash-to-python in the retdec-regression-tests repository. external.unit-tests.UnitTests needed to be changed to use Python script instead of Bash.
  • Created branch bash-to-python in the retdec-idaplugin repository. This will also need to be updated to use Python implementation.

I will try to finish this as soon as possible.

@Maroc-OS
Copy link
Contributor

Hello, i may help with this later am currently using it also as base because bash on macOS is not working.

retdec_decompiler.py does a great job but fail for selective decompilation. thats easy to be done

@TheCodez
Copy link
Contributor Author

TheCodez commented Jul 12, 2018

@Maroc-OS this can easily be fixed.
Replace the code starting at line 405:

if int(ranges[0]) > int(ranges[1]):

with:

start_range = int(ranges[0], 16 if ranges[0].startswith('0x') else 10)
end_range = int(ranges[1], 16 if ranges[1].startswith('0x') else 10)
if start_range > end_range:

This however only fixes the runtime error. The bash scripts produces different code, which means that the ranges are passed incorrectly to the underlying program.

[EDIT]
Fixed in #354

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants