Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to Capstone release 4.0.2 with patch #1086

Merged
merged 6 commits into from
Jul 21, 2022

Conversation

richardlford
Copy link
Contributor

This is the latest released version of Capstone and the first Capstone release to support the endbr32 and endbr64 instructions. I was trying to add some new tests for RetDec, but although I can use an option to avoid producing the endbr instructions, the library has them so they end up in the produced binary. Thus if we want to add any new tests (which I do), we need to update to a newer Capstone.

@richardlford
Copy link
Contributor Author

richardlford commented Jul 7, 2022

@PeterMatula, please review this change. This change causes around 50 test failures in the regression tests. I will be examining the changes to see if they are really regressions, or just differences, or possibly improvements. RetDec does build successfully with this Capstone. I had also experimented with some of the Capstone release candidates (5.0-rc1 and 5.0-rc2, as well as the head of the "next" branch, where current development is being done) but retdec had compilation errors using those versions.

@richardlford
Copy link
Contributor Author

@PeterMatula, as an alternative to doing a full upgrade to the 4.0.2 release, I experimented with forking capstone and just cherry-picking the endbr* support onto the version of capstone that RetDec is currently using (See richardlford/capstone@a6ee24f). I then modified RetDec to use that capstone and also added code to treat the endbr* instructions as NOPs (i.e. producing no LLVM). The result passes all of the RetDec regression tests. If you don't want to upgrade to the full 4.0.2 release of capstone you could consider having your own fork of capstone and apply the endbr* support change as I did. Of course, it seems preferable to try to keep up with capstone otherwise RetDec is likely to have continuing problems with newer binaries.

Thanks for your help.

Capstone version 4.0.2 has a bug when disassembling a powerpc instruction
with a signed 16-bit immediate.
See capstone-engine/capstone#1746 and
capstone-engine/capstone#1746 (comment).

This change adds to the capstone patch to fix this problem.
richardlford added a commit to richardlford/retdec-regression-tests that referenced this pull request Jul 17, 2022
Capstone version 4.0.2 has a bug when disassembling a powerpc instruction
with a signed 16-bit immediate.
See capstone-engine/capstone#1746 and
capstone-engine/capstone#1746 (comment).

RetDec PR avast/retdec#1086 moved RetDec up to
version 4.0.2 of Capstone, but adds a patch to fix this bug. The
fix includes not printing the immediates in hex (a later version
of Capstone that fixes this problem no long prints the immediates in hex).
This change fixes a test to agree with the new behavior.
@richardlford
Copy link
Contributor Author

richardlford commented Jul 17, 2022

@PeterMatula, I have pushed another commit that patches a bug in capstone 4.0.2. In addition to fixing the bug, the patch also now always prints the 16-bit immediates in decimal (this is the behavior in a later version of capstone that has this bug fixed). With this patch, all of the RetDec regression tests passed except one that was expecting the immediates to be printed in hex. I have submitted a separate PR (avast/retdec-regression-tests#118) to fix that test, so you will want to merge that PR after this one.

Could you please review it?

Thanks

richardlford added a commit to richardlford/retdec-regression-tests that referenced this pull request Jul 17, 2022
These test binaries contain endbr32 and endbr64 instructions,
so this PR should be committed after RetDec has moved
to Capstone 4.0.2 (avast/retdec#1086).
@richardlford richardlford changed the title Upgrade to Capstone release 4.0.2 Upgrade to Capstone release 4.0.2 with patch Jul 18, 2022
@PeterMatula
Copy link
Collaborator

Thanks for this, I will definitely look at it, as I planned to update Capstone for a long time but didn't get around to it myself. I'm not against updating Capstone even if it causes some tests to fail - we will either fix them if possible or disable them. It is not good to remain in a state of the old Capstone version just for the sake of tests.

Some time ago I also experimented with Capstone v5 and there is also #1059 PR, but so far I didn't manage to get the tests to some reasonable state with the new Capstone. I would prefer to use the latest Capstone possible, but I'm not sure what is the state of Capstone project, if the next will even be merged to master, what version are Capstone users supposed to use, and if it is worth the effort to move to v5 as it requires even more changes than 4.0.2. Do you have any opinion/knowledge on this?

@richardlford
Copy link
Contributor Author

richardlford commented Jul 19, 2022

Some time ago I also experimented with Capstone v5 and there is also #1059 PR, but so far I didn't manage to get the tests to some reasonable state with the new Capstone. I would prefer to use the latest Capstone possible, but I'm not sure what is the state of Capstone project, if the next will even be merged to master, what version are Capstone users supposed to use, and if it is worth the effort to move to v5 as it requires even more changes than 4.0.2. Do you have any opinion/knowledge on this?

@PeterMatula, the 4.0.2 version is the latest released version (i.e. with a github release), though there are tagged v5 release candidates. I tried various newer versions but was getting compile errors building RetDec with them. I think moving to 4.0.2 would be a good interim step toward newer versions. I had the impression that the next branch was going to be the permanent branch for development, perhaps to avoid the name master because of it social implications.

And the 4.0.2 version supports the endbr32 and endbr64 instructions. This is essential to handle some newer binaries. So I'd appreciate it if you would go ahead and merge this.

In studying Capstone's use of the llvm-mc infrastructure, it occurred to me that RetDec, with some effort, might be able to eliminate the use of Capstone and just use LLVM's decoder directly, i.e. it McInsts. But that would require further study. But LLVM is very active and more likely to have up-to-date instructions.

HoundThe and others added 2 commits July 20, 2022 15:52
Calls to dynamically-linked functions go through the procedure linkage
table (PLT).  RetDec turns a PLT entry into a function, say
malloc@plt, that appears to do nothing but call the external function,
say malloc (though the assembly code will do a jump rather than a
call). User code that logically wants to call malloc instead calls
malloc@plt (and sets up arguments as if calling malloc). The
malloc@plt code first jumps to the dynamic linker which modifies it so
that subsequent calls to malloc@plt will jump directly to malloc. We
say that malloc@plt wraps malloc.  The call to malloc in malloc@plt
will not have any arguments setup, so malloc will appear to have
no parameters or returns (unless that information is provided by
link-time-information, debug information, or name demangling), but it
needs to have the same parameter types and return type as
malloc@plt. The propagateWrapped methods copy the argument information
from the DataFlowEntry of the wrapping function to the wrapped
function. Then, when the calls to the wrapping function are inlined
(in connectWrappers), effectively the call to the wrapping function is
changed into a call to the wrapped function.

The motivation for this change is the programs that analyze the
output of RetDec (either the C code, or the LLVM code) want to
recognize library functions and treat them specially. This
change makes it so that the library function names are used
directly (rather than the plt version) and they are passed
their parameters correctly.
As Capstone was updated, the fix in capstone-engine/capstone#968 took effect and the original RetDec fix is not needed - in fact, it caused problems.
@PeterMatula
Copy link
Collaborator

@richardlford thanks a lot for this, you did what I was putting off for a long time. Especially that Capstone patch is not an easy thing to figure out.

I did one more change in 2771f85 as this code was causing problems once the issue in Capstone was fixed.

@PeterMatula PeterMatula merged commit ef27550 into avast:master Jul 21, 2022
PeterMatula pushed a commit to avast/retdec-regression-tests that referenced this pull request Jul 21, 2022
Capstone version 4.0.2 has a bug when disassembling a powerpc instruction
with a signed 16-bit immediate.
See capstone-engine/capstone#1746 and
capstone-engine/capstone#1746 (comment).

RetDec PR avast/retdec#1086 moved RetDec up to
version 4.0.2 of Capstone, but adds a patch to fix this bug. The
fix includes not printing the immediates in hex (a later version
of Capstone that fixes this problem no long prints the immediates in hex).
This change fixes a test to agree with the new behavior.
@PeterMatula
Copy link
Collaborator

Damn, by accident I did Rebase and merge which was not the best option for this PR. Hope you don't mind. I can try to revert it and do it again, but I'm afraid that would only further complicate things.

@richardlford richardlford deleted the capstone-4.0.2 branch July 21, 2022 14:04
PeterMatula pushed a commit to avast/retdec-regression-tests that referenced this pull request Jul 21, 2022
These test binaries contain endbr32 and endbr64 instructions,
so this PR should be committed after RetDec has moved
to Capstone 4.0.2 (avast/retdec#1086).
@richardlford
Copy link
Contributor Author

richardlford commented Jul 21, 2022

Damn, by accident I did Rebase and merge which was not the best option for this PR. Hope you don't mind. I can try to revert it and do it again, but I'm afraid that would only further complicate things.

@PeterMatula, No problem. You can leave it as it is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants