Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

capstone2llvmir/x86: Add support of FPU instructions. #643

Merged
merged 12 commits into from
Sep 18, 2019

Conversation

JurajHolub
Copy link
Contributor

Implement all x87 instruction which is possible to represent by LLVM. Also, create unit tests for new instructions.

@PeterMatula PeterMatula self-requested a review September 11, 2019 11:28
Copy link
Collaborator

@PeterMatula PeterMatula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok, but fix:

  1. indentation in unit tests
  2. add comments to some unit tests (some are ok)

Pull changes, before editing, because I pushed 2 commits fixing newlines and indent in some sources.

{X86_REG_ST5, 3.14}, // st(0)
{X86_REG_ST6, 15.7}, // st(1)
{X86_REG_CF, true},
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix indents to match other indents in this file - here, and in all the other code you added - pretty much all setRegisters(), EXPECT_JUST_REGISTERS_STORED(), etc. are indented too much.
This wile uses tab indentation.

tests/capstone2llvmir/x86_tests.cpp Show resolved Hide resolved
@PeterMatula
Copy link
Collaborator

Looks ok, lets Run TeamCity builds.

@PeterMatula
Copy link
Collaborator

Well, some regression tests in TeamCity builds failed, but this is due to changes in regression tests suite and RetDec master branch - these repos changed, but changes were not merged to this PR. So this is probably nothing to worry about.

PeterMatula
PeterMatula previously approved these changes Sep 12, 2019
@PeterMatula
Copy link
Collaborator

This PR will solve #394.

@PeterMatula PeterMatula dismissed their stale review September 12, 2019 09:34

more things to do

@PeterMatula
Copy link
Collaborator

Looking at #394 ...

These instructions were implemented:

F2XM1
FCMOVcc
FFREE
FISTTP
FPREM
FPREM1
FNSTSW
FYL2X
FYL2XP1

Some are still not implemented - not all of them have to be, but it would be best to solve this once and for all, and if something is intentionally not translated, make sure there is an explanation and maybe even unit tests. So lets discuss them here:

  • FBLD:
    • Example: capstone-dumper -a x86 -t "fbld [0x1234]"
    • Current translation: retdec-capstone2llvmir -a x86 -t "fbld [0x1234]"
    %0 = load i80, i80* inttoptr (i32 4660 to i80*)
    call void @__asm_fbld(i80 %0)
    • This is changing FPU stack, so we should implement it as such.
    • I don't know if we are able to implement the binary conversion in a reasonably simple way - unless there is some LLVM intrinsic probably not. But we can combine pseudo ASM function call with FPU stack manipulation to get a translation that correctly modifies FPU stack:
    ; either pseudo asm call taking address, loading value, and doing conversion - all in one
    %1 = call x86_fp80 @__asm_fbld(x86_fp80* (i32 4660 to x86_fp80*))
    ; or probably even better - two step 1) load memory, 2) call pseudo function to do conversion
    ; if we use LLVM instruction to load memory, other analyses will know it is being loaded -> can work woth it.
    ; detail: pseudo function here is called the same as instruction, but it is only doing conversion, not the whole instruction.
    ; so maybe, it would be better to call it something else?
    %load = load x86_fp80, x86_fp80* inttoptr (i32 4660 to x86_fp80*)
    %1 = call x86_fp80 @__asm_fbld(x86_fp80 %load)
    ; rest ...
    %0 = load i3, i3* @fpu_stat_TOP
    %3 = sub i3 %0, 1
    store i3 %3, i3* @fpu_stat_TOP
    %4 = fcmp oeq x86_fp80 %1, 0xK00000000000000000000
    %5 = select i1 %4, i2 1, i2 0
    call void @6(i3 %3, i2 %5)
    call void @5(i3 %3, x86_fp80 %1)
  • FBSTP
    • The same as FBLD but in reverse. FPU stack operation should be translated.
  • FNCLEX
    • Current translation: retdec-capstone2llvmir -a x86 -t "fnclex"
    call void @__asm_fnclex()
    store i32 undef, i32* @fpsw
    • Probably ok. Or do you have some other idea? We probably don't need to implement this.
    • Take a look at translatePseudoAsm*() methods in capstone2llvmir_impl.h. They are meant to be used in translation table instead of nullptr and generic translation in those cases, when someone analyzed what an instruction does, decided that we don't want to implement a full semantics for it, but wants to show that pseudo translation was intentionally chosen - not just default nullptr. Different flavors allow to further specify data flow of instruction- i.e. if there are operands, if it is writing something. Fell free to add more of these methods, if you need one that is not implemented. Using this, maybe the translation could be implemented even better (if it makes sense - you decide):
    %0 = call i32 @__asm_fnclex()
    store i32 %0, i32* @fpsw
    • Whatever you end up doing with it (including leaving it as is), write an unit test. It will show that we consciously decided to implement it a certain way, and it will stop someone from unintentionally changing it in future.
  • FLDCW
    • Current translation: NOP.
    • Is this intentional? Or just an old relic? Do we represent FPU control word and manipulate it in other instructions? If yes, this should be translated in some way. If not, it may be ignored. It is quite possible this was set to NOP because generating a pseudo ASM call for it will screw up some regression tests (most likely from integration subset) - if instruction is in tests that recompile the RetDec output, pseudo ASM call cannot be compiled and the whole thing fails. But this should not stop us from implementing it if it makes sense - and if other instructions working with FPU control word are implemented. We can somehow fix these tests to make them pass even if instruction is translated.
    • Again, whatever you do (including leaving it as NOP), write an unit test.
  • FLDENV
    • Current translation: retdec-capstone2llvmir -a x86 -t "FLDENV [0x1234]"
    %0 = load i224, i224* inttoptr (i32 4660 to i224*)
    call void @__asm_fldenv(i224 %0)
    • More or less ok - write an unit test (check that memory is being loaded).
    • It could be made even better, but I'm not sure it is worth it - e.g. data type structure representing loaded data could be defined and used instead of i224. Maybe some other things.
  • FRSTOR, FNSAVE, FNSTENV, FNSTCW, FXSAVE, FXSAVE64, FXRSTOR, FXRSTOR64
    • Similar to FLDENV. Think about making it a little better according to specs - if possible and not too complicated.
    • Write unit tests.
  • FPTAN, FPATAN
    • Current translation:
    call void @__asm_fptan()
    call void @__asm_fpatan()
    • These operate (i.e. change) FPU stack, so at least these operations should be implemented. If it is not possible to easily express the operation itself (e.g. LLVM intrinsic), use pseudo function call. But the FPU stack load/stores should be created.
    • Unit tests.
  • FSCALE, FXTRACT, FXAM
    • Similar to FPTAN - we don't have to translate full semantics, but if it operates on FPU stack, implement its loads/stores and use intrinsic/pseudo call.

Basically:

  • Always do load/stores of FPU stack.
  • Use intrinsics/pseudo calls if we cannot reasonably represent operation.
  • Don't use nullptr in translation table - even if the default is used, specify it to make it clear this was a conscious decision.
  • Write unit test for everything - including pseudo ASM calls.

…nstruction to llvmir.

FBLD, FBSTP, FNCLEX, FLDCW, FLDENV, FRSTOR, FNSAVE, FNSTENV, FNSTCW, FXSAVE, FXSAVE64, FXRSTOR, FXRSTOR64, FPTAN, FPATAN, FSCALE, FXTRACT, FXAM.
@PeterMatula PeterMatula merged commit d3c74f3 into avast:master Sep 18, 2019
@PeterMatula
Copy link
Collaborator

I tested it manually because the last TeamCity builds failed - there are some changes in regression tests in master branch and this branch cannot be run against it.
It passed on my Linux machine, so I hope it will be ok everywhere - we will see in a few moments when TeamCity finishes master builds triggered by the merge.

Great work.
Thank you very much for this.

PeterMatula added a commit that referenced this pull request Sep 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants