Bug in bin2llvmir Decoder #638

Closed
seviezhou opened this issue Sep 4, 2019 · 3 comments

@seviezhou
Contributor

Decode the following ELF file:

elf-Linux-x64-bash.zip

We get a strange function:

declare i64 @__strtol_internal()
...
define i64 @function_ffffffffffff94e5() {
dec_label_pc_ffffffffffff94e5:
  ret i64 undef
}

The problem is that if the jump target has type IMPORT, we shouldn't decode it. If the data at that address accidentally happens to be meaningful code, we may get the following result:

        define void @__strtol_internal() {
        block_6d94a0:
          store volatile i64 7181472, i64* @assembly_address
          call void @__pseudo_branch(i64 -27419)
          call void @function_ffffffffffff94e5()
          ret void
        }

The data in 0x6d94a0 is (IDA view):

.got.plt:00000000006D94A0 off_6D94A0      dq offset __strtol_internal
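The two constants in the listing line up with the addresses in this report: 7181472 is 0x6d94a0, the `.got.plt` slot, and -27419 reinterpreted as an unsigned 64-bit value is 0xffffffffffff94e5, which appears to be where the name `function_ffffffffffff94e5` comes from. A minimal sketch of that arithmetic, where `toHex` and `asAddress` are illustrative helpers and not RetDec functions:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 64-bit value as lowercase hex without a prefix.
std::string toHex(std::uint64_t value)
{
	char buf[17]; // 16 hex digits plus the terminating NUL
	std::snprintf(buf, sizeof(buf), "%llx", static_cast<unsigned long long>(value));
	return buf;
}

// Reinterpret a signed i64 constant from the IR as an unsigned 64-bit address
// (two's complement wrap-around).
std::uint64_t asAddress(std::int64_t irConstant)
{
	return static_cast<std::uint64_t>(irConstant);
}
```

Calling `toHex(asAddress(-27419))` gives the suffix of the bogus function name, and `toHex(asAddress(7181472))` gives the address of the GOT entry that was decoded as code.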

Although __strtol_internal becomes a declaration by the end of the Decode phase, the wrongly created function function_ffffffffffff94e5 remains:

declare i64 @__strtol_internal()
...
define i64 @function_ffffffffffff94e5() {
dec_label_pc_ffffffffffff94e5:
  ret i64 undef
}

It can be solved by adding the following code to Decoder::decodeJumpTarget(const JumpTarget& jt), after line 189:

void Decoder::decodeJumpTarget(const JumpTarget& jt)
{
	const Address start = jt.getAddress();
	if (start.isUndefined())
	{
		LOG << "\t\t" << "unknown target address -> skip" << std::endl;
		return;
	}
        
+	if (jt.getFromAddress().isUndefined() && jt.getType() == JumpTarget::eType::IMPORT)
+	{
+		return;
+	}

And everything will be fine.
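To make the effect of the guard easier to see in isolation, here is a small self-contained sketch. It is not RetDec code: `Target`, `TargetType`, `shouldSkip` and `selectDecodable` are hypothetical stand-ins, and it assumes that an undefined "from" address means the target was not reached from a decoded instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-ins for RetDec's Address/JumpTarget; not the real classes.
enum class TargetType { ControlFlow, Import };

struct Target
{
	std::optional<std::uint64_t> address;     // where decoding would start
	std::optional<std::uint64_t> fromAddress; // referencing instruction, if any
	TargetType type;
};

// Mirrors the two early returns: unknown address, or an import
// target with no known source address.
bool shouldSkip(const Target& t)
{
	if (!t.address.has_value())
	{
		return true;
	}
	return !t.fromAddress.has_value() && t.type == TargetType::Import;
}

// Collect the addresses that would still be handed to the decoder.
std::vector<std::uint64_t> selectDecodable(const std::vector<Target>& worklist)
{
	std::vector<std::uint64_t> out;
	for (const auto& t : worklist)
	{
		if (!shouldSkip(t))
		{
			out.push_back(*t.address);
		}
	}
	return out;
}
```

With a worklist holding an import target at 0x6d94a0 with no source address plus an ordinary control-flow target, only the latter is returned, so the bytes of the GOT slot are never interpreted as instructions.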

@xkubov
Contributor

xkubov commented Sep 9, 2019

Hi, I decompiled the file you provided and, with your fix applied, I got the same results.

In this case, RetDec decompiles functions that should not be included in the output after the decoder phase. Those functions are eliminated during decompilation, but a few artifacts remain.

Here is a diff of LLVM code after decoding phase (before and after the patch):
diff.txt

I've run regression tests and all passed. I think that you can open a pull request.

seviezhou added a commit to seviezhou/retdec that referenced this issue Sep 9, 2019
@PeterMatula
Collaborator

Thanks for the report. This is an issue that should be fixed. I will look into your solution in #642 and decide whether it is enough or whether more needs to be done. This might require a more robust solution.

@PeterMatula PeterMatula self-assigned this Sep 16, 2019
@PeterMatula PeterMatula added this to the RetDec v4 milestone Sep 16, 2019
@PeterMatula PeterMatula modified the milestones: RetDec v4, RetDec v4+ Apr 8, 2020
PeterMatula pushed a commit that referenced this issue Dec 7, 2022
@PeterMatula
Collaborator

Solved by #642.
