Fix Mach-O parsing in Mach-O module #1263

Closed


@PeterMatula PeterMatula commented Apr 27, 2020

The problem

In MACHO_PARSE_FILE() there is a command parsing loop. When segment commands are parsed, seg_count is incremented, and number_of_segments is set only after the loop finishes. However, the LC_UNIXTHREAD and LC_MAIN commands are parsed in the same loop, and their parsing functions read the still-uninitialized number_of_segments value in macho_rva_to_offset() and macho_offset_to_rva(). This is clearly not safe, but most of the time it does not cause problems. If the binary is not malformed, the rva <-> offset conversion succeeds even when number_of_segments is some (random) huge value, because the lookup hits in one of the first few valid segments. However, if the binary is malformed, this can cause iteration up to the uint64_t max value.
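A minimal sketch of this failure mode, under a simplified model rather than YARA's actual code: segment_t, get_segment() and the iterations counter are hypothetical names introduced only for illustration. get_segment() is assumed to return "undefined" (NULL here) for indexes that were never filled in, so a garbage count leads to a very long loop rather than a crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified segment record; not YARA's data model. */
typedef struct {
  uint64_t vmaddr, vmsize, fileoff;
} segment_t;

#define PARSED_SEGMENTS 2

/* The segments that were actually parsed from the file. */
static const segment_t parsed[PARSED_SEGMENTS] = {
  {0x1000, 0x1000, 0x0},
  {0x2000, 0x1000, 0x1000},
};

/* Assumption: lookups past the parsed segments yield "undefined" (NULL). */
const segment_t* get_segment(uint64_t i)
{
  return i < PARSED_SEGMENTS ? &parsed[i] : NULL;
}

/* Returns 1 and sets *offset on a hit, 0 otherwise.
   *iterations records how many loop trips were needed. */
int rva_to_offset(uint64_t rva, uint64_t number_of_segments,
                  uint64_t* offset, uint64_t* iterations)
{
  assert(offset != NULL && iterations != NULL);
  *iterations = 0;

  for (uint64_t i = 0; i < number_of_segments; i++) {
    (*iterations)++;
    const segment_t* s = get_segment(i);
    if (s != NULL && rva >= s->vmaddr && rva < s->vmaddr + s->vmsize) {
      *offset = s->fileoff + (rva - s->vmaddr);
      return 1;  /* early exit hides a garbage number_of_segments */
    }
  }
  return 0;  /* reached only after number_of_segments trips */
}
```

With an address that falls inside a parsed segment the loop exits after a couple of trips, whatever the count is. With an address that matches no segment, the loop runs for the full (garbage) count before giving up.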

The example

Run YARA (compiled with --enable-macho) on the following file test.zip with the following ruleset:

import "macho"

Depending on what garbage ends up in the uninitialized number_of_segments, this can take a very long time. E.g. the value was 18445260673521474303 in my case.

The solution

Two-step command processing:

  • First, only the segment commands are processed.
  • Then, the other commands that use the number of segments are processed.

Cons: two passes over the commands.
Pros: the value is always initialized, and it is a general solution.
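The two-step idea could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: command_t, parsed_t, CMD_SEGMENT, CMD_THREAD and parse_two_pass() are hypothetical stand-ins for the real load command structures and for MACHO_PARSE_FILE().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGMENTS 16

/* Hypothetical command kinds standing in for the real LC_* constants. */
typedef enum { CMD_SEGMENT, CMD_THREAD, CMD_OTHER } cmd_kind_t;

typedef struct {
  cmd_kind_t kind;
  uint64_t vmaddr, vmsize, fileoff;  /* used by CMD_SEGMENT */
  uint64_t entry_addr;               /* used by CMD_THREAD */
} command_t;

typedef struct {
  uint64_t vmaddr, vmsize, fileoff;
} segment_t;

typedef struct {
  segment_t segments[MAX_SEGMENTS];
  uint64_t number_of_segments;
  int entry_found;
  uint64_t entry_offset;
} parsed_t;

/* Converts an address to a file offset using the segments parsed so far. */
int addr_to_offset(const parsed_t* p, uint64_t addr, uint64_t* offset)
{
  for (uint64_t i = 0; i < p->number_of_segments; i++) {
    const segment_t* s = &p->segments[i];
    if (addr >= s->vmaddr && addr < s->vmaddr + s->vmsize) {
      *offset = s->fileoff + (addr - s->vmaddr);
      return 1;
    }
  }
  return 0;
}

void parse_two_pass(const command_t* cmds, size_t n, parsed_t* out)
{
  assert(cmds != NULL && out != NULL);
  uint64_t seg_count = 0;
  out->entry_found = 0;
  out->entry_offset = 0;

  /* Pass 1: collect segments only. */
  for (size_t i = 0; i < n; i++) {
    if (cmds[i].kind == CMD_SEGMENT && seg_count < MAX_SEGMENTS) {
      out->segments[seg_count].vmaddr = cmds[i].vmaddr;
      out->segments[seg_count].vmsize = cmds[i].vmsize;
      out->segments[seg_count].fileoff = cmds[i].fileoff;
      seg_count++;
    }
  }
  out->number_of_segments = seg_count;  /* final before anyone reads it */

  /* Pass 2: commands that depend on the segment count. */
  for (size_t i = 0; i < n; i++) {
    if (cmds[i].kind == CMD_THREAD)
      out->entry_found =
          addr_to_offset(out, cmds[i].entry_addr, &out->entry_offset);
  }
}
```

Because the count is final before the second pass starts, the entry point is resolved correctly even when the thread command precedes every segment command in the file.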

A less general solution would be to keep the single loop and update number_of_segments after every segment command is processed. However, this would rely on a specific ordering of commands, i.e. commands using segment N appear only after the command for segment N itself. Otherwise it would not work. I don't know if this can be expected, so it's better to be safe than sorry and loop twice.
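To make the ordering dependency concrete, here is a hedged sketch of the single-loop alternative. As before, command_t, parsed_t, CMD_SEGMENT, CMD_THREAD and parse_single_pass() are hypothetical names for illustration and do not come from the YARA sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGMENTS 16

/* Hypothetical command kinds standing in for the real LC_* constants. */
typedef enum { CMD_SEGMENT, CMD_THREAD, CMD_OTHER } cmd_kind_t;

typedef struct {
  cmd_kind_t kind;
  uint64_t vmaddr, vmsize, fileoff;  /* used by CMD_SEGMENT */
  uint64_t entry_addr;               /* used by CMD_THREAD */
} command_t;

typedef struct {
  uint64_t vmaddr, vmsize, fileoff;
} segment_t;

typedef struct {
  segment_t segments[MAX_SEGMENTS];
  uint64_t number_of_segments;
  int entry_found;
  uint64_t entry_offset;
} parsed_t;

/* Converts an address to a file offset using the segments parsed so far. */
int addr_to_offset(const parsed_t* p, uint64_t addr, uint64_t* offset)
{
  for (uint64_t i = 0; i < p->number_of_segments; i++) {
    const segment_t* s = &p->segments[i];
    if (addr >= s->vmaddr && addr < s->vmaddr + s->vmsize) {
      *offset = s->fileoff + (addr - s->vmaddr);
      return 1;
    }
  }
  return 0;
}

/* Single loop: the count is kept up to date after every segment command. */
void parse_single_pass(const command_t* cmds, size_t n, parsed_t* out)
{
  assert(cmds != NULL && out != NULL);
  out->number_of_segments = 0;
  out->entry_found = 0;
  out->entry_offset = 0;

  for (size_t i = 0; i < n; i++) {
    if (cmds[i].kind == CMD_SEGMENT &&
        out->number_of_segments < MAX_SEGMENTS) {
      segment_t* s = &out->segments[out->number_of_segments];
      s->vmaddr = cmds[i].vmaddr;
      s->vmsize = cmds[i].vmsize;
      s->fileoff = cmds[i].fileoff;
      out->number_of_segments++;
    } else if (cmds[i].kind == CMD_THREAD) {
      /* Sees only the segments that appeared earlier in the file. */
      out->entry_found =
          addr_to_offset(out, cmds[i].entry_addr, &out->entry_offset);
    }
  }
}
```

In this sketch the count is never uninitialized, so the long loop is gone, but a thread command placed before its segment silently fails to resolve. The same commands in the opposite order resolve fine, which is exactly the ordering assumption the two-loop version avoids.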

@mbandzi can you please take a look at it? @metthal said you are the original author.

Make sure "number_of_segments" is set before it is used.
@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.



@PeterMatula
Author

@googlebot I signed it!

@googlebot

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@mbandzi
Contributor

mbandzi commented Apr 27, 2020

Looks good to me. As far as I remember, the command order is not guaranteed in Mach-O so I would keep two loops.

@knightsc
Contributor

knightsc commented May 3, 2020

@PeterMatula Do you know if this problem also exists in my branch here?

#1038

I've re-organized the Mach-O code in that branch and I'm curious if the issue still exists.

Update: Just re-read your description more closely and it looks like I would probably want to make this same change in my branch as well.

@PeterMatula
Author

Btw, I don't think the continuous-integration/appveyor failure is caused by my changes. It is not even in a file I touched.

@knightsc knightsc mentioned this pull request May 4, 2020
@knightsc
Contributor

knightsc commented May 4, 2020

@PeterMatula I updated your bug fix to work on the updated mach-o parsing code that was just merged.

#1272

I also included a test case in that PR.

@PeterMatula
Author

@knightsc does this mean #1272 has the changes from my (this) PR and the bug is fixed in there? If so, and I see #1272 was already merged, then I can close this and consider it solved.

@knightsc
Contributor

@PeterMatula, yes, you are correct. Your bug fix should be committed now. Good find and fix!

@PeterMatula
Author

Great, thanks for getting it to master.

@PeterMatula PeterMatula deleted the macho-module-fix branch May 18, 2020 22:00