Deduplicate procmems by sha256 hash #119

Closed
msm-cert wants to merge 2 commits

msm-cert (Member) commented Apr 4, 2024

This solves our "problems" with binaries that are submitted as copies of themselves, like b = open("x.exe").read(); b = b * 100.

bb4c7d48773b21c62885d0206c9414176c254e6104b4a0ffe730aa570b424948
21a479ce141d62b5920b3f76ece6d2b4a58c7f25afc8751e161a0fdf44a0197f
f5c9c598101a49a5c60f174fe7e7946c3f73c7c51b39ae5120560494d80db168

This is not an elegant fix for many reasons:

  • We're defending against a specific class of malformed binaries, so this is not an obvious improvement (even though procmem deduplication makes sense in all cases, it's not expected to be useful often)
  • Procmems with the same sha256 may behave differently because of subtle imagebase differences (I think I could prepare a toy example where two procmems with the same .m hash behave differently because of imagebase).

It will hopefully stop us from OOMing, but it's not a critical fix.
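
For illustration only, here is a minimal sketch of the idea discussed above: hash each dump's bytes with sha256 and keep only the first dump per hash. The `Dump` tuple, the `dedupe_dumps` function, and the `key_on_imagebase` flag are hypothetical names invented for this sketch and are not taken from the PR or from any library. The flag shows one possible way to address the imagebase caveat, by treating dumps with identical bytes but different imagebases as distinct.

```python
import hashlib
from typing import Iterable, List, Set, Tuple

# Hypothetical stand-in for a process memory dump: (imagebase, raw bytes).
Dump = Tuple[int, bytes]


def dedupe_dumps(dumps: Iterable[Dump], key_on_imagebase: bool = False) -> List[Dump]:
    """Keep the first dump seen for each sha256 (optionally per imagebase)."""
    seen: Set[Tuple[str, int]] = set()
    unique: List[Dump] = []
    for imagebase, buf in dumps:
        digest = hashlib.sha256(buf).hexdigest()
        # With key_on_imagebase=False, the imagebase is ignored in the key.
        key = (digest, imagebase if key_on_imagebase else 0)
        if key in seen:
            continue
        seen.add(key)
        unique.append((imagebase, buf))
    return unique


# 99 identical dumps at one imagebase, plus the same bytes at another one.
payload = b"MZ" + b"\x00" * 62
dumps = [(0x400000, payload)] * 99 + [(0x10000000, payload)]

print(len(dedupe_dumps(dumps)))                         # 1
print(len(dedupe_dumps(dumps, key_on_imagebase=True)))  # 2
```

Keying on the hash alone bounds the work to one analysis per distinct buffer, which is what helps with memory use; keying on (hash, imagebase) is more conservative and keeps dumps that might behave differently.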

msm-cert requested a review from psrok1 on April 4, 2024 14:01
yankovs (Contributor) commented Apr 8, 2024

Hey! :)
Just stumbled upon this PR and wanted to share that I've recently been experimenting with dumps from emulation and came across (what I assume is) this behavior: the same samples, with the only difference being the imagebase. I'll be happy to share hashes of such samples if you need them.

psrok1 (Member) commented Apr 19, 2024

It looks like it's not really b*100: the binary was not carved correctly, and m was pointing 100 times at the beginning of the buffer. I'm trying to fix this in #122.

psrok1 closed this Apr 19, 2024