Add ability to process PNG icons for perceptual hash calculation #1090
        this->image.push_back(row);
    }

    stbi_image_free(data);
We have a ScopeExitGuard in utils/scope_exit.h. Check it out. I would use it here, as it should provide more safety when it comes to exceptions etc.
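To illustrate the suggestion, here is a minimal sketch of the scope-guard idea. It does not reproduce RetDec's actual helper in utils/scope_exit.h, whose interface may differ; `ScopeGuard`, `releaseBuffer`, `releaseCount`, and `copyBuffer` are hypothetical names, and `std::calloc`/`std::free` stand in for the stb_image allocation and `stbi_image_free`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical minimal scope guard: runs the stored callback when it goes
// out of scope, whether the scope is left normally or by an exception.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> onExit) : onExit_(std::move(onExit)) {}
    ~ScopeGuard() { if (onExit_) onExit_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
private:
    std::function<void()> onExit_;
};

// Counts releases so the behaviour can be observed (illustration only).
static int releaseCount = 0;

// Stand-in for a C-style deallocation function such as stbi_image_free.
void releaseBuffer(unsigned char* data) {
    std::free(data);
    ++releaseCount;
}

// Copies a C-allocated buffer into a vector; the guard frees the buffer on
// every exit path, including the simulated failure.
std::vector<unsigned char> copyBuffer(std::size_t bytes, bool failMidway) {
    unsigned char* data = static_cast<unsigned char*>(std::calloc(bytes, 1));
    if (data == nullptr) throw std::bad_alloc();
    ScopeGuard guard([data] { releaseBuffer(data); });

    if (failMidway) throw std::runtime_error("processing failed");
    return std::vector<unsigned char>(data, data + bytes);
}
```

Without the guard, the early `throw` would skip the manual free at the end of the function and leak the buffer.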
Thanks, TIL. I am not 100% sure I used it correctly, but I've tried to utilize it.
* Add ability to process PNG icons for perceptual hash calculation
* Use SCOPE_EXIT for deallocation
* Update Capstone to v4.0
* [Capstone-next] Update to capstone-next branch
* [Capstone-next] Update to Capstone-Next Branch
  - [ARM]
    - Add ARM_INS_MOVS support
  - [ARM64]
    - Remove vess.
    - It overlaps with ARM64_VAS
    - Fix A64SysReg_* into ARM64_SYSREG_*
  - [PowerPC]
    - Fix PPC_REG_X2 into PPC_REG_XER
  - [X86]
    - Remove X86_INS_FADDP
    - In capstone-next, faddp is actually fadd, both belong to "ID 15(fadd)"
* [tests][capstone2llvmir][arm] Fix MOVW Unit Test
  - In the test, "movw r0, #0xabcd" does not read any register and the result is 0xabcd, not 0x1234abcd
* [tests][capstone2llvmir][arm] Fix Nop test
  - In ARM, the NOP instruction is a HINT instruction
  - Also, in Capstone, the cs_insn->id of nop points to HINT (ID: 63)
  - So an error occurs when looking for a translate-instruction method, because it points to nullptr
* [Capstone2llvmir][arm64] Add ADDCS Support
* [capstone2llvmir][arm64] Add ADDS Support
* [capstone2llvmir][arm64] Add ANDS Support
* [capstone2llvmir][arm64] Add SUP Support
* [capstone2llvmir][arm64] Add BICS Support
* [capstonellvmir][PowerPC] Update Register Name
* [capstone2llvmir][PowerPC] Update Register Name
* [capstone2llvmir][PowerPC] Fix CMP Support
* [capstone2llvmir][PowerPC] Add CMPL Support
* [capstone2llvmir][PowerPC] Fix CMPL
* [capstone2llvmir][PowerPC] Add BLT Support
* [capstone2llvmir][PowerPC] Add Branch mnemonics incorporating conditions Support
* [capstone2llvmir][PowerPC] Fix RLWINM
  - RLWINM and clrlwi have the same ID
* [tests][capstone2llvmir][PowerPC] Fix Crand Tests
* [capstone2llvmir][PowerPC] Fix bdzla BUG
* [capstone2llvmir][PowerPC] Remove BDZLA TODO
* [capstone2llvmir][x86] Fix ud2b
* [capstone2llvmir][X86] Fix FADD/FADDP
* [capstone2llvmir][x86] Fix FADD/FADDP
* [capstone2llvmir][x86] Fix FXCH
  - When translating the FXCH instruction, in the loadOpFloatingBinaryTop function, "top" is equal to idx, which causes the value to be written to top twice when exchanging data.
* clean code
* Update Capstone to v5.0
* [capstone2llvmir][x86][PowerPC] Clean code
* [capstone2llvmir][PowerPC] Clean code
* [capstone2llvmir][PowerPC] Remove BUN* and BNU*
  - In Capstone v5, they are both equivalent to BSO* and BNS*
* [capstone2llvmir][PowerPC] Fix rlwinm
  - In Capstone v5, rlwinm is equivalent to clrlwi
* [capstone2llvmir][PowerPC] Fix BNL*
* [capstone2llvmir][PowerPC] Add PPC_REG_ZERO
* [capstone2llvmir][PowerPC] Add comment
* Fix merge conflict
* Update YARA to 4.2.X
* Add dll_name from export directory to output
* llvm/CMakeLists: Manually-specified variables were not used by the project.
  The following variables were set in CMakeLists, however, they were not used by the LLVM project build: LLVM_USE_CRT_DEBUG LLVM_USE_CRT_RELEASE
* CHANGELOG.md: add entries for #1060 #1061 PRs
* Fixed loading import directory that is modified by relocations
* Fixed comment
* Remove useless trailing whitespace
  There is absolutely no reason for it being in the code.
* pelib: Fix a typo in a comment in PeLib::ImageLoader::Load()
* Add a CHANGELOG entry for #1063
* Move signing certificate to separate object
* Updated authenticode parser to the newest version
* Fix uninitialized free, use finer sanity checks in auth. parser
* Add a directory for RetDec-related publications
  The list of publications was originally placed on https://retdec.com/publications/ (https://retdec.com/ has been redirected to https://github.com/avast/retdec, and we wanted to keep the list somewhere).
* Fix the wording for an invalid max-memory error in scripts/retdec-unpacker.py
  There are the following two reasons for the fix:
  - The check only verifies whether the passed value is an integer.
  - The parameter can be 0 (i.e. a non-negative integer). It does not have to be a positive integer.
* Never try to limit memory on macOS
  We can't limit memory on macOS. Before macOS 12, limitSystemMemoryOnPOSIX() did not actually do anything on macOS; it just succeeded. Since macOS 12 it returns an error and retdec can't start. To be honest, Apple can control the memory limit via the so-called ledger() system call, which is private. An old version that was open-sourced (from 10.9-10.10?) used setrlimit(), but at some point setrlimit() was broken and ledger() was not. Probably in macOS 12 setrlimit() was completely broken. Because we have no other choice, just return true, which doesn't change anything.
  See: #379
  Fixes: #1045
* Remove a redundant period from CHANGELOG
* utils: Improve the wording of a comment in getTotalSystemMemoryOnMacOS()
* Add a CHANGELOG entry for #1074 and #1045
* Update authenticode-parser, use-after-free, signedness issues
* Using multistage build for Dockerfile, reduces container size by ~1.5G
* Check for possible overflow when checking for segment overlaps. Fix incorrect range exception message
* Fix parameter and return types for dynamically called functions
  Calls to dynamically-linked functions go through the procedure linkage table (PLT). RetDec turns a PLT entry into a function, say malloc@plt, that appears to do nothing but call the external function, say malloc (though the assembly code will do a jump rather than a call). User code that logically wants to call malloc instead calls malloc@plt (and sets up arguments as if calling malloc). The malloc@plt code first jumps to the dynamic linker, which modifies it so that subsequent calls to malloc@plt will jump directly to malloc. We say that malloc@plt wraps malloc.
  The call to malloc in malloc@plt will not have any arguments set up, so malloc will appear to have no parameters or returns (unless that information is provided by link-time information, debug information, or name demangling), but it needs to have the same parameter types and return type as malloc@plt.
  The propagateWrapped methods copy the argument information from the DataFlowEntry of the wrapping function to the wrapped function. Then, when the calls to the wrapping function are inlined (in connectWrappers), effectively the call to the wrapping function is changed into a call to the wrapped function.
  The motivation for this change is that programs that analyze the output of RetDec (either the C code or the LLVM code) want to recognize library functions and treat them specially. This change makes it so that the library function names are used directly (rather than the plt version) and they are passed their parameters correctly.
* Upgrade to Capstone release 4.0.2
* Add additional patch on capstone 4.0.2 for PPC Signed 16 bit immediates
  Capstone version 4.0.2 has a bug when disassembling a PowerPC instruction with a signed 16-bit immediate. See capstone-engine/capstone#1746 and capstone-engine/capstone#1746 (comment). This change adds to the capstone patch to fix this problem.
* Treat endbr32/endbr64 instructions as NOPs
* capstone2llvmir/powerpc: remove PPC_INS_BDZLA hack fix
  As Capstone was updated, the fix in capstone-engine/capstone#968 took effect and the original RetDec fix is not needed - in fact, it caused problems.
* Handle Procedure Linkage calls for 32bit x86 from gcc
  This case is for x86 32 bit compiled with GCC. Its PLT entries are in sections .plt.sec or .plt.got. An entry is of the form: jmp *offset(%ebx)
  When this code is encountered, register %ebx has been loaded with the address of the start of the Global Offset Table (.got) section. This change handles that case.
* Add ability to process PNG icons for perceptual hash calculation (#1090)
  * Add ability to process PNG icons for perceptual hash calculation
  * Use SCOPE_EXIT for deallocation
* In generated C, add prototypes for dynamically-linked functions without headers
  When the program involves dynamically-linked functions like _Znwj (operator new) that return a pointer, it is necessary to have prototypes for them, since otherwise they will be implicitly deduced to return "int", which cannot be dereferenced.
  Previously RetDec was emitting comments telling which functions were dynamically linked. This change moves them up before the functions are emitted and instead emits prototypes for the functions. However, RetDec also inserts includes of headers for functions with known headers. We do not emit prototypes for functions with headers, as that would be redundant. As a result, some dynamically-linked functions that used to show in the comments no longer appear, as the included header will declare them.
  The section header comment for dynamically-linked functions is only produced if some prototypes are written for dynamically-linked functions.
  A related PR will have added tests as well as changes needed for existing tests.
* Add printing of analysis time to retdec-fileinfo output
* Yara: inherits linker flags
* Use provided libtool via `CMAKE_LIBTOOL`
* Added missed `${RETDEC_INSTALL_BIN_DIR}` to `pat2yara`
* Added sanity check for page index when loading pages from broken samples
  There are certain samples where the page index might go beyond the available pages when trying to load them, which will be prevented with this patch.
* Virtual Size overflow is now handled properly
* Fixed error code
* Updated yaramod
* Fix removeZeroSequences
* README.md: add "limited maintenance mode" note

Co-authored-by: Peter Kubov <peter.kubov@avast.com>
Co-authored-by: houndthe <houndthe@protonmail.com>
Co-authored-by: Peter Matula <peter.matula@avast.com>
Co-authored-by: Ladislav Zezula <ladislav.zezula@avast.com>
Co-authored-by: Petr Zemek <petr.zemek@avast.com>
Co-authored-by: Marek Milkovič <marek.milkovic@avast.com>
Co-authored-by: Kirill A. Korinsky <kirill@korins.ky>
Co-authored-by: me <me>
Co-authored-by: Richard L Ford <richardlford@gmail.com>
Co-authored-by: 未赢 <26459963+neverwin@users.noreply.github.com>
I've integrated stb_image.h from https://github.com/nothings/stb to add loading of PNG icons for perceptual hash calculation. I flip the PNG image upside down because the existing DIB format is stored upside down, and I wanted to align with the current implementation. I've added some handcrafted test samples to the appropriate repository, with a mention of this PR. I took a sample with a PNG icon, extracted the icon in a different resolution, converted it to DIB format, and put it inside another file. Then I test whether the perceptual hashes of both files match.
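The vertical flip described above can be sketched as follows. This is an illustration only, not the code from this PR: `flipRowsVertically` is a hypothetical helper, and it assumes a tightly packed, row-major pixel buffer with `channels` bytes per pixel. (stb_image also offers `stbi_set_flip_vertically_on_load`, which flips at load time; whether the PR uses it is not stated here.)

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: reverses the row order of a tightly packed,
// row-major image in place, so a top-down image (as decoded from PNG)
// becomes bottom-up like a DIB, or vice versa.
void flipRowsVertically(std::vector<std::uint8_t>& pixels,
                        std::size_t width,
                        std::size_t height,
                        std::size_t channels) {
    const std::size_t stride = width * channels;  // bytes per row
    // Nothing to do for degenerate sizes; ignore buffers whose size does
    // not match the stated dimensions.
    if (stride == 0 || height < 2 || pixels.size() != stride * height) {
        return;
    }
    // Swap row `top` with row `bottom`, moving inwards from both ends.
    for (std::size_t top = 0, bottom = height - 1; top < bottom; ++top, --bottom) {
        auto topRow = pixels.begin() + static_cast<std::ptrdiff_t>(top * stride);
        auto bottomRow = pixels.begin() + static_cast<std::ptrdiff_t>(bottom * stride);
        std::swap_ranges(topRow, topRow + static_cast<std::ptrdiff_t>(stride), bottomRow);
    }
}
```

For a 2x3 single-channel image with rows `{1,2}`, `{3,4}`, `{5,6}`, the call reorders the rows to `{5,6}`, `{3,4}`, `{1,2}`; applying it twice restores the original.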