ARM: add support for arm architecture #1796

Closed
wants to merge 5 commits into from

Conversation

gauthier-wiemann

The suggested enhancement involves adding support for ARM architecture in the CAPA tool and improving its capabilities on ELF files.

Here are the steps involved:

  1. Changes to capa/features/common.py and capa/main.py: Simple changes have been made to add ARM architecture to the list of supported architectures in CAPA.

  2. Changes to capa/features/extractors/viv/: More complex changes have been made to this directory to enable disassembly of ARM binaries.
    Specifically, a new file, capa/features/extractors/viv/insn_arm.py, has been added.
    This file replicates the logic of capa/features/extractors/viv/insn.py, but adapts its functions to ARM mnemonics and patterns specific to ARM binaries.

  3. Testing: Each feature has been tested using binaries written in assembly or compiled with GCC (version 9.4.0), and the final result of the modifications has been evaluated on Linux ARM malware samples, mostly Mirai.

  4. Improving capabilities on ELF files: The modifications also aim to improve CAPA's capabilities on ELF files. Specifically, the following changes have been made:

    • Stripped ELF files: CAPA's current version fails to detect the underlying operating system in such ELF files, which terminates the analysis prematurely. To address this, Linux is assumed as the default operating system when the analysis cannot determine the underlying OS. This change is made in capa/features/extractors/elf.py.
    • Statically linked ELF files: The extractor was unable to yield API features in statically linked ELF files, resulting in poor detection capabilities. To solve this, statically linked ELF files are detected up front, and certain basic blocks are marked as library functions using a hard-coded table that translates syscall numbers into function names. This change is made in capa/features/extractors/viv/syscall.py.
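The two ELF changes above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the names `resolve_os`, `syscall_name`, `DEFAULT_OS` and `ARM_EABI_SYSCALLS` are made up for this example, and the table is a small subset of the ARM EABI Linux syscall numbers rather than the hard-coded table in capa/features/extractors/viv/syscall.py.

```python
from typing import Optional

# Hypothetical subset of the ARM EABI Linux syscall table (number -> name).
# The real table in the PR is not shown here and may be organized differently.
ARM_EABI_SYSCALLS = {
    1: "exit",
    3: "read",
    4: "write",
    5: "open",
    6: "close",
    11: "execve",
    281: "socket",
    283: "connect",
}

# Assumed fallback value, mirroring "Linux is assumed as the default OS".
DEFAULT_OS = "linux"


def resolve_os(detected: Optional[str]) -> str:
    """Return the detected OS, or fall back to Linux when detection failed."""
    return detected if detected else DEFAULT_OS


def syscall_name(number: int) -> Optional[str]:
    """Translate a syscall number into a function name, or None if unknown."""
    return ARM_EABI_SYSCALLS.get(number)
```

With a lookup like this, a basic block that loads 281 into r7 before an `svc` instruction could be labeled as a call to `socket`, which is the kind of API feature that is otherwise missing in statically linked binaries.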
Note:
  • While adding support for ARM architecture in CAPA, it was found that some ARM samples produce a lot of warnings coming from the vivisect disassembler. However, these warnings do not seem to interfere with the final result of CAPA's analysis.

  • As with x86/x64 binaries, it was observed that the extraction of some features depends on specific instruction patterns, which may be tied to the type and version of the compiler used. This dependency could limit CAPA's abilities when analyzing these binaries.
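As one hypothetical illustration of such a pattern dependency (not taken from the PR): a 32-bit constant on ARM can be loaded from a literal pool with a single `ldr`, or built with a `movw`/`movt` pair, depending on the target and compiler options. An extractor that only looks at single-instruction immediates would see two 16-bit halves in the second case. A minimal sketch of recombining the pair, where `combine_movw_movt` is an invented helper name:

```python
def combine_movw_movt(movw_imm: int, movt_imm: int) -> int:
    """Rebuild the 32-bit constant loaded by a movw/movt pair.

    movw writes the low 16 bits (and clears the high half);
    movt then writes the high 16 bits and keeps the low half.
    """
    if not (0 <= movw_imm <= 0xFFFF and 0 <= movt_imm <= 0xFFFF):
        raise ValueError("movw/movt immediates are 16-bit values")
    return (movt_imm << 16) | movw_imm


# movw r0, #0x5678 ; movt r0, #0x1234  ->  r0 = 0x12345678
value = combine_movw_movt(0x5678, 0x1234)
```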

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

Are you OK with adding these features in a future version? We haven't looked yet how to add tests. Do we need to write them?


google-cla bot commented Sep 26, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@williballenthin williballenthin requested review from mike-hunhoff, mr-tz and williballenthin and removed request for mike-hunhoff and mr-tz September 26, 2023 13:02
@williballenthin
Collaborator

Hi @gauthier-wiemann, welcome to the capa community!

I'm so excited to see this contribution and the work that you've done. The explanation in the PR text gives a nice background on what we can expect to see in the code. I think that everyone on the team looks forward to reviewing the changes.

Just to set expectations, we're in a busy period right now as we prepare to deliver a workshop about capa at Brucon (are you going!?). So, we'll be a little quiet until next week. Then, I hope that we can work together to get these changes merged into master!

We haven't looked yet how to add tests. Do we need to write them?

Yes. I'm not yet sure how thorough they need to be (i.e., we have hundreds for x86 code features, but I'm not sure it's fair to require 1:1 tests for Arm immediately). Please consider if you can share the example binaries that you created/collected and we can incorporate them into the testfiles repository.

@williballenthin
Collaborator

ref #1774

@lastinfosec lastinfosec closed this by deleting the head repository Oct 4, 2023
4 participants