[AArch64/ARM64] Update to Capstone v6/auto-sync #4011

Merged Dec 19, 2023 (1 commit)


Rot127
Member

@Rot127 Rot127 commented Nov 30, 2023

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've documented or updated the documentation of every function and struct this PR changes. If not, I've explained why.
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the rizin book with the relevant information (if needed)

Detailed description

Updates the AArch64/ARM64 module to Capstone's v6/auto-sync version.

Requires capstone-engine/capstone#2026 to be merged

Test plan

  • Test build with different CS ver. < 6
  • All green

Closing issues

Closes #3175

Member

@XVilka XVilka left a comment

@wargio @thestr4ng3r, please take a good look too.

@@ -75,7 +75,7 @@ EXPECT=<<EOF
 0x70 (set x10 (cast 64 false (let res (cast 8 false (>> (cast 32 false (var x14)) (bv 6 0x10) false)) (cast 32 false (var res)))))
 0x74 (set x9 (+ (var x8) (bv 64 0x800)))
 0x78 (set x0 (cast 64 false (loadw 0 32 (+ (var x9) (<< (cast 64 false (cast 32 false (var x10))) (bv 6 0x2) false)))))
-0x7c (set x3 (cast 64 false (>> (cast 32 false (var x13)) (bv 5 0x18) false)))
+0x7c (set x3 (cast 64 false (>> (cast 32 false (var x13)) (bv 6 0x18) false)))
Member

Did you check why this change happened? `bv 5 0x18` becomes `bv 6 0x18` in many places.

Member Author

Because aliases are no longer handled as "pure" instructions, the decoding path is a different one. These are usually LSR instructions, which are aliases for UBFM, and the immediate of the UBFM instruction is 6 bits wide. Hence the correction.
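
To illustrate the alias relationship, here is a minimal sketch based on the standard A64 encoding, where `LSR Rd, Rn, #shift` is `UBFM Rd, Rn, #shift, #(regsize - 1)` and both UBFM immediates are 6-bit fields. The type and function names are hypothetical and are not taken from rizin or Capstone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two UBFM immediates (not a rizin/Capstone type). */
typedef struct {
	uint8_t immr; /* 6-bit rotate amount field */
	uint8_t imms; /* 6-bit field naming the most significant source bit */
} ubfm_imms_t;

/* Expand `LSR Rd, Rn, #shift` into the immediates of the underlying UBFM. */
static ubfm_imms_t lsr_to_ubfm(unsigned regsize, unsigned shift) {
	assert(regsize == 32 || regsize == 64);
	assert(shift < regsize);
	ubfm_imms_t f = { (uint8_t)shift, (uint8_t)(regsize - 1) };
	return f;
}

/* A UBFM is written as LSR when imms equals regsize - 1. */
static bool ubfm_is_lsr(unsigned regsize, ubfm_imms_t f) {
	return f.imms == regsize - 1;
}
```

For the `0x7c` line above, `lsr_to_ubfm(32, 0x18)` yields `immr = 0x18` and `imms = 31`. The shift amount comes from the 6-bit `immr` field, which would explain why it is now modelled as `bv 6` even for a 32-bit register.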

test/db/cmd/cmd_plf (three resolved review threads)
@XVilka XVilka added this to the 0.7.0 milestone Dec 3, 2023
XVilka

This comment was marked as resolved.

Member

@wargio wargio left a comment

lgtm

@github-actions github-actions bot added the API label Dec 10, 2023
XVilka

This comment was marked as resolved.

@Rot127
Member Author

Rot127 commented Dec 12, 2023

Waiting for capstone-engine/capstone#2216
Hopefully this is finally it.

@XVilka

This comment was marked as resolved.

@Rot127
Member Author

Rot127 commented Dec 12, 2023

They do! Pretty sure they weren't very correct before.

@imbillow imbillow mentioned this pull request Dec 14, 2023
5 tasks
@XVilka

This comment was marked as resolved.

@imbillow

This comment was marked as resolved.

@XVilka

This comment was marked as resolved.

@XVilka
Member

XVilka commented Dec 16, 2023

Only one broken test left:

[XX] db/cmd/types bashbot test (arm 32-bits)
RZ_NOPLUGINS=1 /usr/bin/rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -eflirt.sigdb.load.system=false -eflirt.sigdb.load.home=false -N -Qc 's main
af
aaft
afvl
' bins/elf/bashbot.arm.gcc.O0.elf
-- stdout
--- expected
+++ actual
@@ -1,22 +1,22 @@
-var int32_t var_1454h @ stack - 0x1454
-var void *var_1438h @ stack - 0x1438
-var const char *s1 @ stack - 0x1038
-var int *wstatus @ stack - 0x44
+var int32_t var_1460h @ stack - 0x1460
+var void *var_1444h @ stack - 0x1444
+var const char *s1 @ stack - 0x1044
+var int *wstatus @ stack - 0x50
+var int32_t var_4ch @ stack - 0x4c
+var const char *var_48h @ stack - 0x48
+var int32_t var_44h @ stack - 0x44
 var int32_t var_40h @ stack - 0x40
-var const char *var_3ch @ stack - 0x3c
-var int32_t var_38h @ stack - 0x38
-var int32_t var_34h @ stack - 0x34
+var int32_t var_3ch @ stack - 0x3c
+var pid_t pid @ stack - 0x38
+var const char *v2 @ stack - 0x34
 var int32_t var_30h @ stack - 0x30
-var pid_t pid @ stack - 0x2c
-var const char *v2 @ stack - 0x28
+var const char *src @ stack - 0x2c
+var int32_t var_28h @ stack - 0x28
 var int32_t var_24h @ stack - 0x24
-var const char *src @ stack - 0x20
-var int32_t var_1ch @ stack - 0x1c
+var const char *s @ stack - 0x20
+var const char *var_1ch @ stack - 0x1c
 var int32_t var_18h @ stack - 0x18
-var const char *s @ stack - 0x14
-var const char *var_10h @ stack - 0x10
-var int32_t var_ch @ stack - 0xc
-var int32_t var_8h @ stack - 0x8
-var const char *option @ stack - 0x4
+var int32_t var_14h @ stack - 0x14
+var const char *option @ stack - 0x10
 arg int argc @ r0
 arg char **argv @ r1

Member

@XVilka XVilka left a comment

All nice changes, but there are a couple of nitpicks.
Capstone-sys fails:

 FAILED: librz/asm/librz_asm.so.0.7.0.p/p_asm_arm_cs.c.o 
gcc -Ilibrz/asm/librz_asm.so.0.7.0.p -I. -I.. -Ilibrz -I../librz -Ilibrz/include -I../librz/include -I../librz/asm/arch/include -I../librz/asm/arch -I../librz/asm/arch/h8300 -I../librz/asm/arch/hexagon -I../librz/asm/arch/msp430 -I../librz/asm/arch/rsp -I../librz/asm/arch/mcore -I../librz/asm/arch/v850 -I../librz/asm/arch/propeller -I../librz/asm/arch/ebc -I../librz/asm/arch/cr16 -I../librz/asm/arch/8051 -I../librz/asm/arch/v810 -I../librz/asm/arch/or1k -I../librz/asm/arch/tricore -Ilibrz/util/sdb/src -I../librz/util/sdb/src -I../librz/bin/format -Isubprojects/rzspp -I../subprojects/rzspp -I/usr/include/capstone -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -O3 -Wimplicit-fallthrough=3 -DRZ_PLUGIN_INCORE=1 -D_GNU_SOURCE --std=gnu99 -Werror=sizeof-pointer-memaccess -fvisibility=hidden -Wno-cpp -fPIC -MD -MQ librz/asm/librz_asm.so.0.7.0.p/p_asm_arm_cs.c.o -MF librz/asm/librz_asm.so.0.7.0.p/p_asm_arm_cs.c.o.d -o librz/asm/librz_asm.so.0.7.0.p/p_asm_arm_cs.c.o -c ../librz/asm/p/asm_arm_cs.c
../librz/asm/p/asm_arm_cs.c: In function ‘disassemble’:
../librz/asm/p/asm_arm_cs.c:134:49: error: implicit declaration of function ‘CS_AARCH64pre’ [-Werror=implicit-function-declaration]
  134 |                 ret = (a->bits == 64) ? cs_open(CS_AARCH64pre(CS_ARCH_), mode, &ctx->cd) : cs_open(CS_ARCH_ARM, mode, &ctx->cd);
      |                                                 ^~~~~~~~~~~~~
../librz/asm/p/asm_arm_cs.c:134:63: error: ‘CS_ARCH_’ undeclared (first use in this function); did you mean ‘CS_ARCH_ALL’?
  134 |                 ret = (a->bits == 64) ? cs_open(CS_AARCH64pre(CS_ARCH_), mode, &ctx->cd) : cs_open(CS_ARCH_ARM, mode, &ctx->cd);
      |                                                               ^~~~~~~~
      |                                                               CS_ARCH_ALL
../librz/asm/p/asm_arm_cs.c:134:63: note: each undeclared identifier is reported only once for each function it appears in
cc1: all warnings being treated as errors

Please also rebase on top of the latest dev.
Apart from that - LGTM.

@@ -213,6 +214,8 @@ RZ_API bool rz_analysis_op_ismemref(int t) {
case RZ_ANALYSIS_OP_TYPE_STORE:
case RZ_ANALYSIS_OP_TYPE_LEA:
case RZ_ANALYSIS_OP_TYPE_CMP:
case RZ_ANALYSIS_OP_TYPE_POP:
Member

Nice finding!

Member Author

There are probably more of those little imprecise implementations. IMHO the problem is that we lack an abstract concept of stack frames for static analysis.

It's all spread across some struct members and functions like those (all of which lack documentation, so variables like RzAnalysisOp->stackptr are used differently between x86 and other archs). All of it is somehow baked together in var.c.

subprojects/capstone-next.wrap (outdated, resolved)
source_filename = 4.0.2.tar.gz
source_hash = 7c81d798022f81e7507f1a60d6817f63aa76e489aa4e7055255f21a22f5e526a
[wrap-git]
url = https://github.com/capstone-engine/capstone.git
Member

What is the plan for those? Asking capstone devs to make 4.0.3 and 5.0.2 releases?

Member Author

Yes, though I think @kabeor wants to release it together with the pre-release?

@Rot127
Member Author

Rot127 commented Dec 17, 2023

@XVilka Capstone-sys fails because the macros for meta-programming (CS_AARCH64() and the like) have not been released yet and are not in the repos. Once they are, the test will succeed again.
We could add a header file with those macros until the new v5 and v4 releases are in the repos?
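
As an illustration of what such a compatibility header might contain, here is a minimal sketch of version-dependent token-pasting macros. Every `DEMO_*` name, the stand-in enum values and the `demo_detail` struct are hypothetical. The real macros (`CS_AARCH64()`, `CS_AARCH64pre()` and the like) are defined by Capstone, and their exact definitions may differ from this sketch.

```c
#include <assert.h>

/* DEMO_CS_MAJOR stands in for the Capstone version check (assumption). */
#ifndef DEMO_CS_MAJOR
#define DEMO_CS_MAJOR 5
#endif

#if DEMO_CS_MAJOR >= 6
#define DEMO_AARCH64(x)    AArch64##x /* name used as prefix: AArch64_INS_FOO */
#define DEMO_AARCH64pre(x) x##AARCH64 /* name used as suffix: DEMO_ARCH_AARCH64 */
#define DEMO_aarch64_      aarch64    /* object-like: no empty argument list */
#else
#define DEMO_AARCH64(x)    ARM64##x
#define DEMO_AARCH64pre(x) x##ARM64
#define DEMO_aarch64_      arm64
#endif

/* Stand-in identifiers for both naming schemes so the sketch compiles alone. */
enum demo_ids {
	ARM64_INS_FOO = 7,
	AArch64_INS_FOO = 7,
	DEMO_ARCH_ARM64 = 1,
	DEMO_ARCH_AARCH64 = 1
};

/* Stand-in for a detail struct whose member name differs between versions. */
struct demo_detail {
	int arm64;
	int aarch64;
};

static int demo_insn_id(void) { return DEMO_AARCH64(_INS_FOO); }
static int demo_arch(void) { return DEMO_AARCH64pre(DEMO_ARCH_); }
static int demo_member(const struct demo_detail *d) { return d->DEMO_aarch64_; }
```

The object-like `DEMO_aarch64_` form avoids invoking a function-like macro with an empty argument list, which is the pattern some preprocessors warn about.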

@XVilka
Member

XVilka commented Dec 17, 2023

> @XVilka Capstone-sys fails because the macros for meta-programming (CS_AARCH64() and the like) are not yet released and in the repos. Once this is the case, the test will succeed again. We could add a header file with those macros until the new v5 and v4 release is in the repos?

Adding a header is fine. Otherwise, we would have to wait for months or years.

@XVilka
Member

XVilka commented Dec 17, 2023

@Rot127 please rebase it too

@XVilka
Member

XVilka commented Dec 17, 2023

@XVilka
Member

XVilka commented Dec 18, 2023

@Rot127 some errors still persist:

./librz/analysis/arch/arm/arm_il64.c(2099): error C2121: '#': invalid character: possibly the result of a macro expansion
../librz/analysis/arch/arm/arm_il64.c(2099): error C2059: syntax error: 'if'
../librz/analysis/arch/arm/arm_il64.c(2109): warning C4003: not enough arguments for function-like macro invocation 'CS_aarch64'
../librz/analysis/arch/arm/arm_il64.c(2117): warning C4003: not enough arguments for function-like macro invocation 'CS_aarch64'

Member

@XVilka XVilka left a comment

Feel free to squash-merge this PR with a good commit message.

Ignore the red OpenBSD check; it's a timeout:

[XX] TIMEOUT db/rzil/arm32 emulateme_vfp
RZ_NOPLUGINS=1 /home/build/bin/rizin -escr.utf8=0 -escr.color=0 -escr.interactive=0 -eflirt.sigdb.load.system=false -eflirt.sigdb.load.home=false -N -Qc 'o malloc://0x1000 0x40000
o malloc://0x10 0x50000
# `o` resets the cpu for analysis.cpu
s main
e asm.bits=32
e asm.cpu=v8
aezi
e io.cache=1
ar sp=0x41000
aezsu 0x10548 # after vsub
ar d7
aezsu 0x10554 # after vmul
ar d7
aezs # after vadd
ar d7
aezs # after vcvt
ar d16
aezsu 0x1058c # before printf
ar r2
ar r3
pxq 16 @ sp
' bins/elf/emulateme_vfp.arm32
-- exit status: -1

The rest of CI is green. Good job!

This commit refactors the AArch64 plugin, and partially the ARM
plugin, to make them compatible with Capstone v6.
The big API changes in Capstone v6 required several changes.

Because we need to stay compatible with Capstone v4 and v5,
many include guards were added as well.

Overview of changes done:

**ARM**

- Instruction aliases were introduced. This leads to different
decoding and analysis paths being taken for certain instructions.
Some aliases now have their IL code generated like the real
instruction (no special handling needed anymore).
This change is responsible for many of the changes you'll encounter.

- The operand details of each instruction are now always those
of the real instruction, also for aliases.
For example, if `MOV <Wd>, #<imm>` is an alias for `ORR <Wd>, WZR, #<imm>`,
the details from CS hold all three operands of `ORR`.
Before, they held only the two of `MOV`.

- Several bugs in variable and argument generation
were fixed. In particular, the default variable width
for ARM Thumb was changed from 16 bits to 32 bits.
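
A minimal sketch of what the operand-detail change means for consumer code, assuming a simplified, hypothetical operand model (the `demo_*` types below are not Capstone's `cs_detail` layout). With alias details the immediate of `MOV` is operand 1. With real-instruction details it is operand 2 of `ORR`, so code that hard-codes an operand index has to be adapted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified operand model (not Capstone's real structs). */
typedef enum { DEMO_OP_REG, DEMO_OP_IMM } demo_op_type;

typedef struct {
	demo_op_type type;
	int64_t value; /* register number or immediate value */
} demo_op;

typedef struct {
	size_t op_count;
	demo_op ops[4];
} demo_detail;

/*
 * Return the immediate of a MOV-immediate for both layouts:
 *   alias details: MOV Wd, #imm       -> { Wd, imm }
 *   real details:  ORR Wd, WZR, #imm  -> { Wd, WZR, imm }
 * The immediate is the last operand in either case.
 */
static int64_t demo_mov_imm(const demo_detail *d) {
	assert(d->op_count == 2 || d->op_count == 3);
	const demo_op *last = &d->ops[d->op_count - 1];
	assert(last->type == DEMO_OP_IMM);
	return last->value;
}
```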

**AArch64/ARM64**

The changes listed above for ARM also apply to AArch64.

Additionally:

- Capstone v6 renamed ARM64 to AArch64 everywhere.
To stay compatible with Capstone v4/v5, AArch64 names must be wrapped
in macros which resolve the name depending on the CS version used.

- Capstone v6 is now more consistent with
real and alias register names.
From now on we use the register alias by default.

**List of squashed commit messages:**

[AArch64 CS v6 BEGIN] Change subproject config to use cs-auto-sync-aarch64 branch

Replace ARM64 with version sensitive macros.

Exclude alias if CS version >= 6

Update access to writeback member

Exclude instr alias from inclusion

Update memory operand printing to json.

Enable real instr. detail only for AArch64

Set correct arch name in meson.build for CS

Fix U/SBFM instructions and their alias.

Mark parameters with RZ_OUT/BORROW

Optimize register extension to skip some, if the width already matches.

Adapt width and lsb of U/SBFM alias instructions (ImmR and ImmS are from U/SBFM).

Fix tests with correct semantics but bad syntax

Pass alias MOV instructions to mov()

Handle CSET and CSETM alias

Fix lsl, lsr and asr by handling them as alias.

Fix mov alias.

Handle TST alias

Fix CNEG, CINV alias

Fix bfi and bfxil alias.

Fix sign extensions.

Fix compare instructions.

Fix NEG, NGC, NGCS, NEGS, MVN

Fix CINC

Fix multiply instructions.

Fix ROR

Run clang-format

Handle CMP for ESIL

Handle new position of memory displacements of post-index operands.

Fix post-index operations.

Add missing writeback checks for Post and preindex

Handle UBFM and SBFM alias

Handle BFM alias

Handle CMP, CSET and CINC alias

Update meson file for cs-aarch64 branch

Fix asm tests. Use reg alias now.

Fix condition confusion and incorrect operand usage.

Fix plf test.

Run clang-format

Use register alias in tests

Add support for fp and lr reg alias assembly.

Use reg alias in test

Rename cond translate functions r2 -> rz

Fix condition check which assume 0 == invalid.

Fix issues introduced by rebase

Set CS commit to current next branch.

Rename ARM64 -> AArch64

Add missing source file to meson.build

Remove DisassemblerExtension.c file for CS v5

Update to newest capstone next branch

Bump up CS version

REVERT ME: Get Capstone v4/v5 via git clone until new tars are released.

Wrap setting of CS_DETAIL_REAL into CS version check

Add maybe-uninitialized to Capstone C args.

Fix CS pre v6 build by adding guards.

Use reg alias now printed by default by CS.

Bump CS version to most recent next.

Fix build errors due to stricter alias handling in ARM.

Fix RzIL tests introduced by alias introduction to ARM.

Fix ESIL bugs introduced with ARM alias introduction.

- stackptr hasn't been set for POP and PUSH

Add support again for Thumb1 pop/push

Handle PUSHW and POPW alias

Update test case

Add more POP and PUSH alias and enrich detail for other versions of them.

Fix incorrect mem access width guesses for ARM thumb.

Set POP return info if it writes to PC

Fix tests about default var size and POP mem write direction.

Bump CS version to newest next.

Fix incorrect tests.

- TriCore: Functions were in ro section.
- Default arg width in ARM thumb is 32bit.

Revert check for a set stackptr.

stackptr is used in different ways:
1. It saves the offset from the stack frame base.
2. It is interpreted as something else for x86, and I cannot find out what in a reasonable time.

Hence we cannot use it here consistently.

Remove check for non existing ARM_GRP_RET in CSv5

Fix incorrect stack offsets of variables.

'push <reg-list>' instructions for which the second register was the FP
reset the stackptr variable to 0. This led to wrong bp offsets in the variable names.
In this case the offset was +0xc.

Bump CS version.

Add copy of meta-programming macros for capstone-sys build.

Update capstone-next.wrap

Use bracket-less meta-programming macro to fix Windows build warnings.

Update wrap files for Capstone with branch names

Add new meta-programming macro

Add workaround for MSVC pre-processor bug.
@XVilka XVilka merged commit 9b227eb into dev Dec 19, 2023
56 of 57 checks passed
@XVilka XVilka deleted the dist-capstone-v6-aarch64 branch December 19, 2023 17:00
Development

Successfully merging this pull request may close these issues.

Pointer authentication-related instructions parsed as "invalid"