[AArch64] Variety of incorrect disassembly info (mainly SVE) #2472

FinnWilkinson · 2024-09-06T12:39:28Z

Work environment

Questions	Answers
OS/arch/bits	MacOS AArch64
Architecture	armv8
Source of Capstone	`git clone`
Version/git commit	`df72286`

Sorry for not using the template fully, and sorry in advance for the long issue, but I have identified a fair few instructions (mainly SVE) with incorrect access types, and others with incorrect implicit register reads / writes or no immidiate encoding.

Incorrect implicit destinations

Opcode AArch64_MRS (0x42d03bd5) currently has the NZCV register as an implicit write register. This isn't correct
Opcode AArch64_BLR (0x20003fd6) has an implicit read of SP. Again this isn't correct.

Incorrect access permissions

For this, for SVE instructions which have the format of fsub zdn, pg, zdn, zm - that is, where the 1st and 2nd z-regs must be the same - both operands have their access set to READ | WRITE.

Although this is technically correct, as register zdn is being written to and read from, I think it can be confusing. My reasoning for this is that operand[0] represents the register being written to, so its access should be just WRITE. Then operand[2] (in the example above) is the first source vector and should be READ.
If someone does not know that the destination vector register and the first source vector are mandated by the ISA spec to be the same register, then it could be confusing to see 2 registers being written to.

Here is an examples of this occuring:

 0  20 00 19 04  eor	z0.b, p0/m, z0.b, z1.b
	ID: 311 (eor)
	op_count: 4
		operands[0].type: REG = z0
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x8
		operands[1].type: PREDICATE
		operands[1].pred.reg: p0
		operands[1].access: READ
		operands[2].type: REG = z0
		operands[2].access: READ | WRITE
			Vector Arrangement Specifier: 0x8
		operands[3].type: REG = z1
		operands[3].access: READ
			Vector Arrangement Specifier: 0x8
	Write-back: True
	Registers read: z0 p0 z1
	Registers modified: z0
	Groups: HasSVEorSME

Which could be instead this:

 0  20 00 19 04  eor	z0.b, p0/m, z0.b, z1.b
	ID: 311 (eor)
	op_count: 4
		operands[0].type: REG = z0
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x8
		operands[1].type: PREDICATE
		operands[1].pred.reg: p0
		operands[1].access: READ
		operands[2].type: REG = z0
		operands[2].access: READ
			Vector Arrangement Specifier: 0x8
		operands[3].type: REG = z1
		operands[3].access: READ
			Vector Arrangement Specifier: 0x8
	Write-back: True
	Registers read: z0 p0 z1
	Registers modified: z0
	Groups: HasSVEorSME

I don't have a comprehensive list of all instructions that are effected by this, but it generally seems to be SVE only and with the format Zdn, pg, zdn, <zm|#imm>.
Below is a list of opcode enums (and some bytecodes where I've made a note of them) of the ones I have run into so far:

AArch64_FSUB_ZPmI_D
AArch64_FSUB_ZPmI_H
AArch64_FSUB_ZPmI_S // Example bytecode - 00849965
AArch64_FMUL_ZPmI_D
AArch64_FMUL_ZPmI_H
AArch64_FMUL_ZPmI_S // Example bytecode - 00809a65
AArch64_FADD_ZPmI_D // Example bytecode - 0584d865
AArch64_FADD_ZPmI_H
AArch64_FADD_ZPmI_S
AArch64_AND_ZPmZ_D // Example bytecode - 4901da04
AArch64_AND_ZPmZ_H
AArch64_AND_ZPmZ_S
AArch64_AND_ZPmZ_B
AArch64_SMULH_ZPmZ_B // Example bytecode - 20001204
AArch64_SMULH_ZPmZ_D
AArch64_SMULH_ZPmZ_H
AArch64_SMULH_ZPmZ_S
AArch64_SMIN_ZPmZ_B
AArch64_SMIN_ZPmZ_D
AArch64_SMIN_ZPmZ_H
AArch64_SMIN_ZPmZ_S // Example bytecode - 01008a04
AArch64_SMAX_ZPmZ_B
AArch64_SMAX_ZPmZ_D
AArch64_SMAX_ZPmZ_H
AArch64_SMAX_ZPmZ_S // Example bytecode - 01008804
AArch64_MUL_ZPmZ_B // Example bytecode - 40001004
AArch64_MUL_ZPmZ_D
AArch64_MUL_ZPmZ_H
AArch64_MUL_ZPmZ_S
AArch64_FSUBR_ZPmZ_D
AArch64_FSUBR_ZPmZ_H
AArch64_FSUBR_ZPmZ_S // Example bytecode - 24808365
AArch64_FSUB_ZPmZ_D
AArch64_FSUB_ZPmZ_H
AArch64_FSUB_ZPmZ_S // Example bytecode - 24808165
AArch64_FMUL_ZPmZ_D
AArch64_FMUL_ZPmZ_H
AArch64_FMUL_ZPmZ_S // Example bytecode - 83808265
AArch64_FDIV_ZPmZ_D // Example bytecode - 0184cd65
AArch64_FDIV_ZPmZ_H
AArch64_FDIV_ZPmZ_S
AArch64_FDIVR_ZPmZ_D // Example bytecode - 0184cc65
AArch64_FDIVR_ZPmZ_H
AArch64_FDIVR_ZPmZ_S
AArch64_FADDA_VPZ_D
AArch64_FADDA_VPZ_H
AArch64_FADDA_VPZ_S // Example bytecode - 01249865
AArch64_FADD_ZPmZ_D // Example bytecode - 6480c065
AArch64_FADD_ZPmZ_H
AArch64_FADD_ZPmZ_S
AArch64_FCADD_ZPmZ_D // Example bytecode - 2080c064
AArch64_FCADD_ZPmZ_H
AArch64_FCADD_ZPmZ_S
AArch64_ADD_ZPmZ_B // Example bytecode - 00000004
AArch64_ADD_ZPmZ_D
AArch64_ADD_ZPmZ_H
AArch64_ADD_ZPmZ_S
AArch64_EOR_ZPmZ_B // Example bytecode - 20001904
AArch64_EOR_ZPmZ_D
AArch64_EOR_ZPmZ_H
AArch64_EOR_ZPmZ_S

Similar has also been seen with unpredicated SVE instructions where operand[0] and operand[1] must be the same SVE vector register:

AArch64_SMAX_ZI_B
AArch64_SMAX_ZI_D
AArch64_SMAX_ZI_H
AArch64_SMAX_ZI_S // Example bytecode - 03c0a825
AArch64_AND_ZI // Example bytecode - 00068005
AArch64_ADD_ZI_B // Example bytecode - 00c12025
AArch64_ADD_ZI_D
AArch64_ADD_ZI_H
AArch64_ADD_ZI_S

Incorrect access permissions pt. 2

There are some other instructions I have found with wrong access information.

AArch64_CASALX and AArch64_CASALW // Example bytecode - 02fce188

 0  02 fc e1 88  casal	w1, w2, [x0]
	ID: 127 (casal)
	op_count: 3
		operands[0].type: REG = w1
		operands[0].access: READ | WRITE
		operands[1].type: REG = w2
		operands[1].access: READ
		operands[2].type: MEM
			operands[2].mem.base: REG = x0
		operands[2].access: READ | WRITE
	Write-back: True
	Registers read: w1 w2 x0
	Registers modified: w1 x0
	Groups: HasLSE

All permissions should be READ as no register is updated with CASAL. Also writeback should be False:

 0  02 fc e1 88  casal	w1, w2, [x0]
	ID: 127 (casal)
	op_count: 3
		operands[0].type: REG = w1
		operands[0].access: READ
		operands[1].type: REG = w2
		operands[1].access: READ
		operands[2].type: MEM
			operands[2].mem.base: REG = x0
		operands[2].access: READ
	Registers read: w1 w2 x0
	Registers modified: w1 x0
	Groups: HasLSE

AArch64_FCVTNv4i32 // Example bytecode - 0168614e

 0  01 68 61 4e  fcvtn2	v1.4s, v0.2d
	ID: 367 (fcvtn2)
	op_count: 2
		operands[0].type: REG = q1 (vreg)
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x420
		operands[1].type: REG = q0 (vreg)
		operands[1].access: READ
			Vector Arrangement Specifier: 0x240
	Write-back: True
	Registers read: fpcr q1 q0
	Registers modified: q1
	Groups: HasNEON

operands[0] should be WRITE only. More variants of this instruction may be effected, I just haven't verified this:

 0  01 68 61 4e  fcvtn2	v1.4s, v0.2d
	ID: 367 (fcvtn2)
	op_count: 2
		operands[0].type: REG = q1 (vreg)
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x420
		operands[1].type: REG = q0 (vreg)
		operands[1].access: READ
			Vector Arrangement Specifier: 0x240
	Write-back: True
	Registers read: fpcr q1 q0
	Registers modified: q1
	Groups: HasNEON

Imm not set when a shift is present

For many instructions that take an immidiate, a shift can also optionally be provided. When the shift is not provided, the instructions work fine.
However, when the shift is provided the shift amount is often fixed or in a range. As such, Capstone / LLVM disassembler automatically works out the shifted value. The shifted immidiate is given correctly in the operand string, but is not in the disassembly info.

Example: AArch64_CPY_ZPzI_H: // Example bytecode - 01215005

 0  01 21 50 05  mov	z1.h, p0/z, #0x800
	ID: 273 (cpy)
	Is alias: 1429 (mov) with REAL operand set
	op_count: 4
		operands[0].type: REG = z1
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x10
		operands[1].type: PREDICATE
		operands[1].pred.reg: p0
		operands[1].access: READ
		operands[2].type: IMM = 0x0
		operands[2].access: READ
		operands[3].type: IMM = 0x0
		operands[3].access: READ
	Registers read: p0
	Registers modified: z1
	Groups: HasSVEorSME

Here, there is an extra operand in operand[3], and the imm is not set:

 0  01 21 50 05  mov	z1.h, p0/z, #0x800
	ID: 273 (cpy)
	Is alias: 1429 (mov) with REAL operand set
	op_count: 4
		operands[0].type: REG = z1
		operands[0].access: WRITE
			Vector Arrangement Specifier: 0x10
		operands[1].type: PREDICATE
		operands[1].pred.reg: p0
		operands[1].access: READ
		operands[2].type: IMM = 0x800
		operands[2].access: READ
	Registers read: p0
	Registers modified: z1
	Groups: HasSVEorSME

An alternative assembly for this instruction (and the one I used to generate the bytecode) is cpy z1.h, p0/z, #8, lsl #8, where the only LSL available is by #8.

The instructions we have found to be effected are:

AArch64_ADD_ZI_B
AArch64_ADD_ZI_D
AArch64_ADD_ZI_H
AArch64_ADD_ZI_S
AArch64_CPY_ZPzI_B
AArch64_CPY_ZPzI_D
AArch64_CPY_ZPzI_H
AArch64_CPY_ZPzI_S

This issue is likely to effect all instructions which use immidiates and optional shifts in this way.

FP immidate not shown in disassembly information

For instructions which take a fixed floating point immidiate value, it is correctly identified that one exists, and the EXACTFPIMM field is populated. But, we also have the .fp field in the cs_aarch64_op union. It could be useful to also populate this field as well as the enum for better clarity and improved in-project usage.

Example: AArch64_FADD_ZPmI_D // Example bytecode - 0584d865

 0  05 84 d8 65  fadd	z5.d, p1/m, z5.d, #0.5
	ID: 332 (fadd)
	op_count: 4
		operands[0].type: REG = z5
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
		operands[1].type: PREDICATE
		operands[1].pred.reg: p1
		operands[1].access: READ
		operands[2].type: REG = z5
		operands[2].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
		operands[3].type: SYS IMM:
		operands[3].subtype EXACTFPIMM = 1
	Write-back: True
	Registers read: z5 p1
	Registers modified: z5
	Groups: HasSVEorSME

Could be

 0  05 84 d8 65  fadd	z5.d, p1/m, z5.d, #0.5
	ID: 332 (fadd)
	op_count: 4
		operands[0].type: REG = z5
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
		operands[1].type: PREDICATE
		operands[1].pred.reg: p1
		operands[1].access: READ
		operands[2].type: REG = z5
		operands[2].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
		operands[3].type: SYS IMM:
		operands[3].subtype EXACTFPIMM = 1
                 operands[3].fp = 0.5
	Write-back: True
	Registers read: z5 p1
	Registers modified: z5
	Groups: HasSVEorSME

Thanks in advance!

The text was updated successfully, but these errors were encountered:

Rot127 · 2024-09-06T13:03:26Z

I am so grateful you guys use AArch64 so early and are so thoroughly checking it. I cannot thank you enough! SVE and SME2 added so much more complexity, this part of the module will be the best tested of all Capstone :D

Now the comments to the issues (working backwards):

FP immediate issue

Good idea! Going to add and document it. But this means also that the system operand is moved out of the union of operands. But this is probably a good idea anyways. Maybe people have similar requests for other sys ops in the future.

Missing shifts

There seems something broken how the shift details are added. Will fix it.

All the other access issues and false impl. read/writes

Those are (almost always) faulty definitions in the LLVM files. Found many of those in the last months (especially for AArch64 :( ). Will patch it in our fork. So I really really appreciate that you went through the effort to list all effected instructions!

I will fix this separately to the ones above. Because they are the same problem category.

FinnWilkinson · 2024-09-06T14:27:13Z

I've been eagerly awaiting the LLVM 18 aarch64 update to Capstone for a while as I do research into SME, so will let you guys know if I run into anything else!

As for the access issues, wrong implicit read/write - I'm sure there are many more effected than the ones I've listed. If there isn't a universal fix then we will be sure to let you know as and when we find more!

FinnWilkinson · 2024-09-06T14:32:01Z

Also, an alternative to moving the sysop outside the union could be to add a fp field to aarch64_sysop_imm? Although this may be a little confusing?

Rot127 · 2024-09-06T14:44:28Z

I've been eagerly awaiting the LLVM 18 aarch64

Just to ensure we are on the same page. AArch64 on the next branch is already based on LLVM-18. ARM not though.

Although this may be a little confusing?

Yeah. Minimizing memory usage with enabled details is out of scope currently. There was some thinking about an v2 API which tries to save memory were possible. But this is something for v7.

Rot127 · 2024-09-07T11:46:08Z

Opcode AArch64_MRS (0x42d03bd5) currently has the NZCV register as an implicit write register. This isn't correct

LLVM doesn't distinguish apparently between MSR instruction which set it and which not:

  // The MRS is set as a NZCV setting instruction. Not all MRS instructions
  // require doing this. The alternative was to explicitly model each one, but
  // it feels like it is unnecessary because it seems there are no negative
  // consequences setting these flags for all.
  let Defs = [NZCV];

I removed it, because we are more often right than wrong. But are you aware of any way to test this?

Rot127 · 2024-09-07T14:23:31Z

Some more comments regarding the flawed access information:

casal

The memory operand should be RW. The access property is in respect to the memory operand, not for the single register. And CASAL does write (conditionally) to the memory, so the last on is correct.
Write-back and the source register are wrong of course.

fcvtn2

Weird one. LLVM defines them with the incorrect instruction format, as a three-vector instruction instead as a two vector instruction.
So the output operand is also marked as input.

eor and the others

Fixing this, requires to track which MCOperands (LLVM operands) are used for which Capstone operand. This is not yet implemented (sorry, forgot to put it onto the todo list).

As described in the chat, currently we check for tied operands and their access attributes.
If two operands are tied (are physically the same, but logical distinct in the disassembler), we set both access flags.

This is necessary, because write back registers are given the RW property like this:

For a register which is read and written there are effectively two distinct logical operands for it. One "invisible" "written" register (which is never printed into the asm text) and the register which is printed/added to the details (and marked as read).

Because of this, registers which use the same register twice, once for reading and once for writing AND have the name twice in the asm text, now are marked both as READ_WRITE.

Implementing this would also fix the issue that the memory can be written, but none of the operands forming the address is. But up until now, the registers nonetheless appear in the regs_write list.

I would fix (partly) the casal instruction before the Alpha release. But the others take too much time.

eor and others require new code and fcvtn2 is so annoyingly defined that I need to make sure to not break something accidentally by changing the definition.

~~Especially fcvtn2 is a lot of effort, because I need to cross check with the ISA and the (not clearly defined) formats all the time.~~

-> Seems to have happened because this specific one is added with the 2023 extension. There are others which are correct. But Id like to move it as well after the Alpha release and fix it in a patch release.

Do you have experience with TableGen? Maybe you could fix it faster? Search for "fcvtn" (with ") in AArch6464InstrInfo.td. There are multiple definitions.

Sorry that they are not so quick fixes. I will list them as bugs in the release guide for now.

FinnWilkinson · 2024-09-09T11:05:36Z

RE MRS and NZCV, I'm not aware of any MRS instruction that sets the NZCV and I couldn't find anything in the spec either...

For casal I was unaware that the MEM operand access permission denoted if it was a READ or WRITE (or both) to memory! In which case I agree with what you have said.

As for the rest, thanks for explaining how it all works as it gives us a better understanding of how these issues have come about. For our use case we have work arounds, so if it will take a while to formulate a fix then that is fine. We are happy to log these errors so that the project in the long run can be made better.

I unfortunately have no experience with TableGen.... nor the capacity at this time to learn how to implement a fix (sorry!). I appriciate your effort into this and do nto mind too much non-quick fixes given the good communication around why and when (roughly) it can be expected

Rot127 · 2024-09-09T12:39:02Z

We are happy to log these errors so that the project in the long run can be made better.

Especially for AArch64 I would be super grateful if you could continue with these very detailed bug reports. Testing of the details was basically none existent up until now. And AArch64 has something around 100+ unique operand types. So we catch up slowly to get to 100% test coverage over the next releases.

FinnWilkinson · 2024-10-02T09:01:04Z

~~To add (another) bug to the above to add to the todo list for post-Alpha, AArch64_ADD_ZI_B type instructions have the imm set weirdly (example bytecode 00c12025:~~

 0  cb 3e 80 05  and	z11.b, z11.b, #0xfe
	ID: 32 (and)
	Is alias: 1408 (and) with REAL operand set
	op_count: 3
		operands[0].type: REG = z11
		operands[0].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
		operands[1].type: REG = z11
		operands[1].access: READ | WRITE
			Vector Arrangement Specifier: 0x40
		operands[2].type: IMM = 0xfefefefefefefefe
		operands[2].access: READ
	Write-back: True
	Registers read: z11
	Registers modified: z11
	Groups: HasSVEorSME

~~Here, operands[2].type: IMM should be 0xfe not 0xfefefefefefefefe.~~

I'm currently re-implementing the latest version of next into our project today, so I may have a few more bug finds which I'll add here too

Rot127 · 2024-10-02T09:29:49Z

Please open a new issue if you find more. So this one gets not too long. Can be again a collection issue.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AArch64] Variety of incorrect disassembly info (mainly SVE) #2472

[AArch64] Variety of incorrect disassembly info (mainly SVE) #2472

FinnWilkinson commented Sep 6, 2024

Rot127 commented Sep 6, 2024

FinnWilkinson commented Sep 6, 2024 •

edited

Loading

FinnWilkinson commented Sep 6, 2024 •

edited

Loading

Rot127 commented Sep 6, 2024

Rot127 commented Sep 7, 2024

Rot127 commented Sep 7, 2024 •

edited

Loading

FinnWilkinson commented Sep 9, 2024

Rot127 commented Sep 9, 2024

FinnWilkinson commented Oct 2, 2024 •

edited

Loading

This comment was marked as resolved.

This comment was marked as resolved.

Rot127 commented Oct 2, 2024 •

edited

Loading

[AArch64] Variety of incorrect disassembly info (mainly SVE) #2472

[AArch64] Variety of incorrect disassembly info (mainly SVE) #2472

Comments

FinnWilkinson commented Sep 6, 2024

Work environment

Incorrect implicit destinations

Incorrect access permissions

Incorrect access permissions pt. 2

AArch64_CASALX and AArch64_CASALW // Example bytecode - 02fce188

AArch64_FCVTNv4i32 // Example bytecode - 0168614e

Imm not set when a shift is present

FP immidate not shown in disassembly information

Rot127 commented Sep 6, 2024

FinnWilkinson commented Sep 6, 2024 • edited Loading

FinnWilkinson commented Sep 6, 2024 • edited Loading

Rot127 commented Sep 6, 2024

Rot127 commented Sep 7, 2024

Rot127 commented Sep 7, 2024 • edited Loading

FinnWilkinson commented Sep 9, 2024

Rot127 commented Sep 9, 2024

FinnWilkinson commented Oct 2, 2024 • edited Loading

This comment was marked as resolved.

This comment was marked as resolved.

Rot127 commented Oct 2, 2024 • edited Loading

FinnWilkinson commented Sep 6, 2024 •

edited

Loading

FinnWilkinson commented Sep 6, 2024 •

edited

Loading

Rot127 commented Sep 7, 2024 •

edited

Loading

FinnWilkinson commented Oct 2, 2024 •

edited

Loading

Rot127 commented Oct 2, 2024 •

edited

Loading