[RISCV][MC] Fix >32bit .insn Directives #111878

lenary · 2024-10-10T17:52:32Z

The original patch had a reasonably significant bug. You could not use .insn to assemble encodings that had any bits set above the low 32 bits. This is due to the fact that getMachineOpValue was truncating the immediate value, and I did not commit enough tests of useful cases.

This changes the result of getMachineOpValue to be able to return the 48-bit and 64-bit immediates needed for the wider .insn directives.

I took the opportunity to move some of the test cases around in the file to make looking at the output of llvm-objdump a little clearer.

The original patch had a reasonably significant bug. You could not use `.insn` to assemble encodings that had any bits set above the low 32 bits. This is due to the fact that `getMachineOpValue` was truncating the immediate value, and I did not commit enough tests of useful cases. This changes the result of `getMachineOpValue` to be able to return the 48-bit and 64-bit immediates needed for the wider `.insn` directives. I took the opportunity to move some of the test cases around in the file to make looking at the output of `llvm-objdump` a little clearer.

llvmbot · 2024-10-10T17:52:54Z

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-risc-v

Author: Sam Elliott (lenary)

Changes

The original patch had a reasonably significant bug. You could not use .insn to assemble encodings that had any bits set above the low 32 bits. This is due to the fact that getMachineOpValue was truncating the immediate value, and I did not commit enough tests of useful cases.

This changes the result of getMachineOpValue to be able to return the 48-bit and 64-bit immediates needed for the wider .insn directives.

I took the opportunity to move some of the test cases around in the file to make looking at the output of llvm-objdump a little clearer.

Full diff: https://github.com/llvm/llvm-project/pull/111878.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp (+3-3)
(modified) llvm/test/MC/RISCV/insn.s (+29-6)

diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
index 66970ed37f2724..54f1a3899c4957 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVMCCodeEmitter.cpp
@@ -77,7 +77,7 @@ class RISCVMCCodeEmitter : public MCCodeEmitter {
 
   /// Return binary encoding of operand. If the machine operand requires
   /// relocation, record the relocation and return zero.
-  unsigned getMachineOpValue(const MCInst &MI, const MCOperand &MO,
+  uint64_t getMachineOpValue(const MCInst &MI, const MCOperand &MO,
                              SmallVectorImpl<MCFixup> &Fixups,
                              const MCSubtargetInfo &STI) const;
 
@@ -375,7 +375,7 @@ void RISCVMCCodeEmitter::encodeInstruction(const MCInst &MI,
   ++MCNumEmitted; // Keep track of the # of mi's emitted.
 }
 
-unsigned
+uint64_t
 RISCVMCCodeEmitter::getMachineOpValue(const MCInst &MI, const MCOperand &MO,
                                       SmallVectorImpl<MCFixup> &Fixups,
                                       const MCSubtargetInfo &STI) const {
@@ -384,7 +384,7 @@ RISCVMCCodeEmitter::getMachineOpValue(const MCInst &MI, const MCOperand &MO,
     return Ctx.getRegisterInfo()->getEncodingValue(MO.getReg());
 
   if (MO.isImm())
-    return static_cast<unsigned>(MO.getImm());
+    return MO.getImm();
 
   llvm_unreachable("Unhandled expression!");
   return 0;
diff --git a/llvm/test/MC/RISCV/insn.s b/llvm/test/MC/RISCV/insn.s
index e32fec25bb16b4..d24f4fe8b36374 100644
--- a/llvm/test/MC/RISCV/insn.s
+++ b/llvm/test/MC/RISCV/insn.s
@@ -170,17 +170,40 @@ target:
 # CHECK-OBJ: <unknown>
 .insn 6, 0x1f
 
-# CHECK-ASM: .insn 0x4, 65503
-# CHECK-ASM: encoding: [0xdf,0xff,0x00,0x00]
-# CHECK-OBJ: <unknown>
-.insn 0xffdf
-
 # CHECK-ASM: .insn 0x8, 63
 # CHECK-ASM: encoding: [0x3f,0x00,0x00,0x00,0x00,0x00,0x00,0x00]
 # CHECK-OBJ: <unknown>
 .insn 8, 0x3f
 
+# CHECK-ASM: .insn 0x6, 281474976710623
+# CHECK-ASM: encoding: [0xdf,0xff,0xff,0xff,0xff,0xff]
+# CHECK-OBJ: <unknown>
+.insn 0x6, 0xffffffffffdf
+
+# CHECK-ASM: .insn 0x8, -65
+# CHECK-ASM: encoding: [0xbf,0xff,0xff,0xff,0xff,0xff,0xff,0xff]
+# CHECK-OBJ: <unknown>
+.insn 0x8, 0xffffffffffffffbf
+
+odd_lengths:
+# CHECK-ASM-LABEL: odd_lengths:
+# CHECK-OBJ-LABEL: <odd_lengths>:
+
+## These deliberately disagree with the lengths objdump expects them to have, so
+## keep them at the end so that the disassembled instruction stream is not out
+## of sync with the encoded instruction stream. We don't check for `<unknown>`
+## as we could get any number of those, so instead check for the encoding
+## halfwords. These might be split into odd 16-bit chunks, so each chunk is on
+## one line.
+
+# CHECK-ASM: .insn 0x4, 65503
+# CHECK-ASM: encoding: [0xdf,0xff,0x00,0x00]
+# CHECK-OBJ: ffdf
+# CHECK-OBJ: 0000
+.insn 0xffdf
+
 # CHECK-ASM: .insn 0x4, 65471
 # CHECK-ASM: encoding: [0xbf,0xff,0x00,0x00]
-# CHECK-OBJ: <unknown>
+# CHECK-OBJ: ffbf
+# CHECK-OBJ: 0000
 .insn 0xffbf

lenary · 2024-10-10T17:53:28Z

🤦‍♂️ This was a really dumb bug, sorry.

asb · 2024-10-10T18:19:49Z

🤦‍♂️ This was a really dumb bug, sorry.

This reviewer didn't catch it either 🤦‍♂️

Change LGTM.

The original patch had a reasonably significant bug. You could not use `.insn` to assemble encodings that had any bits set above the low 32 bits. This is due to the fact that `getMachineOpValue` was truncating the immediate value, and I did not commit enough tests of useful cases. This changes the result of `getMachineOpValue` to be able to return the 48-bit and 64-bit immediates needed for the wider `.insn` directives. I took the opportunity to move some of the test cases around in the file to make looking at the output of `llvm-objdump` a little clearer.

lenary added the backend:RISC-V label Oct 10, 2024

lenary requested review from asb, apazos, topperc and svs-quic October 10, 2024 17:52

llvmbot added the mc Machine (object) code label Oct 10, 2024

asb approved these changes Oct 10, 2024

View reviewed changes

lenary mentioned this pull request Oct 10, 2024

[RISCV] Support Expressions in .insn Directives #111893

Merged

lenary merged commit b5ea5be into llvm:main Oct 11, 2024
11 checks passed

lenary deleted the pr/64-48-insn-with-upper-bits branch October 11, 2024 11:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RISCV][MC] Fix >32bit .insn Directives #111878

[RISCV][MC] Fix >32bit .insn Directives #111878

lenary commented Oct 10, 2024

llvmbot commented Oct 10, 2024 •

edited

Loading

lenary commented Oct 10, 2024

asb commented Oct 10, 2024

[RISCV][MC] Fix >32bit .insn Directives #111878

[RISCV][MC] Fix >32bit .insn Directives #111878

Conversation

lenary commented Oct 10, 2024

llvmbot commented Oct 10, 2024 • edited Loading

lenary commented Oct 10, 2024

asb commented Oct 10, 2024

llvmbot commented Oct 10, 2024 •

edited

Loading