From d4af6f2559a6df93a5567fee616b6b86e7efd4df Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Eymen=20=C3=9Cnay?= Date: Sun, 5 May 2024 21:02:01 +0300 Subject: [PATCH 01/13] Fix the typo in Operation Definition for R_ARM_REL32 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit R_ARM_REL32 relocation's operation is defined as ((S + A) | T) – P but an extra "|" is left in the current version. --- aaelf32/aaelf32.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/aaelf32/aaelf32.rst b/aaelf32/aaelf32.rst index 85aba1a..976448d 100644 --- a/aaelf32/aaelf32.rst +++ b/aaelf32/aaelf32.rst @@ -1767,7 +1767,7 @@ The following nomenclature is used for the operation: +---------+----------------------------------+------------+---------------+----------------------------------------+ | 2 | :code:`R_ARM_ABS32` | Static | Data | :code:`(S + A) | T` | +---------+----------------------------------+------------+---------------+----------------------------------------+ - | 3 | :code:`R_ARM_REL32` | Static | Data | :code:`((S + A) | T) | – P` | + | 3 | :code:`R_ARM_REL32` | Static | Data | :code:`((S + A) | T) – P` | +---------+----------------------------------+------------+---------------+----------------------------------------+ | 4 | :code:`R_ARM_LDR_PC_G0` | Static | Arm | :code:`S + A – P` | +---------+----------------------------------+------------+---------------+----------------------------------------+ From 7b22e8e7191325f8ed8be082575031f4b0b059c1 Mon Sep 17 00:00:00 2001 From: Kerry McLaughlin Date: Thu, 23 May 2024 14:28:06 +0000 Subject: [PATCH 02/13] Document a new SME support routine to query the current value of VG. 
--- aapcs64/aapcs64.rst | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index 66a076b..a6a1527 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -2077,6 +2077,9 @@ support routines: ``__arm_tpidr2_restore`` Provides a simple way of restoring lazily-saved ZA data. +``__arm_get_current_vg`` + Provides a safe way to detect the current value of VG. + ``__arm_sme_state`` ^^^^^^^^^^^^^^^^^^^ @@ -2269,6 +2272,38 @@ a lazy save, with the subroutine having the following properties: * The only memory modified by the subroutine (if any) is stack memory below the incoming SP. +``__arm_get_current_vg`` +^^^^^^^^^^^^^^^^^^^^^^^^ + +**(Beta)** + +Platforms that support SME must provide a subroutine to query the current +value of VG, with the subroutine having the following properties: + +* The subroutine is called ``__arm_get_current_vg``. + +* The subroutine has a `private-ZA`_ `streaming-compatible interface`_ with the + following properties: + + * X1-X15, X19-X29 and SP are call-preserved. + * Z0-Z31 are call-preserved. + * P0-P15 are call-preserved. + * the subroutine `preserves ZA`_. + +* The subroutine does not take any arguments. + +* The subroutine returns an unsigned double word in X0. + +* The subroutine behaves as follows: + + * If the current thread has access to FEAT_SME and PSTATE.SM is 1, the + subroutine returns the value of the streaming VG in X0. + + * Otherwise, if the current thread has access to FEAT_SVE, the subroutine + returns the value of VG in X0. + + * Otherwise, the subroutine returns the value 0 in X0. + Pseudo-code examples ==================== From 5c0a393f50e083ccca9f32ca94995211141fe858 Mon Sep 17 00:00:00 2001 From: Kerry McLaughlin Date: Tue, 28 May 2024 09:35:17 +0000 Subject: [PATCH 03/13] Updated the change history table in aapcs64.rst. 
--- aapcs64/aapcs64.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index a6a1527..1c134f1 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -254,6 +254,7 @@ changes to the content of the document for that release. +------------+--------------------+------------------------------------------------------------------+ | | | - Change the status of the SME support from Alpha to Beta. | | | | - Add soft-float PCS variant. | +| | | - Add the __arm_get_current_vg SME support routine. | +------------+--------------------+------------------------------------------------------------------+ References From 99d4311dc3c1eb0e0850b1f2396af942ba008cce Mon Sep 17 00:00:00 2001 From: Peter Smith Date: Tue, 18 Jun 2024 09:21:52 +0100 Subject: [PATCH 04/13] [aapcs64] Clarify meaning of "it" when preserving z and p regs At least one community got confused as to whether "it" referred to the callee or the caller. Use "subroutine" instead of "it" to make it clear that we are referring to the same subroutine that takes z and p registers as arguments. Fixes https://github.com/ARM-software/abi-aa/issues/266 --- aapcs64/aapcs64.rst | 33 ++++++++++++++++++--------------- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index 1c134f1..776d98e 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -252,9 +252,10 @@ changes to the content of the document for that release. | 2023Q3 | 6\ :sup:`th` | In `Data Types`_ include _BitInt(N) in language mapping. | | | October 2023 | | +------------+--------------------+------------------------------------------------------------------+ -| | | - Change the status of the SME support from Alpha to Beta. | -| | | - Add soft-float PCS variant. | +| 2024Q2 | 18\ :sup:`th` | - Change the status of the SME support from Alpha to Beta. | +| | June 2024 | - Add soft-float PCS variant. | | | | - Add the __arm_get_current_vg SME support routine. 
| +| | | - Clarify use of it when preserving z and p registers. | +------------+--------------------+------------------------------------------------------------------+ References @@ -901,13 +902,14 @@ contents of a single Scalable Vector Type (see `Scalable vectors`_). That is, scalable vector register z0 is an extension of SIMD and Floating-Point register v0. -z0-z7 are used to pass scalable vector arguments to a subroutine, and to -return scalable vector results from a function. If a subroutine takes -at least one argument in scalable vector registers or scalable predicate -registers, or if it is a function that returns results in such registers, -it must ensure that the entire contents of z8-z23 are preserved across -the call. In other cases it need only preserve the low 64 bits of z8-z15, -as described in `SIMD and Floating-Point registers`_. +z0-z7 are used to pass scalable vector arguments to a subroutine, and +to return scalable vector results from a function. If a subroutine +takes at least one argument in scalable vector registers or scalable +predicate registers, or returns results in such registers, the +subroutine must ensure that the entire contents of z8-z23 are +preserved across the call. In other cases it need only preserve the +low 64 bits of z8-z15, as described in `SIMD and Floating-Point +registers`_. Scalable Predicate Registers ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -918,12 +920,13 @@ the scalable vector registers are available (see `Scalable vector registers`_). Each register can store the contents of a Scalable Predicate Type (see `Scalable Predicates`_). -p0-p3 are used to pass scalable predicate arguments to a subroutine and -to return scalable predicate results from a function. If a subroutine takes -at least one argument in scalable vector registers or scalable predicate -registers, or if it is a function that returns results in such registers, -it must ensure that p4-p15 are preserved across the call. 
In other cases -it need not preserve any scalable predicate register contents. +p0-p3 are used to pass scalable predicate arguments to a subroutine +and to return scalable predicate results from a function. If a +subroutine takes at least one argument in scalable vector registers or +scalable predicate registers, or returns results in such registers, +the subroutine must ensure that p4-p15 are preserved across the +call. In other cases it need not preserve any scalable predicate +register contents. SME state --------- From 29f4a4d764bbda1cf00f9d95d387677c38a08a45 Mon Sep 17 00:00:00 2001 From: Peter Smith Date: Tue, 19 Mar 2024 09:18:03 +0000 Subject: [PATCH 05/13] [PAUTHABIELF64] Remove alternative ELF marking scheme No implementation is using the alternative marking scheme; take the opportunity to remove it, so that toolchains do not need to support it. The alternative marking scheme started as the default one back when the spec was written. At the time there were only prototype implementations for which signing schema compatibility and versioning weren't important. Over time Arm's ELF marking for SysV has evolved such that the preference for ELF executables and shared-libraries is to use GNU Program Properties. We intend to migrate to Build Attributes for relocatable objects, but until that change is made GNU Program Properties should be used in relocatable objects too. The alternative ELF marking scheme was preserved for backwards compatibility, however no current implementation needs that so it can be removed. --- pauthabielf64/pauthabielf64.rst | 46 +++++---------------------------- 1 file changed, 7 insertions(+), 39 deletions(-) diff --git a/pauthabielf64/pauthabielf64.rst b/pauthabielf64/pauthabielf64.rst index 65b2e1b..f0d66aa 100644 --- a/pauthabielf64/pauthabielf64.rst +++ b/pauthabielf64/pauthabielf64.rst @@ -236,7 +236,9 @@ changes to the content of the document for that release. | | | DT_AARCH64_VARIANT_PCS. 
| +------------+-----------------------------+------------------------------------------------------------------+ | 2024Q1 | 29\ :sup:`th` January 2024 | Update preferred ELF marking scheme to be GNU property based | - | 2023Q4 | 18\ :sup:`th` March 2024 | Update relocation codes to move out of private experiments space.| + | | 18\ :sup:`th` March 2024 | Update relocation codes to move out of private experiments space.| + | | 19\ :sup:`th` March 2024 | Remove alternative ELF marking scheme. No implementation is | + | | | using it. | +------------+-----------------------------+------------------------------------------------------------------+ References @@ -852,13 +854,6 @@ This document defines the core information that any ELF marking scheme must contain and the base compatibility model that uses that information. -The default ELF marking scheme uses the Program Property note format -defined in (`LINUX_ABI`_). An alternative encoding that uses a Arm -defined Note section called ``.note.AARCH64-PAUTH-ABI-tag`` is defined -for platforms that do not support Program Properties, or have legacy -binaries from earlier versions of this specification. This is described -in `Appendix Alternative ELF Marking Using SHT_NOTE section`_. - Core information ---------------- @@ -922,15 +917,10 @@ Base Compatibility Model A per-ELF file marking scheme permits a coarse way of reasoning about compatibility. -* All reasoning about compatibility is done using the `Core Information`_. - This permits an ELF file using the ``.note.gnu.property`` ELF marking to - be compared to an ELF file using the ``.note.AARCH64-PAUTH-ABI-tag`` ELF - marking. - -* If an ELF file contains multiple ELF markings of the `Core - Information`_, for example it contains both a ``.note.gnu.property`` - section and a ``.note.AARCH64-PAUTH-ABI-tag`` section, then all - must encode the same `Core Information`_. +* All reasoning about compatibility is done using the `Core + Information`_. 
This permits an ELF relocatable object file using + the ``.note.gnu.property`` ELF marking to be compared to an ELF file + using build attributes that encode the `Core Information`_. * The absence of any ELF marking means no information on how pointers are signed is available for this ELF file. When used in combination @@ -1282,25 +1272,3 @@ Some observations: * When not dynamic linking a static linker may choose to encode the pointer signing information in a custom encoding understood by the start-up code used. - -Appendix Alternative ELF Marking Using SHT_NOTE section -======================================================= - -A new section named ``.note.AARCH64-PAUTH-ABI-tag`` of type -``SHT_NOTE`` is defined. This section is structured as a note section -as documented in SCO-ELF_, and its attribute flag ``SHF_ALLOC`` must -be set. - -The ``namesz`` field shall be 4 - -The ``descsz`` field shall be 16. See ``desc`` below. - -The type field shall be ``NT_ARM_TYPE_PAUTH_ABI_TAG``, defined to the -value 1. - -The ``name`` field shall be the null-terminated string ``ARM``. - -The ``desc`` contain 2 64-bit words. With the first 64-bit word being -the ``platform identifier``, and the second 64-bit word being the -``version number``. Both of these form the information required in -`Core Information`_ above. From f1cbe13965e4781f513fd6191cebec3a722effce Mon Sep 17 00:00:00 2001 From: Peter Smith Date: Tue, 2 Apr 2024 18:22:09 +0100 Subject: [PATCH 06/13] [pauthabielf64] Fix typo in relocation name As pointed out in https://github.com/ARM-software/abi-aa/issues/253 the R_AARCH64_AUTH_GOT_LO12_NC is meant to be the AUTH variant of R_AARCH64_LD64_GOT_LO12_NC. As there is also a R_AARCH64_LD32_GOT_LO12_NC relocation, rename the relocation to R_AARCH64_AUTH_LD64_GOT_LO12_NC. These relocations are in the appendix as we are currently expecting the GOT to be RELRO and unsigned in most signing schemas. 
--- pauthabielf64/pauthabielf64.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pauthabielf64/pauthabielf64.rst b/pauthabielf64/pauthabielf64.rst index f0d66aa..134f411 100644 --- a/pauthabielf64/pauthabielf64.rst +++ b/pauthabielf64/pauthabielf64.rst @@ -1143,7 +1143,7 @@ The GOT entries must be relocated by AUTH variant dynamic relocations. | | | | check that –2\ :sup:`32` | | | | | <= X < 2\ :sup:`32` | +-------------+----------------------------------------+----------------------------------+--------------------------+ - | 0x811A | R\_AARCH64\_AUTH\_GOT\_LO12_NC | G(ENCD(GDAT(S + A))) | Set the LD/ST immediate | + | 0x811A | R\_AARCH64\_AUTH\_LD64\_GOT\_LO12_NC | G(ENCD(GDAT(S + A))) | Set the LD/ST immediate | | | | | field to bits [11:3] of | | | | | X. No overflow check; | | | | | check that X&7 = 0 | From 01f97f3dcba1640e7291d09ecb5ec71216f906b1 Mon Sep 17 00:00:00 2001 From: Peter Smith Date: Wed, 3 Apr 2024 08:54:36 +0100 Subject: [PATCH 07/13] [pauthabi64] Add note for R_AARCH64_AUTH_GOT_ADD_LO12_NC There is no equivalent for this relocation in the standard ABI; it is used by runtime code to calculate the address of a GOT slot so it can be used as one of the inputs to an authenticate instruction. Add a note that this matches up with the :got_auth_lo12: operator for future reference. Part of https://github.com/ARM-software/abi-aa/issues/253 --- pauthabielf64/pauthabielf64.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/pauthabielf64/pauthabielf64.rst b/pauthabielf64/pauthabielf64.rst index 134f411..ea93071 100644 --- a/pauthabielf64/pauthabielf64.rst +++ b/pauthabielf64/pauthabielf64.rst @@ -1157,7 +1157,6 @@ The GOT entries must be relocated by AUTH variant dynamic relocations. | | | | value to bits [11:0] of | | | | | X. No overflow check. | +-------------+----------------------------------------+----------------------------------+--------------------------+ - .. 
raw:: pdf PageBreak @@ -1171,7 +1170,9 @@ is the PAuth ABI equivalent of ``R_AARCH64_RELATIVE``. The underlying calculation performed by the dynamic linker is the same, the only difference is that the resulting pointer is signed. The dynamic linker reads the signing schema from the contents of the place of the dynamic -relocation. +relocation. The ``R_AARCH64_AUTH_GOT_ADD_LO12_NC`` relocation is an +addition for the PAuth ABI and has no equivalent in (AAELF64_). It is +used with the ``:got_auth_lo12:`` operator on an add instruction. .. table:: Additional AUTH Dynamic relocations From f14c8ffc67360d4da822d99555c8f83f879a9128 Mon Sep 17 00:00:00 2001 From: Peter Smith Date: Mon, 22 Apr 2024 14:55:03 +0100 Subject: [PATCH 08/13] [PAUTHABIELF64] Add R_AARCH64_AUTH_GOT_ADR_PREL_LO21 relocation With the tiny code model and a signed GOT, an adr instruction is needed to get the address of the GOT entry for input to the authentication. For example: adr x8, :got_auth: symbol ldr x0, [x8] // Authenticate to get unsigned pointer autia x0, x8 The adr requires a new relocation code where there isn't a direct equivalent in the main ABI as there is no need to take the address of the GOT slot when no authentication is required. We define R_AARCH64_AUTH_GOT_ADR_PREL_LO21 for this purpose following the naming convention of R__ADR_PREL_LO21, which is its closest equivalent. --- pauthabielf64/pauthabielf64.rst | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/pauthabielf64/pauthabielf64.rst b/pauthabielf64/pauthabielf64.rst index ea93071..9ff188a 100644 --- a/pauthabielf64/pauthabielf64.rst +++ b/pauthabielf64/pauthabielf64.rst @@ -1069,7 +1069,7 @@ GOT entry as the modifier. The static linker must encode the signing schema into the GOT slot. AUTH variant dynamic relocations must be used for signed GOT entries. -Example Code to access a signed GOT entry +Example Code to access a signed GOT entry with the small code model: .. 
code-block:: asm @@ -1082,6 +1082,18 @@ Example Code to access a signed GOT entry In the example the :got_auth: and :got_auth_lo12: operators result in AUTH variant GOT generating relocations being used. +Example Code to access a signed GOT entry with the tiny code model: + +.. code-block:: asm + + adr x8, :got_auth: symbol + ldr x0, [x8] + // Authenticate to get unsigned pointer + autia x0, x8 + +Compared to the tiny code model without pointer authentication an +additional adr is required to get the address of the GOT entry. + AUTH variant GOT Generating Relocations --------------------------------------- @@ -1157,6 +1169,11 @@ The GOT entries must be relocated by AUTH variant dynamic relocations. | | | | value to bits [11:0] of | | | | | X. No overflow check. | +-------------+----------------------------------------+----------------------------------+--------------------------+ + | 0x811D | R\_AARCH64\_AUTH\_GOT\_ADR\_PREL\_LO21 | G(ENCD(GDAT(S + A))) - P | Set the immediate | + | | | | value to bits[20:0] of X;| + | | | | check that -2\ :sup:`20` | + | | | | <= X < 2\ :sup:`20` | + +-------------+----------------------------------------+----------------------------------+--------------------------+ .. raw:: pdf PageBreak @@ -1170,9 +1187,13 @@ is the PAuth ABI equivalent of ``R_AARCH64_RELATIVE``. The underlying calculation performed by the dynamic linker is the same, the only difference is that the resulting pointer is signed. The dynamic linker reads the signing schema from the contents of the place of the dynamic -relocation. The ``R_AARCH64_AUTH_GOT_ADD_LO12_NC`` relocation is an -addition for the PAuth ABI and has no equivalent in (AAELF64_). It is -used with the ``:got_auth_lo12:`` operator on an add instruction. +relocation. + +The ``R_AARCH64_AUTH_GOT_ADD_LO12_NC`` relocation is an addition for +the PAuth ABI and has no equivalent in (AAELF64_). 
It is + +The ``R_AARCH64_AUTH_GOT_ADR_PREL_LO21`` relocation is used with the +``:got_auth:`` operator on an adr instruction. .. table:: Additional AUTH Dynamic relocations From 11b87eef9646e5f1629bc82db0c224011ac466ac Mon Sep 17 00:00:00 2001 From: John McCall Date: Mon, 1 Jul 2024 11:40:02 -0400 Subject: [PATCH 09/13] [aapcs64] Round up to a multiple of 8, not just to 8 --- aapcs64/aapcs64.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index 776d98e..9c7df5b 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -1952,8 +1952,8 @@ For a caller, sufficient stack space to hold stacked argument values is assumed | | | | C.13 | | +-----------------------+----------------------------------------------------------------------------------------+ - | | The NSAA is rounded up to the larger of 8 or the Natural Alignment of the argument’s | - | | type. | + | | The NSAA is rounded up to the nearest multiple of the larger of 8 or the Natural | + | | Alignment of the argument’s type. | | C.14 | | +-----------------------+----------------------------------------------------------------------------------------+ | | If the argument is a composite type then the argument is copied to memory at the | From 1a448ed798d03a4143508580bb8dffc99c188d7d Mon Sep 17 00:00:00 2001 From: Simon Tatham Date: Mon, 1 Jul 2024 17:19:26 +0100 Subject: [PATCH 10/13] [AAELF64] Clarify how addends work in MOVZ, MOVK and ADRP. This brings AAELF64 into line with AAELF32, which already has a similar clarification for the MOVW+MOVT pair. For the instructions which shift their operand left (ADRP, and the shifted MOVZ and MOVK), if the relocation addend is taken from the input value of the immediate field, it is not treated as shifted. 
The rationale is that this allows a sequence of related instructions to consistently compute the same value (symbol + small offset), and cooperate to load that value into the target register, one small chunk at a time. For example, this would load `mySymbol + 0x123`: mov x0, #0x123 ; R_AARCH64_MOVW_UABS_G0_NC(mySymbol) movk x0, #0x123, lsl #16 ; R_AARCH64_MOVW_UABS_G1_NC(mySymbol) movk x0, #0x123, lsl #32 ; R_AARCH64_MOVW_UABS_G2_NC(mySymbol) movk x0, #0x123, lsl #48 ; R_AARCH64_MOVW_UABS_G3(mySymbol) The existing text made it unclear whether the addends were shifted or not. If they are interpreted as shifted, then nothing useful happens, because the first instruction would load the low 16 bits of `mySymbol+0x123`, and the second would load the next 16 bits of `mySymbol+0x1230000`, and so on. This doesn't reliably get you _any_ useful offset from the symbol, because the relocations are processed independently, so that a carry out of the low 16 bits won't be taken into account in the next 16. If you do need to compute a large offset from the symbol, you have no option but to use SHT_RELA and specify a full 64-bit addend: there's no way to represent that in an SHT_REL setup. But interpreting the SHT_REL addends in the way specified here, you can at least specify _small_ addends successfully. --- aaelf64/aaelf64.rst | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/aaelf64/aaelf64.rst b/aaelf64/aaelf64.rst index c106b47..7fbf12e 100644 --- a/aaelf64/aaelf64.rst +++ b/aaelf64/aaelf64.rst @@ -940,6 +940,15 @@ A ``RELA`` format relocation must be used if the initial addend cannot be encode There is no PC bias to accommodate in the relocation of a place containing an instruction that formulates a PC- relative address. The program counter reflects the address of the currently executing instruction. 
+There are two special cases for forming the initial addend of REL-type relocations where the immediate field cannot normally hold small signed integers: + +* For relocations processing MOVZ and MOVK instructions (including the "MOV (wide immediate)" alias), the initial addend is formed by interpreting the 16-bit literal field of the instruction as a 16-bit signed value in the range -32768 <= A < 32768. The interpretation is the same whether or not the instruction applies a left shift to its immediate: the addend is never treated as shifted. + +* For relocations processing the ADRP instruction, the initial addend is similarly formed by interpreting the literal field of the instruction as a 21-bit signed integer, in the range -1048576 <= A < 1048576. The ADRP instruction's implicit left shift of 12 bits is not applied. + +These special cases permit a sequence of instructions to each add the same small constant to a symbol's value, and extract separate ranges of bits from the sum, so that the instruction sequence as a whole consistently loads the full result of the addition. + +In the case of a sequence using ADRP followed by a 12-bit ADD to set up the low bits of the offset, you can express an offset up to 1048576 in either direction, by writing the full offset in the ADRP's immediate field, and repeating its low 12 bits in the ADD's immediate field. A linker resolving the R_AARCH64_ADD_ABS_LO12_NC relocation on the ADD will not compute the correct overall 64-bit value, but the error will only be in the higher bits, which are discarded by that relocation. 
Relocation types ^^^^^^^^^^^^^^^^ From c3d0c0be74d71c267bbc35853f4bcbf276a455f1 Mon Sep 17 00:00:00 2001 From: Oliver Stannard Date: Fri, 9 Aug 2024 09:35:59 +0100 Subject: [PATCH 11/13] [AAPCS64] Use oxford comma in soft-float ABI --- aapcs64/aapcs64.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index 9c7df5b..28477bd 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -2046,7 +2046,7 @@ The soft-float variant is defined identically to the base variant, with these ch * The definition of ``va_list`` is unchanged. The ``__vr_top`` and ``__vr_offs`` fields may be left uninitialised by ``va_start``, and their values must not be relied on by ``va_arg``. -* The calling convention for Short Vector, Scalable Vector and Scalable Predicate machine types is left undefined. +* The calling convention for Short Vector, Scalable Vector, and Scalable Predicate machine types is left undefined. .. raw:: pdf From 4b25a26b6871a175b111b3dab7eea44cd2106617 Mon Sep 17 00:00:00 2001 From: Ties Stuij Date: Mon, 12 Aug 2024 11:28:33 +0100 Subject: [PATCH 12/13] [NFC] address language suggestions across various documents --- aaelf64/aaelf64.rst | 10 +++++----- aapcs64/aapcs64.rst | 2 +- memtagabielf64/memtagabielf64.rst | 15 +++++++++------ 3 files changed, 15 insertions(+), 12 deletions(-) diff --git a/aaelf64/aaelf64.rst b/aaelf64/aaelf64.rst index 7fbf12e..ee8cd26 100644 --- a/aaelf64/aaelf64.rst +++ b/aaelf64/aaelf64.rst @@ -1870,27 +1870,27 @@ At this time this ABI specifies no generic platform architecture compatibility d Program Property ---------------- -The information on Program Property has been moved to [SYSVABI64_]. +The information on Program Property has moved to [SYSVABI64_]. Program Loading --------------- -The information on program loading has been moved to [SYSVABI64_]. +The information on program loading has moved to [SYSVABI64_]. 
Dynamic Linking --------------- -The information on Dynamic Linking has been moved to [SYSVABI64_]. +The information on Dynamic Linking has moved to [SYSVABI64_]. Dynamic Section --------------- -The information on the Dynamic Section has been moved to [SYSVABI64_]. +The information on the Dynamic Section has moved to [SYSVABI64_]. Custom PLTs ^^^^^^^^^^^^ -The information on custom PLTs has been moved to [SYSVABI64_]. +The information on custom PLTs has moved to [SYSVABI64_]. .. raw:: pdf diff --git a/aapcs64/aapcs64.rst b/aapcs64/aapcs64.rst index 28477bd..e0415ed 100644 --- a/aapcs64/aapcs64.rst +++ b/aapcs64/aapcs64.rst @@ -2282,7 +2282,7 @@ a lazy save, with the subroutine having the following properties: **(Beta)** Platforms that support SME must provide a subroutine to query the current -value of VG, with the subroutine having the following properties: +value of VG, and the subroutine must have the following properties: * The subroutine is called ``__arm_get_current_vg``. diff --git a/memtagabielf64/memtagabielf64.rst b/memtagabielf64/memtagabielf64.rst index 3dec531..83f5fb3 100644 --- a/memtagabielf64/memtagabielf64.rst +++ b/memtagabielf64/memtagabielf64.rst @@ -380,21 +380,24 @@ MemtagABI adds the following processor-specific dynamic array tags: respectively). Binaries compiled with clang-17 and lld-17 produced the dynamic entries with ``DT_AARCH64_MEMTAG_STACK`` as ``d_val`` and ``DT_AARCH64_MEMTAG_GLOBALS`` as ``d_ptr`` as this was the intended semantics, - and they were shipped on Android devices. The values were thus updated to + and they were shipped on Android devices. The values were updated to their more fitting semantic types (without updating the ``Value``), but this does mean that ``DT_AARCH64_MEMTAG_STACK`` is a ``d_val`` with an even ``Value``, and ``DT_AARCH64_MEMTAG_GLOBALS`` is a ``d_ptr`` with an odd ``Value`` (where the normal semantics are ``odd == d_val``, and ``even == - d_ptr``). 
Implementations of dynamic loaders need to be careful to apply these + d_ptr``). Implementations of dynamic loaders should be careful to apply these semantics correctly - notably the load bias should not be applied to ``DT_AARCH64_MEMTAG_STACK``, as it's a ``d_val``, even though the ``Value`` is even. ``DT_AARCH64_MEMTAG_MODE`` indicates the initial MTE mode that should be set. It -has two possible values: ``0``, indicating that the desired MTE mode is -Synchronous, and ``1``, indicating that the desired mode is Asynchronous. This -entry is only valid on the main executable, usage in dynamically loaded objects -is ignored. +has two possible values: + +* ``0``, indicating that the desired MTE mode is Synchronous +* ``1``, indicating that the desired mode is Asynchronous. + +This entry is only valid on the main executable, usage in dynamically loaded +objects is ignored. The presence of the ``DT_AARCH64_MEMTAG_HEAP`` dynamic array entry indicates that heap allocations should be protected with memory tagging. Implementation of From 201a7cb898c443b6d1c7aa3b8abdaaa55ebf02d8 Mon Sep 17 00:00:00 2001 From: Peter Smith Date: Tue, 2 Jul 2024 16:44:29 +0100 Subject: [PATCH 13/13] [aaelf64][pauthabi64] Remove addend in GDAT relocation operation The GDAT(S + A) relocation operation requires a static linker to create a GOT entry for (S + A). Requiring at least one GOT entry for each unique tuple (S, A). Unfortunately no known static linker has implemented this correctly, with one of two forms being implemented instead: * GDAT(S) with the addend ignored. * GDAT(S) + A with a single GOT entry per S, and A added to the value of GDAT(S). These implementations are correct and consistent only for an addend (A) of zero. No known compiler uses non-zero addends in relocations that use the GDAT(S+A) operation, although it is possible to generate them using assembly language. This change synchronizes the ABI with the behavior of existing static linker implementations. 
The benefit of permitting code generators [*] to use a non-zero addend in GDAT(S + A) is judged to be lower than the cost of implementing GDAT(S + A) correctly in existing static linkers, many of which assume that there is a single GOT entry per unique symbol S. It is a quality-of-implementation (QoI) matter whether a static linker gives an error if a non-zero addend is used for a relocation that uses the GDAT(S) operation. Fixes https://github.com/ARM-software/abi-aa/issues/217 Also resolves https://github.com/ARM-software/abi-aa/pull/247 [*] The most common use case for a non-zero addend is in constructing a C++ object with a vtable. The first two entries in the vtable are the offset to top and a pointer to RTTI, so the vtable pointer in the object starts at offset 0x10. This offset can be encoded in the relocation addend. We would save an add instruction for each construction of a C++ object with a vtable if addends were permitted. --- aaelf64/aaelf64.rst | 31 ++++++++++++++++--------------- pauthabielf64/pauthabielf64.rst | 28 ++++++++++++++-------------- 2 files changed, 30 insertions(+), 29 deletions(-) diff --git a/aaelf64/aaelf64.rst b/aaelf64/aaelf64.rst index ee8cd26..c0c0978 100644 --- a/aaelf64/aaelf64.rst +++ b/aaelf64/aaelf64.rst @@ -1036,6 +1036,7 @@ In ELF32 **(Beta)** relocations additional care must be taken when relocating an R__TLSIE_ADR_GOTTPREL_PAGE21, R__TLSDESC_ADR_PAGE21 +Relocations using the ``GDAT(S)`` operation must have a zero addend. Previous versions of this document included the addend ``A`` in ``GDAT(S + A)`` resulting in a GOT entry for ``S + A``. With a zero addend ``GDAT(S + 0)`` is equivalent to ``GDAT(S)`` and ``GDAT(S) + 0``. 
 Static miscellaneous relocations
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -1240,19 +1241,19 @@ The following tables record single instruction relocations and relocations that
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+
  | ELF64 Code | ELF32 Code | Name | Operation | Comment |
  +============+============+================================+===================+======================================================================+
- | 300 | \- | R\_\_MOVW\_GOTOFF\_G0 | G(GDAT(S+A)) -GOT | Set a MOV[NZ] immediate field to bits [15:0] of X (see notes below) |
+ | 300 | \- | R\_\_MOVW\_GOTOFF\_G0 | G(GDAT(S)) -GOT | Set a MOV[NZ] immediate field to bits [15:0] of X (see notes below) |
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+
- | 301 | \- | R\_\_MOVW\_GOTOFF\_G0\_NC | G(GDAT(S+A)) -GOT | Set a MOVK immediate field to bits [15:0] of X. No overflow check |
+ | 301 | \- | R\_\_MOVW\_GOTOFF\_G0\_NC | G(GDAT(S)) -GOT | Set a MOVK immediate field to bits [15:0] of X. No overflow check |
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+
- | 302 | \- | R\_\_MOVW\_GOTOFF\_G1 | G(GDAT(S+A)) -GOT | Set a MOV[NZ] immediate value to bits [31:16] of X (see notes below) |
+ | 302 | \- | R\_\_MOVW\_GOTOFF\_G1 | G(GDAT(S)) -GOT | Set a MOV[NZ] immediate value to bits [31:16] of X (see notes below) |
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+
- | 303 | \- | R\_\_MOVW\_GOTOFF\_G1\_NC | G(GDAT(S+A)) -GOT | Set a MOVK immediate value to bits [31:16] of X. No overflow check |
+ | 303 | \- | R\_\_MOVW\_GOTOFF\_G1\_NC | G(GDAT(S)) -GOT | Set a MOVK immediate value to bits [31:16] of X. No overflow check |
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+
- | 304 | \- | R\_\_MOVW\_GOTOFF\_G2 | G(GDAT(S+A)) -GOT | Set a MOV[NZ] immediate value to bits [47:32] of X (see notes below) |
+ | 304 | \- | R\_\_MOVW\_GOTOFF\_G2 | G(GDAT(S)) -GOT | Set a MOV[NZ] immediate value to bits [47:32] of X (see notes below) |
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+
- | 305 | \- | R\_\_MOVW\_GOTOFF\_G2\_NC | G(GDAT(S+A)) -GOT | Set a MOVK immediate value to bits [47:32] of X. No overflow check |
+ | 305 | \- | R\_\_MOVW\_GOTOFF\_G2\_NC | G(GDAT(S)) -GOT | Set a MOVK immediate value to bits [47:32] of X. No overflow check |
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+
- | 306 | \- | R\_\_MOVW\_GOTOFF\_G3 | G(GDAT(S+A)) -GOT | Set a MOV[NZ] immediate value to bits [63:48] of X (see notes below) |
+ | 306 | \- | R\_\_MOVW\_GOTOFF\_G3 | G(GDAT(S)) -GOT | Set a MOV[NZ] immediate value to bits [63:48] of X (see notes below) |
  +------------+------------+--------------------------------+-------------------+----------------------------------------------------------------------+

 .. note::
@@ -1274,7 +1275,7 @@ The following tables record single instruction relocations and relocations that
  | 308 | \- | R\_\_GOTREL32 | S+A-GOT | Write bits [31:0] of X at byte-aligned place P. This represents a 32-bit offset relative to GOT, treated as signed; |
  | | | | | Check that -2\ :sup:`31` <= X < 2\ :sup:`31`. |
  +------------+------------+----------------------+------------------+-------------------------------------------------------------------------------------------------------------------------+
- | 315 | \- | R\_\_GOTPCREL32 | G(GDAT(S+A))- P | Write bits [31:0] of X at byte-aligned place P. This represents a 32-bit offset relative to GOT entry for an address, |
+ | 315 | \- | R\_\_GOTPCREL32 | G(GDAT(S))- P | Write bits [31:0] of X at byte-aligned place P. This represents a 32-bit offset relative to GOT entry for an address, |
  | | | | | treated as signed; Check that -2\ :sup:`31` <= X < 2\ :sup:`31`. |
  +------------+------------+----------------------+------------------+-------------------------------------------------------------------------------------------------------------------------+

@@ -1287,19 +1288,19 @@ The following tables record single instruction relocations and relocations that
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+
  | ELF64 Code | ELF32 Code | Name | Operation | Comment |
  +=============+============+===============================+============================+======================================================================================================+
- | 309 | 25 | R\_\_GOT\_LD\_PREL19 | G(GDAT(S+A))- P | Set a load-literal immediate field to bits [20:2] of X; check –2\ :sup:`20` <= X < 2\ :sup:`20` |
+ | 309 | 25 | R\_\_GOT\_LD\_PREL19 | G(GDAT(S))- P | Set a load-literal immediate field to bits [20:2] of X; check –2\ :sup:`20` <= X < 2\ :sup:`20` |
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+
- | 310 | \- | R\_\_LD64\_GOTOFF\_LO15 | G(GDAT(S+A))- GOT | Set a LD/ST immediate field to bits [14:3] of X; check that 0 <= X < 2\ :sup:`15`, X&7 = 0 |
+ | 310 | \- | R\_\_LD64\_GOTOFF\_LO15 | G(GDAT(S))- GOT | Set a LD/ST immediate field to bits [14:3] of X; check that 0 <= X < 2\ :sup:`15`, X&7 = 0 |
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+
- | 311 | 26 | R\_\_ADR\_GOT\_PAGE | Page(G(GDAT(S+A)))-Page(P) | Set the immediate value of an ADRP to bits [32:12] of X; check that –2\ :sup:`32` <= X < 2\ :sup:`32`|
+ | 311 | 26 | R\_\_ADR\_GOT\_PAGE | Page(G(GDAT(S)))-Page(P) | Set the immediate value of an ADRP to bits [32:12] of X; check that –2\ :sup:`32` <= X < 2\ :sup:`32`|
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+
- | 312 | \- | R\_\_LD64\_GOT\_LO12\_NC | G(GDAT(S+A)) | Set the LD/ST immediate field to bits [11:3] of X. No overflow check; check that X&7 = 0 |
+ | 312 | \- | R\_\_LD64\_GOT\_LO12\_NC | G(GDAT(S)) | Set the LD/ST immediate field to bits [11:3] of X. No overflow check; check that X&7 = 0 |
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+
- | \- | 27 | R\_\_LD32\_GOT\_LO12\_NC | G(GDAT(S+A)) | Set the LD/ST immediate field to bits [11:2] of X. No overflow check; check that X&3 = 0 |
+ | \- | 27 | R\_\_LD32\_GOT\_LO12\_NC | G(GDAT(S)) | Set the LD/ST immediate field to bits [11:2] of X. No overflow check; check that X&3 = 0 |
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+
- | 313 | \- | R\_\_LD64\_GOTPAGE\_LO15 | G(GDAT(S+A))-Page(GOT) | Set the LD/ST immediate field to bits [14:3] of X; check that 0 <= X < 2\ :sup:`15`, X&7 = 0 |
+ | 313 | \- | R\_\_LD64\_GOTPAGE\_LO15 | G(GDAT(S))-Page(GOT) | Set the LD/ST immediate field to bits [14:3] of X; check that 0 <= X < 2\ :sup:`15`, X&7 = 0 |
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+
- | \- | 28 | R\_\_LD32\_GOTPAGE\_LO14 | G(GDAT(S+A))-Page(GOT) | Set the LD/ST immediate field to bits [13:2] of X; check that 0 <= X < 2\ :sup:`14`, X&3 = 0 |
+ | \- | 28 | R\_\_LD32\_GOTPAGE\_LO14 | G(GDAT(S))-Page(GOT) | Set the LD/ST immediate field to bits [13:2] of X; check that 0 <= X < 2\ :sup:`14`, X&3 = 0 |
  +-------------+------------+-------------------------------+----------------------------+------------------------------------------------------------------------------------------------------+

diff --git a/pauthabielf64/pauthabielf64.rst b/pauthabielf64/pauthabielf64.rst
index 9ff188a..d3b3196 100644
--- a/pauthabielf64/pauthabielf64.rst
+++ b/pauthabielf64/pauthabielf64.rst
@@ -1109,67 +1109,67 @@ The GOT entries must be relocated by AUTH variant dynamic relocations.
  +-------------+----------------------------------------+----------------------------------+--------------------------+
  | ELF 64 Code | Name | Operation | Comment |
  +=============+========================================+==================================+==========================+
- | 0x8110 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G0 | G(ENCD(GDAT(S + A))) - GOT | Set a MOV[NZ] immediate |
+ | 0x8110 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G0 | G(ENCD(GDAT(S))) - GOT | Set a MOV[NZ] immediate |
  | | | | field to bits [15:0] of |
  | | | | X (see notes below) |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8111 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G0\_NC | G(ENCD(GDAT(S + A))) - GOT | Set a MOV[NZ] immediate |
+ | 0x8111 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G0\_NC | G(ENCD(GDAT(S))) - GOT | Set a MOV[NZ] immediate |
  | | | | field to bits [15:0] of |
  | | | | X (see notes below) |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8112 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G1 | G(ENCD(GDAT(S + A))) - GOT | Set a MOV[NZ] immediate |
+ | 0x8112 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G1 | G(ENCD(GDAT(S))) - GOT | Set a MOV[NZ] immediate |
  | | | | field to bits [31:16] of |
  | | | | X (see notes below) |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8113 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G1\_NC | G(ENCD(GDAT(S + A))) - GOT | Set a MOV[NZ] immediate |
+ | 0x8113 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G1\_NC | G(ENCD(GDAT(S))) - GOT | Set a MOV[NZ] immediate |
  | | | | field to bits [31:16] of |
  | | | | X (see notes below) |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8114 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G2 | G(ENCD(GDAT(S + A))) - GOT | Set a MOV[NZ] immediate |
+ | 0x8114 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G2 | G(ENCD(GDAT(S))) - GOT | Set a MOV[NZ] immediate |
  | | | | field to bits [47:32] of |
  | | | | X (see notes below) |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8115 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G2\_NC | G(ENCD(GDAT(S + A))) - GOT | Set a MOV[NZ] immediate |
+ | 0x8115 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G2\_NC | G(ENCD(GDAT(S))) - GOT | Set a MOV[NZ] immediate |
  | | | | field to bits [47:32] of |
  | | | | X (see notes below) |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8116 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G3 | G(ENCD(GDAT(S + A))) - GOT | Set a MOV[NZ] immediate |
+ | 0x8116 | R\_AARCH64\_AUTH\_MOVW\_GOTOFF\_G3 | G(ENCD(GDAT(S))) - GOT | Set a MOV[NZ] immediate |
  | | | | field to bits [63:48] of |
  | | | | X (see notes below) |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8117 | R\_AARCH64\_AUTH\_GOT\_LD\_PREL19 | G(ENCD(GDAT(S + A))) - P | Set a load-literal im- |
+ | 0x8117 | R\_AARCH64\_AUTH\_GOT\_LD\_PREL19 | G(ENCD(GDAT(S))) - P | Set a load-literal im- |
  | | | | mediate field to bits |
  | | | | [20:2] of X; check |
  | | | | –2\ :sup:`20` <= |
  | | | | X < 2 \ :sup:`20` |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8118 | R\_AARCH64\_AUTH\_LD64\_GOTOFF\_LO15 | G(ENCD(GDAT(S + A))) - GOT | Set the immediate |
+ | 0x8118 | R\_AARCH64\_AUTH\_LD64\_GOTOFF\_LO15 | G(ENCD(GDAT(S))) - GOT | Set the immediate |
  | | | | value of an ADRP |
  | | | | to bits [32:12] of X; |
  | | | | check that –2\ :sup:`32` |
  | | | | <= X < 2\ :sup:`32` |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x8119 | R\_AARCH64\_AUTH\_ADR\_GOT\_PAGE | G(ENCD(GDAT(S + A))) - Page(P) | Set the immediate |
+ | 0x8119 | R\_AARCH64\_AUTH\_ADR\_GOT\_PAGE | G(ENCD(GDAT(S))) - Page(P) | Set the immediate |
  | | | | value of an ADRP |
  | | | | to bits [32:12] of X; |
  | | | | check that –2\ :sup:`32` |
  | | | | <= X < 2\ :sup:`32` |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x811A | R\_AARCH64\_AUTH\_LD64\_GOT\_LO12_NC | G(ENCD(GDAT(S + A))) | Set the LD/ST immediate |
+ | 0x811A | R\_AARCH64\_AUTH\_LD64\_GOT\_LO12_NC | G(ENCD(GDAT(S))) | Set the LD/ST immediate |
  | | | | field to bits [11:3] of |
  | | | | X. No overflow check; |
  | | | | check that X&7 = 0 |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x811B | R\_AARCH64\_AUTH\_LD64\_GOTPAGE\_LO15 | G(ENCD(GDAT(S + A))) - Page(GOT) | Set the LD/ST immediate |
+ | 0x811B | R\_AARCH64\_AUTH\_LD64\_GOTPAGE\_LO15 | G(ENCD(GDAT(S))) - Page(GOT) | Set the LD/ST immediate |
  | | | | field to bits [14:3] of |
  | | | | X; check that 0 <= X < |
  | | | | 2\ :sup:`15` |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x811C | R\AARCH64\_AUTH\_GOT\_ADD_LO12_NC | G(ENCD(GDAT(S + A))) | Set an ADD immediate |
+ | 0x811C | R\AARCH64\_AUTH\_GOT\_ADD_LO12_NC | G(ENCD(GDAT(S))) | Set an ADD immediate |
  | | | | value to bits [11:0] of |
  | | | | X. No overflow check. |
  +-------------+----------------------------------------+----------------------------------+--------------------------+
- | 0x811D | R\AARCH64\_AUTH\_GOT\_ADR\_PREL\_LO21 | G(ENCD(GDAT(S + A))) - P | Set the immediate |
+ | 0x811D | R\AARCH64\_AUTH\_GOT\_ADR\_PREL\_LO21 | G(ENCD(GDAT(S))) - P | Set the immediate |
  | | | | value to bits[20:0] of X;|
  | | | | check that -2 :sup:`20` |
  | | | | <= 2 :sup: `20` |