[AAELF64] Clarify how addends work in MOVZ, MOVK and ADRP. #271
Conversation
One small comment about the MOV alias, but that isn't a strong opinion.

As I understand it, the sequence

    ADRP x0, symbol + addend
    ADD  x0, x0, :lo12: symbol + addend

would likely limit the size of the addend to the ADRP to the same range [0, 4095] that can be expressed in the ADD/LDR.

I would like to think there's a way of relating addend1 and addend2 so that we always get the right offset within the page:

    ADRP x0, symbol + addend1
    ADD  x0, x0, :lo12: symbol + addend2

Off the top of my head, something like: addend2 is congruent to addend1 (modulo 4096). I would need to think about that in more detail, as I could easily have it wrong.

If there is such an expression we could add that as a note.
aaelf64/aaelf64.rst (Outdated)

    @@ -940,6 +940,13 @@ A ``RELA`` format relocation must be used if the initial addend cannot be encode

    There is no PC bias to accommodate in the relocation of a place containing an instruction that formulates a PC-relative address. The program counter reflects the address of the currently executing instruction.

    There are two special cases for forming the initial addend of REL-type relocations where the immediate field cannot normally hold small signed integers:

    * For relocations processing MOVZ and MOVK instructions, the initial addend is formed by interpreting the 16-bit literal field of the instruction as a 16-bit signed value in the range -32768 <= A < 32768. The interpretation is the same whether or not the instruction applies a left shift to its immediate: the addend is never treated as shifted.
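The rule in this hunk can be sketched in Python. This is my own illustration, not part of the proposed spec text: it extracts the literal field from a MOVZ/MOVK encoding and sign-extends it, deliberately ignoring the shift field.

```python
def rel_addend_movz_movk(insn: int) -> int:
    """Initial REL-type addend for a MOVZ/MOVK encoding, per the
    proposed wording: interpret the 16-bit literal field (bits [20:5])
    as a signed value in [-32768, 32768).  The hw shift field
    (bits [22:21]) is deliberately ignored: the addend is never
    treated as shifted."""
    imm16 = (insn >> 5) & 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

# movk x0, #0x123, lsl #16 encodes as 0xF2A02460: the addend is
# 0x123, not 0x1230000, despite the lsl #16.
assert rel_addend_movz_movk(0xF2A02460) == 0x123
# movk x0, #0x8000 encodes as 0xF2900000: the literal sign-extends.
assert rel_addend_movz_movk(0xF2900000) == -0x8000
```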
Would it be worth including MOV (wide immediate), using the Arm ARM description? For example:

    For relocations processing MOV (wide immediate), MOVZ and MOVK instructions,

MOV (wide immediate) is an alias for MOVZ, so it is already covered by MOVZ and MOVK, but MOV is used a lot in code examples.
I dithered about that myself. Added a parenthesis.
Nice thought! I think you're absolutely right: you can make ADRP+ADD consistently generate up to a 19-bit signed offset even though the ADD immediate can only hold 12 bits of it, by just setting the ADD immediate to the low 12 bits. The error in computing the full value during processing of the ADD relocation doesn't matter, because it's only in bits 12 and above, which don't affect the output.
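As a quick sanity check of that reasoning, here is a sketch of my own (with simplified absolute page arithmetic rather than ADRP's real PC-relative computation, which cancels out of the final address): setting the ADD addend to the low 12 bits of the ADRP addend, or anything congruent modulo 4096, yields the full offset.

```python
PAGE = 4096

def adrp_add(S: int, A1: int, A2: int) -> int:
    """Simulate ADRP x0, sym + A1 ; ADD x0, x0, :lo12: sym + A2.
    The ADRP relocation supplies the page of (S + A1); the ADD
    relocation supplies the low 12 bits of (S + A2).  The two
    relocations are processed independently of each other."""
    return ((S + A1) & ~(PAGE - 1)) + ((S + A2) & (PAGE - 1))

S = 0x10002340        # arbitrary illustrative symbol address
A1 = 0x3456           # full addend carried by the ADRP
A2 = A1 & 0xFFF       # low 12 bits only: congruent to A1 mod 4096
assert adrp_add(S, A1, A2) == S + A1
# A carry out of the low 12 bits lands in the ADRP's page number:
assert adrp_add(0x1FFF0, 0x20, 0x20) == 0x1FFF0 + 0x20
```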
Oops! The ADRP immediate is 21 bits, not 19. Almost didn't notice that the encoding has two more immediate bits near the top of the word.
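For reference, the 21-bit immediate is split across the encoding like this (an illustrative decoder of my own, not normative text):

```python
def adrp_imm21(insn: int) -> int:
    """Decode the 21-bit signed page offset of an ADRP encoding.
    immlo is bits [30:29] (the two extra bits near the top of the
    word); immhi, bits [23:5], holds the remaining 19 bits.  The
    decoded value counts 4 KiB pages."""
    immlo = (insn >> 29) & 0x3
    immhi = (insn >> 5) & 0x7FFFF
    imm = (immhi << 2) | immlo
    return imm - (1 << 21) if imm & (1 << 20) else imm

# 0x90000020 has immhi = 1, immlo = 0: four pages ahead.
assert adrp_imm21(0x90000020) == 4
# All 21 immediate bits set decodes to -1 page.
assert adrp_imm21(0xF0FFFFE0) == -1
```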
Is it really safe to make this change? The current text says:
5.7.2 Addends and PC-bias
<https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#572addends-and-pc-bias>
A binary file may use REL or RELA relocations or a mixture of the two (but multiple relocations of the same place must use only one type).

The initial addend for a REL-type relocation is formed according to the following rules.

- If the relocation relocates data (Static Data relocations) the initial value in the place is sign-extended to 64 bits.
- If the relocation relocates an instruction, the immediate field of the instruction is extracted, scaled as required by the instruction field encoding, and sign-extended to 64 bits.

A RELA format relocation must be used if the initial addend cannot be encoded in the place.
So the question is, could existing code be using these fields in a
different way (interpreting the bits as scaled by the field encoding
information)? If they could, then this change would break existing usage
and thus cannot be safely made, even if it would have been more useful in
retrospect.
As explained, under the shifted interpretation it is not possible to use a non-zero offset with REL MOVZ/MOVK/ADRP/ADD/LDR relocations, since the individual relocations wouldn't all compute the same value.
While we can never be 100% certain, we've only got one known user of SHT_REL for AArch64, which is the Arm legacy assembler armasm, and that switches to RELA for the instructions with scaled addends like ADRP and MOVT/MOVW. Part of the motivation for this change is to correct an incorrect implementation in LLD.

On the use-case side, I certainly can't think of a way a compiler could make use of the scaled immediates as relocation addends. I remember from the 32-bit case that we first implemented MOVT/MOVW in SHT_REL using scaled addends and soon ran into runtime failures, which provoked the redefinition to not scale the addends. It may be possible to come up with something in assembly that isn't contrived.

I think this is an area where I'd be comfortable taking the risk.
Thanks for the updates; I'm approving this from my side. I suggest we leave this open for a few days for others to comment or object. Please use "Request changes" for any blocking feedback.
I'm sure there are use cases where the bottom 16 bits might be known to be zero; in that case you're more likely to want something that's a multiple of 2^16.
It would have to be an expression of the form section symbol + offset. Given how rare that use-case is compared to sections that are 4/8-byte aligned, I doubt that a potential SHT_REL-only toolchain would implement section symbol + offset like that. I'd expect it to use local anonymous symbols as anchors.
Will be merging this PR by end of UK working day 08/07/2024.

End of UK working day. Merging this PR.
No changes required.
This brings AAELF64 into line with AAELF32, which already has a similar clarification for the MOVW+MOVT pair. For the instructions which shift their operand left (ADRP, and the shifted MOVZ and MOVK), if the relocation addend is taken from the input value of the immediate field, it is not treated as shifted.

The rationale is that this allows a sequence of related instructions to consistently compute the same value (symbol + small offset), and cooperate to load that value into the target register, one small chunk at a time. For example, this would load mySymbol + 0x123:

    mov  x0, #0x123           ; R_AARCH64_MOVW_UABS_G0_NC(mySymbol)
    movk x0, #0x123, lsl #16  ; R_AARCH64_MOVW_UABS_G1_NC(mySymbol)
    movk x0, #0x123, lsl #32  ; R_AARCH64_MOVW_UABS_G2_NC(mySymbol)
    movk x0, #0x123, lsl #48  ; R_AARCH64_MOVW_UABS_G3(mySymbol)

The existing text made it unclear whether the addends were shifted or not. If they are interpreted as shifted, then nothing useful happens, because the first instruction would load the low 16 bits of mySymbol+0x123, and the second would load the next 16 bits of mySymbol+0x1230000, and so on. This doesn't reliably get you any useful offset from the symbol, because the relocations are processed independently, so that a carry out of the low 16 bits won't be taken into account in the next 16.

If you do need to compute a large offset from the symbol, you have no option but to use SHT_RELA and specify a full 64-bit addend: there's no way to represent that in an SHT_REL setup. But interpreting the SHT_REL addends in the way specified here, you can at least specify small addends successfully.
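The "same value, one slice at a time" behaviour can be checked mechanically. In this sketch of my own, each MOVW_UABS relocation is processed independently with the same small unshifted addend: each computes the full value symbol + addend and writes back only its own 16-bit slice, so reassembling the slices recovers the full value.

```python
def movw_uabs_chunk(symbol: int, group: int, addend: int) -> int:
    """The 16-bit field a R_AARCH64_MOVW_UABS_G<group> relocation
    writes: compute the whole value (symbol + small unshifted
    addend), then select this relocation's 16-bit slice."""
    return ((symbol + addend) >> (16 * group)) & 0xFFFF

mySymbol = 0x00007F1234560000   # illustrative symbol address
A = 0x123                       # the small unshifted addend
# What the mov/movk x0 sequence leaves in the register once all
# four relocations have been applied:
x0 = sum(movw_uabs_chunk(mySymbol, g, A) << (16 * g) for g in range(4))
assert x0 == mySymbol + A
```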