Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add R_LARCH_JIRL_LO12 #3

Closed
wants to merge 1 commit into from
Closed

Add R_LARCH_JIRL_LO12 #3

wants to merge 1 commit into from

Conversation

xen0n
Copy link

@xen0n xen0n commented Jul 9, 2023

Continuing the (unstarted) discussion at loongson/LoongArch-Documentation#69 where the repo itself is archived but ongoing discussion around it is still happening: e.g. at https://reviews.llvm.org/D138135 (have to justify the workarounds for R_LARCH_PCALA_LO12).

With "medium" code model, we call a function with a pair of PCALAU12I
and JIRL instructions.  Currently the assembler produces something like:

   8:	1a00000c 	pcalau12i   	$t0, 0
			8: R_LARCH_PCALA_HI20	g
   c:	4c000181 	jirl        	$ra, $t0, 0
			c: R_LARCH_PCALA_LO12	g

This is problematic:

(1) If we just apply the R_LARCH_PCALA_LO12 relocation following its
definition, we'll got wrong result: jirl is 2RI16-type but
R_LARCH_PCALA_LO12 is for 2RI12-type.

(2) The linker need to produce a PLT entry if `g` is an external
function.  Currently ld.bfd *assumes* if there is an R_LARCH_PCALA_HI20
against an external symbol, `g` "should be" an external function.  But
this assumption is only true if the programmers (and the compiler) do
not make any error.  Consider if a programmer moved `data` (which is not
a function) from one shared object into another, but forgot to update

    la.local $t0, data
    ld.d     $t0, $t0, 0

in the code.  Now the linker generates a "PLT entry" for data without
any diagnostic, and `ld.d` instruction loads two instructions in the
PLT.  The programmer can only notice something wrong (as a mysterious
crash or bad output) at runtime.

To avoid those issues, currently in the BFD linker extra semantics are
added to R_LARCH_PCALA_LO12 like "if it's applied to a jirl instruction,
blah blah..." but it's too tricky and necessitates the inspection of
section contents, that can increase the memory footprint of the linker
and harm locality of linker memory accesses. So it's better to use a new
relocation type for function call in the medium code model, and all the
downsides can be avoided.

[xen0n: migrate from loongson/LoongArch-Documentation#69 and adjust
the last paragraph of commit message to reflect reality since the
original PR was written]
@xen0n
Copy link
Author

xen0n commented Jul 9, 2023

The only other information at the original site is:

  • The original PR was created at 2022-09-22 12:26 (UTC+8).
  • I reacted on the PR with a thumbs-up and posted a single reply (reproduced below). No Loongson person gave any response.

My reply (at 2023-01-22 10:57, UTC+8):

For the record: this is causing significant pain for the LLD port, where relocation semantics (the RelExpr enum) has to be determined early and ideally without depending on other relocs/input content. It is somewhat easy to treat R_LARCH_PCALA_LO12 on jirl differently, but it's much more difficult to differentiate between R_LARCH_PCALA_HI20's that produce intermediate result for a jirl and those not, because we're unlike RISC-V where the R_RISCV_PCREL_LO12's actually point to the corresponding HI20 reloc so correspondence is preserved.

@cloudspurs
Copy link

For medium code model function call, we plan to change to pcaddu18i + jirl. And two new relocations R_LARCH_CALL_HI20, R_LARCH_CALL_LO16.

@xen0n
Copy link
Author

xen0n commented Aug 11, 2023

+1 on pcaddu18i + jirl (apparently pcaddu18i was created just for this purpose).

However I do recommend making the reloc names clearer with respect to the PC-relative semantics: R_LARCH_CALL_PCREL_HI20 and R_LARCH_CALL_PCREL_LO16 could be better, as the existing psABI v2 relocations are all following PC-aligned semantics (which means the calculation is entirely different).

@SixWeining
Copy link
Collaborator

If we can ensure pcaddu18i is close to jirl, could we only add one reloc type R_LARCH_B36 or something else, e.g. R_LARCH_CALL36, or R_LARCH_JUMP36, and the linker applies relocations on the adjacent 2 insns in one time.

@xen0n
Copy link
Author

xen0n commented Sep 12, 2023

If we can ensure pcaddu18i is close to jirl, could we only add one reloc type R_LARCH_B36 or something else, e.g. R_LARCH_CALL36, or R_LARCH_JUMP36, and the linker applies relocations on the adjacent 2 insns in one time.

I think the two parts may be scheduled so that they're no longer adjacent. Unless we have some kind of "macro-op fusion" guarantees/recommendations put into the specs, in which case compilers could be willing to avoid scheduling these.

Alternatively, we could investigate whether the RISC-V approach of allowing one reloc to point to another reloc -- allowing to properly track associations and differentiate behavior based on the associated reloc. That way I imagine several relocs could be merged or "fixed" (see the LLD port code for some of the problems caused by sharing reloc type but not having proper association) compared to what we have right now.

@SixWeining
Copy link
Collaborator

I think this can be closed after #4 since we have R_LARCH_CALL36 now.

@xen0n
Copy link
Author

xen0n commented Nov 4, 2023

I think this can be closed after #4 since we have R_LARCH_CALL36 now.

I have thought about this in the meantime, and it's probably fair to require the medium code model indirect call sequence to be adjacent insns, because then any micro-architecture optimization would become easier.

Given we cannot ditch the R_LARCH_PCALA_LO12 hack without breaking compatibility, the #4 addition of R_LARCH_CALL36 seems the next best / only possible thing to have. I agree this issue could be closed, thanks.

@xen0n xen0n closed this Nov 4, 2023
@xry111
Copy link

xry111 commented Nov 23, 2023

Unfortunately we may still want R_LARCH_JIRL_LO12 for extreme code model. See rui314/mold#1131 (comment).

SixWeining added a commit to SixWeining/llvm-project that referenced this pull request Dec 7, 2023
R_LARCH_CALL36 was designed for function call on medium code model
where the 2 instructions (pcaddu18i + jirl) must be adjacent. This is
expected to replace current medium code model implementation, i.e.
R_LARCH_PCALA_{HI20,LO12} on pcalau12i + jirl.

See loongson/la-abi-specs#3 for more details.
SixWeining added a commit to llvm/llvm-project that referenced this pull request Dec 25, 2023
R_LARCH_CALL36 was designed for function call on medium code model where
the 2 instructions (pcaddu18i + jirl) must be adjacent. This is expected
to replace current medium code model implementation, i.e.
R_LARCH_PCALA_{HI20,LO12} on pcalau12i + jirl.

See loongson/la-abi-specs#3 for more details.
wanghao75 pushed a commit to openeuler-mirror/llvm-project that referenced this pull request Oct 9, 2024
R_LARCH_CALL36 was designed for function call on medium code model where
the 2 instructions (pcaddu18i + jirl) must be adjacent. This is expected
to replace current medium code model implementation, i.e.
R_LARCH_PCALA_{HI20,LO12} on pcalau12i + jirl.

See loongson/la-abi-specs#3 for more details.

(cherry picked from commit 88548df)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants