Unallocatable global pointer (GP) #298

Closed
MaskRay opened this issue Jul 9, 2022 · 16 comments

Comments

@MaskRay
Collaborator

MaskRay commented Jul 9, 2022

GNU ld can relax PC-relative addressing to GP-relative addressing in -no-pie mode (disabled for -pie/-shared).

# a.s
.globl _start
_start:
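  lui a0,%hi(var)        # absolute addressing: upper 20 bits of var
  addi a0,a0,%lo(var)    # absolute addressing: low 12 bits of var

  lla a0,var             # PC-relative addressing (expands to auipc + addi)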
  ret

.data
.space 8
.globl var
var:
.zero 4
.size var, . - var

=> (riscv64-linux-gnu-gcc -nostdlib -no-pie a.s)

addi a0,gp,st_value(var)-__global_pointer$
addi a0,gp,st_value(var)-__global_pointer$
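
As a rough illustration of the relaxation shown above, the following Python sketch picks between the one-instruction GP-relative form and the two-instruction lui/addi form. It only models the arithmetic and is not GNU ld's implementation; the function names and example addresses are made up, and it assumes a 32-bit absolute address and the 12-bit signed immediate of addi.

```python
def hi_lo(addr):
    """Split a 32-bit absolute address the way %hi/%lo do.

    lo is the sign-extended low 12 bits; hi compensates so that
    (hi << 12) + lo == addr.
    """
    lo = ((addr & 0xFFF) ^ 0x800) - 0x800
    hi = ((addr - lo) >> 12) & 0xFFFFF
    return hi, lo


def materialize(var_addr, gp):
    """Return the instruction sequence that loads var_addr into a0.

    If var_addr is within the signed 12-bit reach of gp, a single
    addi relative to gp is enough; otherwise fall back to lui + addi.
    """
    off = var_addr - gp
    if -0x800 <= off < 0x800:
        return [f"addi a0,gp,{off}"]
    hi, lo = hi_lo(var_addr)
    return [f"lui a0,{hi:#x}", f"addi a0,a0,{lo}"]


# Hypothetical addresses: gp at 0x12800, one variable near it, one far away.
print(materialize(0x12008, 0x12800))  # one instruction
print(materialize(0x20000, 0x12800))  # two instructions
```

The saving per relaxed access is one instruction, which is the quantity the questions below are really about.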

There are some bugs, e.g. if I remove .space 8, it is not relaxable. If I use .space 4 and keep .size, it is not relaxable.

IMO this is the de facto reason that GP is unallocable.
Questions:

  1. Is GP relaxation useful? How much does it save?
  2. Shall we make GP allocatable? If so, callee-saved, temporary, or reserved for a custom ABI?
  3. If GP is made allocatable, how to handle the compatibility issue with GNU ld which performs GP relaxation?

For 1, I am strongly opinionated that it is not, and we should remove GP relaxation from GNU ld.
Main arguments:

(a) Global data access instructions typically make up a very small portion of the code size and should not be a performance bottleneck. Having one more register at our disposal (especially in RVC) likely provides more benefits.
(b) The relaxable range [__global_pointer$-0x800, __global_pointer$+0x800) is too small. On the other hand, if some embedded users really find GP relaxation useful, they can perform GP relaxation in their custom ABI.
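
To put a number on (b), here is a small Python sketch that counts how many bytes of a data region fall inside the half-open window stated above. The helper names, the region layout, and the gp value are hypothetical and chosen only for illustration.

```python
def gp_window(gp):
    """Half-open address range reachable with a 12-bit signed offset from gp."""
    return gp - 0x800, gp + 0x800


def reachable_bytes(region_start, region_size, gp):
    """Number of bytes of [region_start, region_start + region_size) in the window."""
    lo, hi = gp_window(gp)
    start = max(region_start, lo)
    end = min(region_start + region_size, hi)
    return max(0, end - start)


# Hypothetical 64 KiB data region with gp placed 0x800 past its start.
region_start, region_size = 0x10000, 0x10000
covered = reachable_bytes(region_start, region_size, gp=0x10800)
print(covered, covered / region_size)  # 4096 bytes, i.e. 1/16 of the region
```

Under these assumptions at most 4 KiB of data is reachable no matter how large the region is, so the covered fraction shrinks as the program's global data grows.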

For 3, if a new relocatable object file uses GP for a different purpose, it can use a new relocation type to indicate the use which is incompatible with a linker performing GP relaxation. An old linker will report an error for the unrecognized relocation type.

Note: The resolution to #205 removed R_RISCV_GPREL_I/R_RISCV_GPREL_S used internally by GNU ld.

@jrtc27
Collaborator

jrtc27 commented Jul 9, 2022

GP relaxation is too entrenched in the GNU world to break; that'd be one major ABI break. Whether we think it's a good idea to have done or not it's what we have and have to deal with.

@MaskRay
Collaborator Author

MaskRay commented Jul 9, 2022

> GP relaxation is too entrenched in the GNU world to break; that'd be one major ABI break. Whether we think it's a good idea to have done or not it's what we have and have to deal with.

3 is indeed a problem. We can remove GP relaxation from GNU ld now to make the compatibility problem less severe.

@aswaterman
Contributor

Patches to remove it from GNU ld will not be accepted. gp relaxation reduces code size and improves performance. Getting rid of the feature doesn’t free up gp for other purposes, because doing so would break the ABI. It would turn gp into an unusable register, which benefits no one.

@jrtc27 jrtc27 closed this as completed Jul 9, 2022
@jrtc27
Collaborator

jrtc27 commented Jul 9, 2022

We're stuck with GP relaxation whether people want it or not, so whilst you can debate its merits as much as you like, the spec is never going to break the ABI that has existed and been in widespread use for many years.

@MaskRay
Collaborator Author

MaskRay commented Jul 9, 2022

> Patches to remove it from GNU ld will not be accepted. gp relaxation reduces code size and improves performance. Getting rid of the feature doesn’t free up gp for other purposes, because doing so would break the ABI. It would turn gp into an unusable register, which benefits no one.

I think it is worth measuring the amount of code size reduction and performance gain to back up the subjective claim.
GP does not need to be made allocable on day one. It can be changed to "reserved". I don't think this breaks ABI. When usage comes, further attempts can be made to use it in a way not breaking the widespread use.

If a new relocatable object file uses GP, it is incompatible with ld's GP relaxation. There is no inherent ABI incompatibility. Technically, ld can detect gp usage and disable GP relaxation if any relocatable object file uses GP. But the simpler approach is to remove GP relaxation if it turns out to be not so useful. A simple way to reject new relocatable object files for an old GNU ld is to use a new relocation type.

@aswaterman
Contributor

aswaterman commented Jul 9, 2022

It reduces dynamic instruction count by approx. 5% on Dhrystone. We all know that's a terrible benchmark, but nevertheless it's frequently used in comparing ISAs, compilers, etc., and so we wouldn't sacrifice a few percent to satisfy what amounts to an aesthetic concern.

For a more useful data point, the data in my dissertation indicates that, in SPEC CPU2006, gp accounts for around 1% of static register references, and it's referenced more often than s8, s9, s10, s11, t3, t4, t5, or t6. Frequency of reference isn't directly indicative of code size, of course, but the two metrics track pretty closely. The marginal benefit of one more s-register or one more t-register just isn't that significant in an already-register-rich ISA.

@MaskRay
Collaborator Author

MaskRay commented Jul 10, 2022

I wish we could keep the discussion open at least a bit longer, instead of quickly jumping to a conclusion (i.e. closing this issue) with arguments like "doing so would break the ABI" and "Patches to remove it from GNU ld will not be accepted", especially after I made a comment (and also edited the top comment to mention it) that a new relocation type can detect linker incompatibility.
(For instance, there is the still-open #128, which mentions GP as well, though I have not spent more time investigating whether it needs to reserve it. This at least demonstrates a possibility that future usage may benefit, beyond having a temporary or callee-saved register.)

At any rate, compilers and use cases have evolved a lot since 2016. It's worth having fresh measurements on not only a "terrible benchmark" (as we all know it is) but also on others.
https://github.com/MaskRay/binutils-gdb/tree/riscv-relax-gp adds --no-relax-gp to allow disabling relaxation against the global pointer.
I may post it on the binutils mailing list at some point.

Cc some folks for more thoughts: @topperc, @preames

@kito-cheng
Collaborator

kito-cheng commented Jul 11, 2022

I can imagine why you propose this aggressive idea here: PIE is the default for several Linux distros, and clang also recently defaulted to PIE (by you :P https://reviews.llvm.org/D120305), so GP relaxation is becoming less and less useful in the non-embedded world.

So I think the thread is more like: forget about ABI breakage and brainstorm what would happen if GP could be released from the current GP relaxation scheme. Is it possible to improve performance or code size in another way? Something like:

  • Release it as a general-purpose register.
  • Improve the PLT entry using the GP register (e.g. auipc + ld + jalr -> ld with gp + jalr)
  • ...

I am happy to reopen this to discuss the possibilities for GP, but just keep in mind: we won't break the ABI or accept a new ABI variant without a good argument and benefit :)

@kito-cheng kito-cheng reopened this Jul 11, 2022
@jrtc27
Collaborator

jrtc27 commented Jul 11, 2022

GP relaxation works just as well for PIEs as PDEs, I don't know why it's not done if it's not, probably because people keep confusing bfd_link_pic with bfd_link_dll as it includes PIEs, not just DSOs. Far from the first time binutils code gets that wrong, and won't be the last.

@aswaterman
Contributor

Agreed with @jrtc27; gp can still be beneficial for PIE

@Nelson1225
Collaborator

I don't see a very strong reason to add options in ld to disable/enable GP relaxation, since there is no harm in doing GP relaxation. If GP relaxation broke something or made performance worse, then disabling it would probably make sense. But I don't see any such case for now, so I don't understand why we need to add an option to disable it.

@jnk0le

jnk0le commented Nov 19, 2022

I think that there is an actual use case for making gp (together with tp) allocatable as e.g. another temporary register:

<=4KiB of total SRAM mapped around 0x00000000, so x0 becomes a de facto new gp register and the current one is redundant (no null pointer dereference trap, though).

Of course it's going to be an ILP32E/EABI exclusive thing.
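
A minimal Python sketch of the idea above, assuming RV32 (as the ILP32E mention suggests) and the usual 12-bit signed immediate: since x0 always reads as zero, any address within 0x800 of zero, on either side after wrap-around, is reachable with a single instruction and no base register. The helper names are hypothetical.

```python
XLEN = 32  # assumption: RV32


def addi_from_x0(imm12):
    """Address produced by `addi rd, x0, imm12` for a signed 12-bit immediate."""
    if not -0x800 <= imm12 < 0x800:
        raise ValueError("immediate does not fit in 12 signed bits")
    return imm12 % (1 << XLEN)  # negative immediates wrap to the top of memory


def reachable_from_x0(addr):
    """True if addr lies in the 4 KiB window centered on address zero."""
    addr %= 1 << XLEN
    return addr < 0x800 or addr >= (1 << XLEN) - 0x800


print(hex(addi_from_x0(-1)))          # 0xffffffff
print(reachable_from_x0(0xFFFFF800))  # True: lowest address below zero
print(reachable_from_x0(0x800))       # False: just past the upper bound
```

The window has the same 4 KiB size as the gp window, which is why the comment limits the scheme to systems whose entire SRAM fits in it.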

@aswaterman
Contributor

That’s esoteric

@X547

X547 commented Dec 9, 2022

The Haiku operating system (https://www.haiku-os.org/) currently never uses the GP register and does not define the __global_pointer$ symbol, because all Haiku executables are linked with the -shared flag, for the following reasons:

  • Haiku has its own TLS block layout inherited from BeOS that is not compatible with the ELF spec, so the local-exec and initial-exec TLS models are not supported and will cause crashes.
  • Statically linked executables are not supported. The Haiku ABI is defined at the shared library level, not the syscall level. Syscalls are considered unstable and can change in minor updates.
  • System API frameworks expect that it is possible to load executables dynamically through an API like dlopen() for serializing/deserializing objects.
  • It simplifies things a lot and reduces the chance of various weird problems.

It may be beneficial to reassign the GP register to a different purpose on Haiku.

MaskRay added a commit to MaskRay/binutils-gdb that referenced this issue Feb 20, 2023
--relax enables all relaxations. --no-relax-gp disables GP relaxation to
allow measuring its effect.

Link: riscv-non-isa/riscv-elf-psabi-doc#298

bfd/
    * elfnn-riscv.c (struct riscv_elf_link_hash_table): Add params.
    (riscv_elfNN_set_options): New.
    (riscv_info_to_howto_rela): Check relax_gp.
    (_bfd_riscv_relax_section): Likewise.
    * elfxx-riscv.h (struct riscv_elf_params): New.
    (riscv_elf32_set_options): New.
    (riscv_elf64_set_options): New.
ld/
    * emultempl/riscvelf.em: Add option parsing.
    * testsuite/ld-riscv-elf/code-model-relax-medlow-01-norelaxgp.d: New.
    * testsuite/ld-riscv-elf/pcgp-relax-01-norelaxgp.d: New.
    * testsuite/ld-riscv-elf/pcgp-relax-02.d: Test --relax --relax-gp can be
      used together.
MaskRay added a commit to MaskRay/binutils-gdb that referenced this issue Feb 23, 2023
--relax enables all relaxations.  --no-relax-gp disables GP relaxation to
allow measuring its effect.

The option can test effectiveness of GP relaxation and support some ABI
variants that use GP for other purposes.

Link: riscv-non-isa/riscv-elf-psabi-doc#298

bfd/
    * elfnn-riscv.c (struct riscv_elf_link_hash_table): Add params.
    (riscv_elfNN_set_options): New.
    (riscv_info_to_howto_rela): Check relax_gp.
    (_bfd_riscv_relax_section): Likewise.
    * elfxx-riscv.h (struct riscv_elf_params): New.
    (riscv_elf32_set_options): New.
    (riscv_elf64_set_options): New.
ld/
    * emultempl/riscvelf.em: Add option parsing.
    * testsuite/ld-riscv-elf/code-model-relax-medlow-01-norelaxgp.d: New.
    * testsuite/ld-riscv-elf/pcgp-relax-01-norelaxgp.d: New.
    * testsuite/ld-riscv-elf/pcgp-relax-02.d: Test --relax --relax-gp can be
      used together.
MaskRay added a commit to MaskRay/binutils-gdb that referenced this issue Feb 24, 2023
kito-cheng added a commit that referenced this issue Mar 22, 2023
… linker relaxation

The use of the gp register has been discussed several times (e.g. #298). It is
reserved as a special register for linker relaxation, which can improve
performance and code size.

However, it comes with some limitations: it is not applicable to shared
libraries, and it does not work well when a program has a large amount of data,
since the relaxable range is only +-2KiB.

Some platforms like Haiku never use gp anywhere in the system, and it also
might not be useful in some bare-metal systems with a specialized memory layout,
so we might consider releasing the gp usage in a way that is not a hard ABI break.

This proposal also adds a tag to prevent mixing objects with different gp
register usage.
kito-cheng added a commit that referenced this issue Apr 4, 2023
… linker relaxation

The use of the gp register has been discussed several times (e.g. #298). It is
reserved as a special register for linker relaxation, which can improve
performance and code size.

However, it comes with some limitations: it is not applicable to shared
libraries, and it does not work well when a program has a large amount of data,
since the relaxable range is only +-2KiB.

Some platforms like FreeBSD and Haiku never use gp anywhere in the system, and it
also might not be useful in some bare-metal systems with a specialized memory layout,
so we might consider releasing the gp usage in a way that is not a hard ABI break.

This proposal also adds a tag to prevent mixing objects with different gp
register usage.
kito-cheng added a commit that referenced this issue Apr 7, 2023
… linker relaxation

The use of the gp register has been discussed several times (e.g. #298). It is
reserved as a special register for linker relaxation, which can improve
performance and code size.

However, it comes with some limitations: it is not applicable to shared
libraries, and it does not work well when a program has a large amount of data,
since the relaxable range is only +-2KiB.

Some platforms like FreeBSD and Haiku never use gp anywhere in the system, and it
also might not be useful in some bare-metal systems with a specialized memory layout,
so we might consider releasing the gp usage in a way that is not a hard ABI break.

Co-authored-by: Alex Bradbury <asb@asbradbury.org>
@appujee

appujee commented Apr 26, 2023

> The Haiku operating system (https://www.haiku-os.org/) currently never uses the GP register and does not define the __global_pointer$ symbol, because all Haiku executables are linked with the -shared flag, for the following reasons:
>
>   • Haiku has its own TLS block layout inherited from BeOS that is not compatible with the ELF spec, so the local-exec and initial-exec TLS models are not supported and will cause crashes.
>   • Statically linked executables are not supported. The Haiku ABI is defined at the shared library level, not the syscall level. Syscalls are considered unstable and can change in minor updates.
>   • System API frameworks expect that it is possible to load executables dynamically through an API like dlopen() for serializing/deserializing objects.
>   • It simplifies things a lot and reduces the chance of various weird problems.
>
> It may be beneficial to reassign the GP register to a different purpose on Haiku.

gp can now be used for shadow call stack thereby freeing a reserved register (X18): #370

@kito-cheng
Collaborator

Closed via #371
