Auto merge of #54592 - GabrielMajeri:no-plt, r=<try>
[WIP] Support for disabling PLT for better function call performance

This PR gives `rustc` the ability to skip the procedure linkage table (PLT) when generating calls to functions in shared libraries. This can improve performance by removing one level of branch indirection per call.

As far as I know, the only advantage of using the PLT is that it allows ELF lazy binding. However, Rust already [enables full RELRO for security](#43170), which disables lazy binding anyway.
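The indirection being removed can be illustrated with a toy model. This is only a sketch: real PLT and GOT entries are linker-managed machine code and data, and every name below (`Got`, `plt_stub`, `call_via_plt`, `call_no_plt`) is hypothetical, chosen for illustration. The model counts one branch for entering the PLT stub and one for the indirect jump through the GOT, versus a single indirect call when the PLT is skipped.

```rust
// Toy model only: it counts branches, it does not reproduce real linker behaviour.
type Func = fn(i32) -> i32;

// Stand-in for a function that lives in a shared library.
fn external_double(x: i32) -> i32 {
    x * 2
}

/// Stand-in for the global offset table: already-resolved function addresses,
/// as is the case when lazy binding is disabled (full RELRO).
struct Got {
    slots: Vec<Func>,
}

/// Stand-in for a PLT stub: loads the target from the GOT and jumps to it.
fn plt_stub(got: &Got, slot: usize, arg: i32, branches: &mut u32) -> i32 {
    *branches += 1; // indirect jump through the GOT slot
    (got.slots[slot])(arg)
}

/// Call site that goes through the PLT: a direct call to the stub first.
fn call_via_plt(got: &Got, slot: usize, arg: i32, branches: &mut u32) -> i32 {
    *branches += 1; // direct call into the PLT stub
    plt_stub(got, slot, arg, branches)
}

/// Call site with the PLT skipped: a single indirect call through the GOT.
fn call_no_plt(got: &Got, slot: usize, arg: i32, branches: &mut u32) -> i32 {
    *branches += 1; // indirect call through the GOT slot
    (got.slots[slot])(arg)
}

fn main() {
    let got = Got { slots: vec![external_double as Func] };

    let mut with_plt = 0;
    let a = call_via_plt(&got, 0, 21, &mut with_plt);

    let mut without_plt = 0;
    let b = call_no_plt(&got, 0, 21, &mut without_plt);

    println!("via PLT: result {}, branches {}", a, with_plt);
    println!("no PLT:  result {}, branches {}", b, without_plt);
}
```

Both paths reach the same function; the difference is only the number of branches taken to get there.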

This is a little-known feature, supported by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) and [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fplt) as `-fno-plt`. Some Linux distros [enable it by default](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n40) for all builds.

The implementation is inspired by [this patch](https://reviews.llvm.org/D39079#change-YvkpNDlMs_LT), which adds `-fno-plt` support to Clang.

## Performance

I didn't run a lot of benchmarks, but these are the results on my machine for a `clap` [benchmark](https://github.com/clap-rs/clap/blob/master/benches/05_ripgrep.rs):

```
 name              control ns/iter  no-plt ns/iter  diff ns/iter  diff %  speedup
 build_app_long    11,097           10,733                  -364  -3.28%   x 1.03
 build_app_short   11,089           10,742                  -347  -3.13%   x 1.03
 build_help_long   186,835          182,713               -4,122  -2.21%   x 1.02
 build_help_short  80,949           78,455                -2,494  -3.08%   x 1.03
 parse_clean       12,385           12,044                  -341  -2.75%   x 1.03
 parse_complex     19,438           19,017                  -421  -2.17%   x 1.02
 parse_lots        431,493          421,421              -10,072  -2.33%   x 1.02
```

The results show a small improvement of roughly 2-3% across the board, with no regressions in any of these benchmarks. Binaries which make many function calls into dynamic libraries would likely see larger improvements. [This comment](https://patchwork.ozlabs.org/patch/468993/#1028255) suggests that, in some cases, `-fno-plt` could improve PIC/PIE code performance by 10%.
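For reference, the derived columns of the table can be recomputed from the two timing columns. The sketch below assumes the usual definitions (difference is no-plt minus control, the percentage is relative to control, and speedup is control divided by no-plt); the helper names are illustrative and not taken from the benchmark tooling.

```rust
/// Benchmark rows copied from the table: (name, control ns/iter, no-plt ns/iter).
const ROWS: [(&str, i64, i64); 7] = [
    ("build_app_long", 11_097, 10_733),
    ("build_app_short", 11_089, 10_742),
    ("build_help_long", 186_835, 182_713),
    ("build_help_short", 80_949, 78_455),
    ("parse_clean", 12_385, 12_044),
    ("parse_complex", 19_438, 19_017),
    ("parse_lots", 431_493, 421_421),
];

/// Absolute change in ns/iter (negative means no-plt is faster).
fn diff_ns(control: i64, no_plt: i64) -> i64 {
    no_plt - control
}

/// Change relative to the control timing, in percent.
fn diff_pct(control: i64, no_plt: i64) -> f64 {
    diff_ns(control, no_plt) as f64 / control as f64 * 100.0
}

/// How many times faster the no-plt build is than the control build.
fn speedup(control: i64, no_plt: i64) -> f64 {
    control as f64 / no_plt as f64
}

fn main() {
    for &(name, control, no_plt) in ROWS.iter() {
        println!(
            "{:<17} {:>8} ns  {:>6.2}%  x {:.2}",
            name,
            diff_ns(control, no_plt),
            diff_pct(control, no_plt),
            speedup(control, no_plt)
        );
    }
}
```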

## To do

- [ ] Do a perf run to see the effect this has on the compiler (cc @michaelwoerister),
  and possibly run benchmarks on some more crates

- [ ] Add a code gen test

- [ ] Should this always be enabled, or should it be behind a command-line option?
  If it is behind an option, what should it be called? `-Z no-plt`? `-Z plt=no`?
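One possible answer to the last question is an explicit override that falls back to the RELRO-based default. The sketch below is purely hypothetical: `parse_plt_flag`, `use_plt`, and the local `RelroLevel` enum are illustrative names and are not part of this PR or of `rustc`; the `yes`/`no` values follow the `-Z plt=no` spelling floated above.

```rust
/// Simplified local stand-in for the target's RELRO setting (assumed variants).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RelroLevel {
    Full,
    Partial,
    Off,
    None,
}

/// Hypothetical parser for the value of a `plt=<yes|no>` style option.
/// Returns `None` for anything it does not recognise.
fn parse_plt_flag(value: &str) -> Option<bool> {
    match value {
        "yes" => Some(true),
        "no" => Some(false),
        _ => None,
    }
}

/// Decide whether calls should go through the PLT.
/// An explicit choice wins; otherwise the PLT is skipped only under full RELRO,
/// where lazy binding is already unavailable.
fn use_plt(explicit: Option<bool>, relro: RelroLevel) -> bool {
    explicit.unwrap_or(relro != RelroLevel::Full)
}

fn main() {
    let levels = [
        RelroLevel::Full,
        RelroLevel::Partial,
        RelroLevel::Off,
        RelroLevel::None,
    ];
    for relro in levels.iter() {
        println!("{:?}: default use_plt = {}", relro, use_plt(None, *relro));
    }
    // An explicit flag overrides the RELRO-based default.
    println!(
        "Full with plt=yes: use_plt = {}",
        use_plt(parse_plt_flag("yes"), RelroLevel::Full)
    );
}
```

Keeping the override as an `Option<bool>` leaves the default behaviour unchanged when the option is not passed.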
bors committed Sep 26, 2018
2 parents 6846f22 + ddf98c1 commit 5747631
Showing 4 changed files with 11 additions and 1 deletion.
8 changes: 7 additions & 1 deletion src/librustc_codegen_llvm/attributes.rs
@@ -20,7 +20,7 @@ use rustc::ty::layout::HasTyCtxt;
 use rustc::ty::query::Providers;
 use rustc_data_structures::sync::Lrc;
 use rustc_data_structures::fx::FxHashMap;
-use rustc_target::spec::PanicStrategy;
+use rustc_target::spec::{PanicStrategy, RelroLevel};

 use attributes;
 use llvm::{self, Attribute};
@@ -174,6 +174,12 @@ pub fn from_fn_attrs(
     set_frame_pointer_elimination(cx, llfn);
     set_probestack(cx, llfn);

+    // Only enable this optimization if full relro is also enabled.
+    // In this case, lazy binding was already unavailable, so nothing is lost.
+    if let RelroLevel::Full = cx.sess().target.target.options.relro_level {
+        Attribute::NonLazyBind.apply_llfn(Function, llfn);
+    }
+
     if codegen_fn_attrs.flags.contains(CodegenFnAttrFlags::COLD) {
         Attribute::Cold.apply_llfn(Function, llfn);
     }
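The effect of this hunk can be modelled outside the compiler. The following is a minimal sketch, not the real `rustc` code: `RelroLevel` and `Attr` are simplified local stand-ins (the variant names are assumed to mirror `rustc_target::spec::RelroLevel`), and `fn_attrs` is a hypothetical helper that returns the attributes a function would receive.

```rust
/// Simplified local stand-in for the target's RELRO setting (assumed variants).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RelroLevel {
    Full,
    Partial,
    Off,
    None,
}

/// Simplified stand-in for the LLVM function attributes touched by the hunk.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Attr {
    NonLazyBind,
    Cold,
}

/// Hypothetical model of the attribute selection: `NonLazyBind` is added only
/// under full RELRO, independently of whether the function is marked cold.
fn fn_attrs(relro: RelroLevel, is_cold: bool) -> Vec<Attr> {
    let mut attrs = Vec::new();
    if let RelroLevel::Full = relro {
        attrs.push(Attr::NonLazyBind);
    }
    if is_cold {
        attrs.push(Attr::Cold);
    }
    attrs
}

fn main() {
    let levels = [
        RelroLevel::Full,
        RelroLevel::Partial,
        RelroLevel::Off,
        RelroLevel::None,
    ];
    for relro in levels.iter() {
        println!("{:?} -> {:?}", relro, fn_attrs(*relro, false));
    }
}
```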
1 change: 1 addition & 0 deletions src/librustc_codegen_llvm/llvm/ffi.rs
@@ -122,6 +122,7 @@ pub enum Attribute {
     SanitizeThread = 20,
     SanitizeAddress = 21,
     SanitizeMemory = 22,
+    NonLazyBind = 23,
 }

 /// LLVMIntPredicate
2 changes: 2 additions & 0 deletions src/rustllvm/RustWrapper.cpp
@@ -178,6 +178,8 @@ static Attribute::AttrKind fromRust(LLVMRustAttribute Kind) {
     return Attribute::SanitizeAddress;
   case SanitizeMemory:
     return Attribute::SanitizeMemory;
+  case NonLazyBind:
+    return Attribute::NonLazyBind;
   }
   report_fatal_error("bad AttributeKind");
 }
1 change: 1 addition & 0 deletions src/rustllvm/rustllvm.h
@@ -97,6 +97,7 @@ enum LLVMRustAttribute {
   SanitizeThread = 20,
   SanitizeAddress = 21,
   SanitizeMemory = 22,
+  NonLazyBind = 23,
 };

 typedef struct OpaqueRustString *RustStringRef;
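The last three hunks add the same value, `23`, in three places, because the Rust enum, the C++ enum, and the `fromRust` switch have to agree on it. The sketch below illustrates that invariant in plain Rust under stated assumptions: the enum is a cut-down copy showing only the variants visible in the diff, `#[repr(C)]` is assumed, and `from_rust` is a hypothetical stand-in for the C++ switch that returns `None` where the real code calls `report_fatal_error`.

```rust
/// Cut-down copy of the Rust-side enum; only the variants visible in the diff.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Attribute {
    SanitizeThread = 20,
    SanitizeAddress = 21,
    SanitizeMemory = 22,
    NonLazyBind = 23,
}

/// Hypothetical stand-in for the C++ `fromRust` switch: maps the raw value that
/// crosses the FFI boundary to an attribute name, or `None` if it is unknown.
fn from_rust(kind: u32) -> Option<&'static str> {
    match kind {
        20 => Some("SanitizeThread"),
        21 => Some("SanitizeAddress"),
        22 => Some("SanitizeMemory"),
        23 => Some("NonLazyBind"),
        _ => None,
    }
}

fn main() {
    let all = [
        Attribute::SanitizeThread,
        Attribute::SanitizeAddress,
        Attribute::SanitizeMemory,
        Attribute::NonLazyBind,
    ];
    for attr in all.iter() {
        // The discriminant is what the other side of the boundary receives.
        let raw = *attr as u32;
        println!("{:?} = {} -> {:?}", attr, raw, from_rust(raw));
    }
}
```

If a variant were added on the Rust side without a matching case on the other side, the lookup would fall through to the default arm, which is the situation the `report_fatal_error("bad AttributeKind")` line guards against.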
