Skip to content

Commit

Permalink
Auto merge of #113592 - Kobzol:pgo-script-bolt, r=Mark-Simulacrum
Browse files Browse the repository at this point in the history
Move BOLT from `bootstrap` to `opt-dist`

Currently, we use BOLT to optimize LLVM for x64 Linux. The BOLT instrumentation and optimization step is implemented in `bootstrap`, but it was always quite hacky, because BOLT works quite differently than PGO. Rather than building an instrumented artifact, it takes an already built artifact and instruments it in-place. This is not a good fit for the bootstrap caching mechanism, and it meant that we had to run BOLT "on-the-fly" when packaging LLVM artifacts into the created sysroot.

The BOLT code was also really only used by the PGO script (now called `opt-dist`) and nothing else, so it was quite hardcoded for this one single usage. And even if someone wanted to use the `--llvm-bolt-profile-[use/generate]` bootstrap flags outside of the PGO script, they would also need to implement profile gathering, as this is not performed by bootstrap anyway.

I think that it could be more practical to move the BOLT logic to the `opt-dist` tool instead. This simplifies bootstrap, removes unneeded implementation of BOLT caching (we will now do it exactly once - no need to check if it has been performed already when bootstrap copies `libLLVM.so` around multiple times) and removes two BOLT-specific bootstrap flags, and also one special case for building LLVM (instead I pass the linker flags with `--set llvm.ldflags` from `opt-dist` now).

There are also a few disadvantages to this new approach:
- We have to guess/find the path to the built `libLLVM.so` file (but currently this is quite easy, it's just in `stage2/lib`).
- We have to provide the BOLT profile externally to bootstrap, so that it is packaged into the reproducible artifacts archive. Doesn't seem like a big deal to me.
- We are depending on some inner behavior of boostrap in `opt-dist` (namely, that `libLLVM.so` is hardlinked). But we do that for many other things in the `opt-dist` tool anyway, it's tied quite closely to bootstrap.

I would like to go back to my attempts to also use BOLT for `librustc_driver.so`, and I think that it might be a bit simpler if I also do it from the `opt-dist` tool, so this is the first step towards that.

Anyway, let me know what you think about this. It's just a refactoring in a way, no big deal.

r? bootstrap
  • Loading branch information
bors committed Jul 31, 2023
2 parents b321edd + 142995f commit 3114eb1
Show file tree
Hide file tree
Showing 16 changed files with 225 additions and 314 deletions.
1 change: 1 addition & 0 deletions Cargo.lock
Original file line number Diff line number Diff line change
Expand Up @@ -2495,6 +2495,7 @@ dependencies = [
"serde_json",
"sysinfo",
"tar",
"tempfile",
"xz",
"zip",
]
Expand Down
62 changes: 0 additions & 62 deletions src/bootstrap/bolt.rs

This file was deleted.

15 changes: 4 additions & 11 deletions src/bootstrap/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -232,8 +232,8 @@ pub struct Config {
pub llvm_profile_use: Option<String>,
pub llvm_profile_generate: bool,
pub llvm_libunwind_default: Option<LlvmLibunwind>,
pub llvm_bolt_profile_generate: bool,
pub llvm_bolt_profile_use: Option<String>,

pub reproducible_artifacts: Vec<String>,

pub build: TargetSelection,
pub hosts: Vec<TargetSelection>,
Expand Down Expand Up @@ -1127,15 +1127,6 @@ impl Config {
config.free_args = std::mem::take(&mut flags.free_args);
config.llvm_profile_use = flags.llvm_profile_use;
config.llvm_profile_generate = flags.llvm_profile_generate;
config.llvm_bolt_profile_generate = flags.llvm_bolt_profile_generate;
config.llvm_bolt_profile_use = flags.llvm_bolt_profile_use;

if config.llvm_bolt_profile_generate && config.llvm_bolt_profile_use.is_some() {
eprintln!(
"Cannot use both `llvm_bolt_profile_generate` and `llvm_bolt_profile_use` at the same time"
);
exit!(1);
}

// Infer the rest of the configuration.

Expand Down Expand Up @@ -1471,6 +1462,8 @@ impl Config {
config.rust_profile_generate = flags.rust_profile_generate;
}

config.reproducible_artifacts = flags.reproducible_artifact;

// rust_info must be set before is_ci_llvm_available() is called.
let default = config.channel == "dev";
config.omit_git_hash = omit_git_hash.unwrap_or(default);
Expand Down
131 changes: 3 additions & 128 deletions src/bootstrap/dist.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,7 @@ use std::process::Command;

use object::read::archive::ArchiveFile;
use object::BinaryFormat;
use sha2::Digest;

use crate::bolt::{instrument_with_bolt, optimize_with_bolt};
use crate::builder::{Builder, Kind, RunConfig, ShouldRun, Step};
use crate::cache::{Interned, INTERNER};
use crate::channel;
Expand Down Expand Up @@ -1941,19 +1939,7 @@ fn install_llvm_file(builder: &Builder<'_>, source: &Path, destination: &Path) {
return;
}

// After LLVM is built, we modify (instrument or optimize) the libLLVM.so library file.
// This is not done in-place so that the built LLVM files are not "tainted" with BOLT.
// We perform the instrumentation/optimization here, on the fly, just before they are being
// packaged into some destination directory.
let postprocessed = if builder.config.llvm_bolt_profile_generate {
builder.ensure(BoltInstrument::new(source.to_path_buf()))
} else if let Some(path) = &builder.config.llvm_bolt_profile_use {
builder.ensure(BoltOptimize::new(source.to_path_buf(), path.into()))
} else {
source.to_path_buf()
};

builder.install(&postprocessed, destination, 0o644);
builder.install(&source, destination, 0o644);
}

/// Maybe add LLVM object files to the given destination lib-dir. Allows either static or dynamic linking.
Expand Down Expand Up @@ -2038,117 +2024,6 @@ pub fn maybe_install_llvm_runtime(builder: &Builder<'_>, target: TargetSelection
}
}

/// Creates an output path to a BOLT-manipulated artifact for the given `file`.
/// The hash of the file is used to make sure that we don't mix BOLT artifacts amongst different
/// files with the same name.
///
/// We need to keep the file-name the same though, to make sure that copying the manipulated file
/// to a directory will not change the final file path.
fn create_bolt_output_path(builder: &Builder<'_>, file: &Path, hash: &str) -> PathBuf {
let directory = builder.out.join("bolt").join(hash);
t!(fs::create_dir_all(&directory));
directory.join(file.file_name().unwrap())
}

/// Instrument the provided file with BOLT.
/// Returns a path to the instrumented artifact.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct BoltInstrument {
file: PathBuf,
hash: String,
}

impl BoltInstrument {
fn new(file: PathBuf) -> Self {
let mut hasher = sha2::Sha256::new();
hasher.update(t!(fs::read(&file)));
let hash = hex::encode(hasher.finalize().as_slice());

Self { file, hash }
}
}

impl Step for BoltInstrument {
type Output = PathBuf;

const ONLY_HOSTS: bool = false;
const DEFAULT: bool = false;

fn should_run(run: ShouldRun<'_>) -> ShouldRun<'_> {
run.never()
}

fn run(self, builder: &Builder<'_>) -> PathBuf {
if builder.build.config.dry_run() {
return self.file.clone();
}

if builder.build.config.llvm_from_ci {
println!("warning: trying to use BOLT with LLVM from CI, this will probably not work");
}

println!("Instrumenting {} with BOLT", self.file.display());

let output_path = create_bolt_output_path(builder, &self.file, &self.hash);
if !output_path.is_file() {
instrument_with_bolt(&self.file, &output_path);
}
output_path
}
}

/// Optimize the provided file with BOLT.
/// Returns a path to the optimized artifact.
///
/// The hash is stored in the step to make sure that we don't optimize the same file
/// twice (even under different file paths).
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct BoltOptimize {
file: PathBuf,
profile: PathBuf,
hash: String,
}

impl BoltOptimize {
fn new(file: PathBuf, profile: PathBuf) -> Self {
let mut hasher = sha2::Sha256::new();
hasher.update(t!(fs::read(&file)));
hasher.update(t!(fs::read(&profile)));
let hash = hex::encode(hasher.finalize().as_slice());

Self { file, profile, hash }
}
}

impl Step for BoltOptimize {
type Output = PathBuf;

const ONLY_HOSTS: bool = false;
const DEFAULT: bool = false;

fn should_run(run: ShouldRun<'_>) -> ShouldRun<'_> {
run.never()
}

fn run(self, builder: &Builder<'_>) -> PathBuf {
if builder.build.config.dry_run() {
return self.file.clone();
}

if builder.build.config.llvm_from_ci {
println!("warning: trying to use BOLT with LLVM from CI, this will probably not work");
}

println!("Optimizing {} with BOLT", self.file.display());

let output_path = create_bolt_output_path(builder, &self.file, &self.hash);
if !output_path.is_file() {
optimize_with_bolt(&self.file, &self.profile, &output_path);
}
output_path
}
}

#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct LlvmTools {
pub target: TargetSelection,
Expand Down Expand Up @@ -2390,8 +2265,8 @@ impl Step for ReproducibleArtifacts {
tarball.add_file(path, ".", 0o644);
added_anything = true;
}
if let Some(path) = builder.config.llvm_bolt_profile_use.as_ref() {
tarball.add_file(path, ".", 0o644);
for profile in &builder.config.reproducible_artifacts {
tarball.add_file(profile, ".", 0o644);
added_anything = true;
}
if added_anything { Some(tarball.generate()) } else { None }
Expand Down
7 changes: 2 additions & 5 deletions src/bootstrap/flags.rs
Original file line number Diff line number Diff line change
Expand Up @@ -149,12 +149,9 @@ pub struct Flags {
/// generate PGO profile with llvm built for rustc
#[arg(global(true), long)]
pub llvm_profile_generate: bool,
/// generate BOLT profile for LLVM build
/// Additional reproducible artifacts that should be added to the reproducible artifacts archive.
#[arg(global(true), long)]
pub llvm_bolt_profile_generate: bool,
/// use BOLT profile for LLVM build
#[arg(global(true), value_hint = clap::ValueHint::FilePath, long, value_name = "PROFILE")]
pub llvm_bolt_profile_use: Option<String>,
pub reproducible_artifact: Vec<String>,
#[arg(global(true))]
/// paths for the subcommand
pub paths: Vec<PathBuf>,
Expand Down
1 change: 0 additions & 1 deletion src/bootstrap/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,6 @@ use crate::util::{
dir_is_empty, exe, libdir, mtime, output, run, run_suppressed, symlink_dir, try_run_suppressed,
};

mod bolt;
mod builder;
mod cache;
mod cc_detect;
Expand Down
6 changes: 0 additions & 6 deletions src/bootstrap/llvm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -342,12 +342,6 @@ impl Step for Llvm {
if let Some(path) = builder.config.llvm_profile_use.as_ref() {
cfg.define("LLVM_PROFDATA_FILE", &path);
}
if builder.config.llvm_bolt_profile_generate
|| builder.config.llvm_bolt_profile_use.is_some()
{
// Relocations are required for BOLT to work.
ldflags.push_all("-Wl,-q");
}

// Disable zstd to avoid a dependency on libzstd.so.
cfg.define("LLVM_ENABLE_ZSTD", "OFF");
Expand Down
Loading

0 comments on commit 3114eb1

Please sign in to comment.