Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define strategy for upstreaming our Rust and LLVM forks #21483

Closed
alessandrod opened this issue Nov 29, 2021 · 21 comments
Closed

Define strategy for upstreaming our Rust and LLVM forks #21483

alessandrod opened this issue Nov 29, 2021 · 21 comments
Labels
stale [bot only] Added to stale content; results in auto-close after a week.

Comments

@alessandrod
Copy link
Contributor

Problem

Solana currently maintains a fork of the rust compiler and a fork of LLVM. We want to define a strategy to upstream both.

Proposed Solution

Solana has diverged from eBPF and is planning to diverge even further. #20323 discusses the kind of changes we're planning to introduce to the new bytecode format, SBFv2.

We should also decide how we're planning to upstream SBFv2, so that at some point we can stop maintaining separate Rust and LLVM forks. We've discussed bits of this among some of us in a few places, I figured it'd be good to write it all down in one place. Since I don't want to derail the conversation in #20323, I'm creating a new issue here.

The present

We have a Rust fork at https://github.com/solana-labs/rust. The fork defines a new target named bpfel-unknown-unknown, which is in process of being renamed to sbf-solana-solana (#19113), and includes a number of changes to libcore and libstd, among others.

The target depends on our LLVM fork at https://github.com/solana-labs/llvm-project. The fork includes changes to the existing BPF target, some of which could potentially be upstreamed, and some of which are fundamentally not acceptable by the upstream BPF target developers.

There are probably a few, but for example one change that I don't think is upstreamable is anza-xyz/llvm-project@733f493.

Since Rust has a policy that targets must work with upstream LLVM, and since (I'm fairly sure) bpfel-unknown-unknown wouldn't be able to compile core and std without the above change, as things stand today I think bpfel-unknown-unknown is effectively not upstreamable.

The possible futures!

New SBF target in LLVM

This is what I think we've decided to do now. We're going to create a new LLVM target called SBF. SBF is (probably) going to start as a fork of the BPF target, but we're open to doing something new entirely (this is being discussed in #20323).

We are then going to make the sbf-solana-solana rust backend use SBF. Because again rust targets must work with upstream LLVM, before we can upstream the rust target we must upstream SBF to LLVM.

LLVM's policy for adding new targets is at https://llvm.org/docs/DeveloperPolicy.html#adding-a-new-target. Skimming through it, I don't see any blockers to getting SBF upstream. Before we embark on this journey tho, I think it would be good to engage with the LLVM developers and see how they feel about it.

This is in many ways the ideal future, but I do expect upstreaming the new target to take a significant amount of time (several months), which means that for the foreseeable future, we'd have to keep maintaining our rust fork.

Make sbf-solana-solana work like the upstream bpf*-unknown-none targets

If we wanted to upstream sbf-solana-solana earlier, there might be an ad-interim option that involves making the target work like the upstream bpf*-unknown-none targets.

The way the bpf*-unknown-none targets manage to avoid codegen issues is by not doing codegen in rustc at all. Those targets will only produce LLVM bitcode instead, then hand it to https://github.com/aya-rs/bpf-linker, which will massage the IR before doing codegen. bpf-linker can either use LLVM provided by rust, or use a custom LLVM fork.

We could have sbf-solana-solana do the same thing, produce LLVM IR then invoke a tool (maybe even a fork of bpf-linker) to do code generation. The tool would initially link (statically?) to our LLVM fork. Perhaps we could even do this in parallel to upstreaming SBF in LLVM, and then once upstream, we could switch the tool to using LLVM provided by rust (or get rid of the tool altogether).

...more options?

@seanyoung
Copy link
Contributor

Depending on how different SBFv2 ends up being from BPF, it might be possible to add SBFv2 as a subtarget of the llvm bpf target. If at all possible, this might be more palatable to the llvm maintainers, and this will also be less maintenance work once it is merged.

@seanyoung
Copy link
Contributor

In fact, solana-flavoured-bpf could be added a subtarget of the llvm target today, no need to wait for SBFv2. I don't know if this is worth the engineering effort though.

@alessandrod
Copy link
Contributor Author

In fact, solana-flavoured-bpf could be added a subtarget of the llvm target today, no need to wait for SBFv2. I don't know if this is worth the engineering effort though.

Technically we do have a subtarget, the question is whether we can upstream it. I think there were some preliminary discussions with the upstream BPF developers and they didn't seem too keen.

@seanyoung
Copy link
Contributor

Technically we do have a subtarget, the question is whether we can upstream it. I think there were some preliminary discussions with the upstream BPF developers and they didn't seem too keen.

Is that discussion public?

@jackcmay
Copy link
Contributor

jackcmay commented Nov 29, 2021

I couldn't find any direct conversation about this with the BPF folks (doesn't mean there wasn't one) but I think I heard it through he grape vine too.

@jackcmay
Copy link
Contributor

I don't see any particular reason to rush the upstreaming and generate additional work in the interim.

@dmakarov
Copy link
Contributor

Indeed, we already added SubtargetFeature "solana" to the BPF target in llvm. All our llvm changes are enabled only when the feature is enabled by a command line option. I added a new triplet sbf-solana-solana (only on my fork's branch at the moment), which acts as an alias for bpfle-unknown-unknown. I'll make LLVM enable solana subtarget feature when the sbf-solana-solana triplet is used. Maybe then the first step in submitting our changes upstream could be trying to get the subtarget accepted? However, my concern is if we have the subtarget accepted to upstream, but later create a separate target SBF, it will be more difficult to get the new target accepted (maybe not).

A note: we'll probably always build SBF tools from our own forks of rust-lang/rust and rust-lang/llvm-project, in the same way rustc is built using the rust-lang/llvm-project fork of llvm.org's llvm-project. Upstreaming our changes will make it easier to maintain the forks, but in general there may always be some amount of differences between the source trees we use to build SBF tools and the upstream repositories.

@alessandrod
Copy link
Contributor Author

A note: we'll probably always build SBF tools from our own forks of rust-lang/rust and rust-lang/llvm-project, in the same way rustc is built using the rust-lang/llvm-project fork of llvm.org's llvm-project. Upstreaming our changes will make it easier to maintain the forks, but in general there may always be some amount of differences between the source trees we use to build SBF tools and the upstream repositories.

That might be fine, but keep in mind that a requirement to upstream the rust target is that the target is fully functional, so that people can use rustup cargo etc and develop working programs. So for example having experimental changes only in our fork while they mature would be ok, but ultimately we should strive to upstream everything.

@dmakarov
Copy link
Contributor

This is a good point. Yes the solana forks will be used as buffers for changes/fixes that eventually should be submitted to upstream.

@dmakarov
Copy link
Contributor

dmakarov commented Dec 1, 2021

Make sbf-solana-solana work like the upstream bpf*-unknown-none targets

We could have sbf-solana-solana do the same thing, produce LLVM IR then invoke a tool (maybe even a fork of bpf-linker) to do code generation. The tool would initially link (statically?) to our LLVM fork. Perhaps we could even do this in parallel to upstreaming SBF in LLVM, and then once upstream, we could switch the tool to using LLVM provided by rust (or get rid of the tool altogether).

I like this option. Some consideration: we have a nice build process, in which when rustc is built we also build clang/llvm toolchain. We support development of on-chain programs in C/C++ (with severe limitations, but nevertheless). So we maintain two toolchains (Rust and C/C++). I guess in our fork of rust-lang/rust we can continue using our fork of rust-lang/llvm-project even if it's not used by rustc to generate the BPF code after we switch to using bpf-linker, and so we don't have to change our build process (https://github.com/dmakarov/bpf-tools). Does bpf-linker invoke ld.lld as a final stage of compiling a loadable binary module? We also generate a custom linker script for --target bpf-unknown-unknown in rustc, when it invokes ld.lld for making our .so modules (sort of our RBPF executable modules).

@alessandrod
Copy link
Contributor Author

Make sbf-solana-solana work like the upstream bpf*-unknown-none targets

We could have sbf-solana-solana do the same thing, produce LLVM IR then invoke a tool (maybe even a fork of bpf-linker) to do code generation. The tool would initially link (statically?) to our LLVM fork. Perhaps we could even do this in parallel to upstreaming SBF in LLVM, and then once upstream, we could switch the tool to using LLVM provided by rust (or get rid of the tool altogether).

I like this option. Some consideration: we have a nice build process, in which when rustc is built we also build clang/llvm toolchain. We support development of on-chain programs in C/C++ (with severe limitations, but nevertheless). So we maintain two toolchains (Rust and C/C++). I guess in our fork of rust-lang/rust we can continue using our fork of rust-lang/llvm-project even if it's not used by rustc to generate the BPF code after we switch to using bpf-linker, and so we don't have to change our build process (https://github.com/dmakarov/bpf-tools).

Yep as long as we keep upstream rustc fully working, then we can do whatever we want to dist our toolchain.

Does bpf-linker invoke ld.lld as a final stage of compiling a loadable binary module? We also generate a custom linker script for --target bpf-unknown-unknown in rustc, when it invokes ld.lld for making our .so modules (sort of our RBPF executable modules).

bpf-linker does LTO across all the codegen units and emits an object file. We could either patch bpf-linker or write a new tool (doesn't really matter that much) and call ld.lld as the final step.

@jackcmay
Copy link
Contributor

jackcmay commented Dec 1, 2021

@dmakarov Are LTO and all the opt levels working now for our version of BPF?

@dmakarov
Copy link
Contributor

dmakarov commented Dec 1, 2021

We currently have LTO disabled because with LTO incorrect BPF code is produced in some cases (need investigation and fixing). The default optimization level in our toolchain is O3 (which is the maximal possible level of optimizations).

@alessandrod
Copy link
Contributor Author

We currently have LTO disabled because with LTO incorrect BPF code is produced in some cases (need investigation and fixing).

Do you have a case to reproduce the issue? I fixed a LTO issue while working on tests last week, might as well look into this other issue now that that part of the code is still loaded in my brain

@dmakarov
Copy link
Contributor

dmakarov commented Dec 1, 2021

If I remember correctly, running solana/scripts/build-downstream-projects.sh should fail on some test if -C lto=no is not added to RUSTFLAGS environment variable. We automatically add this option in cargo-build-bpf

let rustflags = match env::var("RUSTFLAGS") {
Ok(rf) => {
if rf.contains("-C lto=no") {
rf
} else {
rf + &" -C lto=no".to_string()
}
}
_ => "-C lto=no".to_string(),
};

@alessandrod
Copy link
Contributor Author

We could have sbf-solana-solana do the same thing, produce LLVM IR then invoke a tool (maybe even a fork of bpf-linker) to do code generation.

While learning more about our fork I realized that another good reason to push codegen to a later stage, making the target emit LLVM bitcode, is that we currently have a number of fixes that are required to compile, but are not actually functional. This includes locks, thread_local!() and threading code in general, anything that creates static muts or static Cells, etc. There are quite a few places in std that use those (eg HashMaps use thread_local!(), something I now fixed in anza-xyz/rust#41).

The issue with those "compile only" fixes is that they will realistically never be upstreamable, since they technically don't work. Also - and probably worse - they allow unsupported parts of std to compile correctly, to then fail only at runtime in rbpf.

If we moved to generating IR we could remove those changes and make the actual code generator fail if unsupported APIs are used, as opposed to failing at runtime. Or alternatively could keep the target as it is today, but compile std with -Z build-std and finish the upstream work to allow std to be pulled from crates.io. In that scenario we wouldn't compile std at all as part of building rustc for sbf-solana-solana. We would fork std, upload it on crates.io as sbf-std, and then we'd have complete freedom in optimizing/tweaking it for our needs.

@dmakarov
Copy link
Contributor

We had rust standard libraries as a separate repository https://github.com/solana-labs/rust-bpf-sysroot
It was difficult to update to new versions of rust. I think it would be better if we maintain the entire toolchain including libraries in a single repository. Maybe we can review the disabled library features and move all SBF specific stuff under sys/sbf/. Not all library features are supported on all platforms (wasm for example has some limitations too). If we could isolate unsupported features under sys/sbf and provide no-op implementations of unsupported features, maybe that would be acceptable?

@alessandrod
Copy link
Contributor Author

We had rust standard libraries as a separate repository https://github.com/solana-labs/rust-bpf-sysroot
It was difficult to update to new versions of rust. I think it would be better if we maintain the entire toolchain including libraries in a single repository

We would still develop our std fork in our rust repo, we'd just publish it on crates.io instead of pre-compiling it and putting it in the sysroot. The point of doing that would be that we could upstream our rustc changes fairly easily (there's nothing too controversial there), have the target work as no_std by default, then let people optionally pull our std fork (with controversial changes that I don't think can be upstreamed) from crates.io.

Not all library features are supported on all platforms (wasm for example has some limitations too). If we could isolate unsupported features under sys/sbf and provide no-op implementations of unsupported features, maybe that would be acceptable?

I think the problem is not that we don't support some std features, that's fine. The problem is that in order for std to just compile we must have some changes that please the compiler, but that are not actually functional. I think it's going to be hard to justify adding a bunch of #[cfg(target_arch = "sbf")] checks everywhere with code that doesn't even work. The wasm checks are different, they are custom to wasm yes, but they do work when executed in a wasm vm so they're more justifiable in my view.

Also something to keep in mind is that a few years ago wasm support was (rightly) identified as a key priority for the rust project. And rust is arguably the best toolchain to work with wasm today so the cost-benefit of having wasm specific code here and there is definitely worth it. But you don't see the same kind of checks for any other targets, so I don't think we can assume that we can get away with the same kind of invasive changes that the wasm backend has.

@dmakarov
Copy link
Contributor

If we go on the path of separating std and publishing it on crates.io, rust-std-sbf-solana-solana won't be available for installation via rustup as a component, correct? I guess, it's not an issue. I just wish we could simplify the installation of rust SBF toolchain.

Alessandro, your arguments make sense.

@alessandrod
Copy link
Contributor Author

If we go on the path of separating std and publishing it on crates.io, rust-std-sbf-solana-solana won't be available for installation via rustup as a component, correct? I guess, it's not an issue. I just wish we could simplify the installation of rust SBF toolchain.

Correct it won't be available with rustup, but it will be fetched automatically with cargo build, so the result would be the same. (To be clear there is some work that we'd need to do in cargo - cargo can build std with cargo build -Z build-std today, but it doesn't fetch it from crates.io, that part has been discussed in the past but not implemented yet.)

@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Dec 28, 2022
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 5, 2023
@dmakarov dmakarov reopened this Jan 5, 2023
@github-actions github-actions bot removed the stale [bot only] Added to stale content; results in auto-close after a week. label Jan 6, 2023
@mitchmindtree
Copy link

@alessandrod thanks for sharing all this! Any chance you could provide an update on the state of these upstreaming efforts?

@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Apr 25, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale [bot only] Added to stale content; results in auto-close after a week.
Projects
None yet
Development

No branches or pull requests

5 participants