-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] parameterize -C prefer-dynamic
#88101
[WIP] parameterize -C prefer-dynamic
#88101
Conversation
r? @jackh726 (rust-highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
83f3499
to
dff2577
Compare
No rest for the wicked. (I got punished for adding type annotations of the form |
This comment has been minimized.
This comment has been minimized.
dff2577
to
0093bb6
Compare
This comment has been minimized.
This comment has been minimized.
Hmm, yeah I was worried about something like this:
i.e. the error output is tightly coupled to the linker's error formatting. I'll go see if I can use one of the old coarser grain error-pattern checking systems. Hopefully those haven't been retired yet. :) |
0093bb6
to
d483cd8
Compare
This comment has been minimized.
This comment has been minimized.
d483cd8
to
411bdf5
Compare
Added reverse map that tells us for a given dependency and its preferred format, what were the parent crates that requested that dependency+format combination.
(type annotations are to help rust-analyzer as much as humans.)
…tes to prefer dynamic linkage on. Note: carrying an empty set here is *not* the same as `-C prefer-dyanmic=no` (and the latter is the default behavior for `rustc`). As discussed in detail in the comments for `enum PreferDynamicSet` (and also in the comments of `rustc_metadata::dependency_format`), `-C prefer-dynamic=no` asks the compiler to guess what linkage to use, and it currently guesses all static first, and if that fails, then it prefers dynamic linkage for all crates that provide both dynamic and static libraries. Using an empty set for `-C prefer-dynamic`, on the other hand, means the compiler should never prefer dynamic linkage over static linkage, even if that means it won't be able to successfully link the build product.
…hen we fail due to repeated static library.
411bdf5
to
d0089c8
Compare
This comment has been minimized.
This comment has been minimized.
One thing still missing from this PR is a run-make test illustrating that the dynamically linked libraries are the set that we expect. The usual pattern for testing that, which you can see in existing run-make tests, is to run the binary successfully, then delete the dylib and attempt to re-run the binary, now expecting a runtime failure. I do plan to add some run-make tests in that vein to this PR; I just wanted to get this up and get some review feedback on it before I invested time on that. |
Okay I'll need to normalize the |
d0089c8
to
3c0b75d
Compare
let link2_parent_names = parent_names(link2); | ||
|
||
let details = match (link2, link) { | ||
(RequireDynamic, RequireStatic) => format!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💖
} | ||
} | ||
|
||
pub fn contains_crate(&self, crate_name: Symbol) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using just the crate name is not a good idea. It doesn't allow distinguishing between two different versions of the same library. Ideally it would accept the StableCrateId, but that is not accessible outside of rustc without depending on nightly (-Zls
?) or implementation details. Maybe also allow accepting a dylib path or combination of crate name, if it has bin as crate type in addition to dylib and the list of -Cmetadata
arguments. The StableCrateId
is reproducible using just this information. I don't think we can't change what it depends on as that would change which crates can be linked into the same crate graph and which can't: https://github.com/rust-lang/rust/blob/master/compiler/rustc_span/src/def_id.rs#L148-L175
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would some encoding of crate name and an optionally-supplied version be sufficient to satisfy your criteria here? I'm thinking in the common cases, a single crate name is exactly what an end user will want to write down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that would work I guess. Cargo could specify everything and a user just the crate name.
Some(s) => { | ||
let v = s.split(',').map(|s| s.to_string()).collect(); | ||
PreferDynamicSet::Set(v) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it should be a different argument as the new option is not a preference, but mandatory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. My intention was that it would still be a preference, in the sense that if you do -C prefer-dynamic=crate1
, but only a static library for crate1
is available, then linkage will still proceed using the static library.
I probably should add a test illustrating that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection, I do think I must have messed up something in my logic that ends up making something a requirement rather than a preference. Specifically, I'm working on a run-make test analogous to https://github.com/rust-lang/rust/blob/master/src/test/run-make-fulldeps/dylib-chain/Makefile
but where I choose variations of whether dylibs on the chain have both rlib and dylib, or just rlib, or just dylib.
The problem I'm seeing is that in some scenarios using -C prefer-dynamic=crate,...
, a crate that is solely provided as a dylib is not getting linked in unless I include it in the above list. That wasn't my intention here: My intention was that the preference is meant to give guidance when the compiler must make a choice, not act as an assertion about what it expects to be present.
Update: I forgot that of course the choice isn't necessarily about which file to link in; you can still statically link a .dylib
file. So even if I generate just a .dylib
for some crate in the chain, the crates linking to it still have a choice to make, and this flag is going to affect that choice.
(Or at least, that's my current interpretation of the behavior I'm observing. Still digging a bit deeper.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update 2: Hmm maybe I was wrong about the above: In general .so
files don't have enough info to be statically linked. But then I'm really confused about some of the behavior I'm seeing. Still looking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I have now made a run-make test that I think illustrates the current behavior of the system, including cases where rlib's end up depending on dylibs.
I don't think pre-existing behavior is necessarily intuitive, but my main goal right now is to address missing use cases, not to try to figure out what the ideal behavior is for all cases and potentially inject breaking changes to support that idealized behavior.
|
||
Note that the explicit list of crate names variant of this flag is gated behind | ||
`-Zprefer-dynamic-subset`; as a special case, one can enable the handling of | ||
the single crate `std` as shown in the example above via `-Zprefer-dynamic-std`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those options don't exist at least as of this commit.
Edit: these are introduced in the next commit.
// skip traversal if we know it cannot ever bear fruit. | ||
false | ||
} else { | ||
tcx.crates(()).iter().any(|cnum| prefer_dynamic.contains_crate(tcx.crate_name(*cnum))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should skip non-linkable crates like proc-macros and their dependencies I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I'll look into that.
|
||
pub fn check_linked_function_equivalence() { | ||
// Check that the linked functions are the same code by comparing their | ||
// underlying addresses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't be surprised if the dynamic linker would resolve each symbol with the same name to the same address at runtime even if it is duplicated between dylibs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly... I have certainly been thinking that the testing methodology here might be flawed. I'm still musing about the best way to write these tests (i.e. to detect unwanted duplication of static libraries across dynamic libraries being loaded).
I'm still planning to invest some effort into a run-make
test, to check the most basic scenario of deleting a dynamic library and observing that this injects a failure to run the binary. But that isn't quite the same check that I'm trying to put in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could do an objdump of dylibs to check which symbols are defined and imported. Imported symbols use 0 as specified address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I added a run-make test; it does two things:
- it uses the pre-existing pattern of building, running to success, then deleting dylibs, and checking that a re-run attempt fails. (I think this testing methodology is fine, as far as it goes.)
- it introduces a new pattern of having methods in the compiled code that differ depending on whether one is linked to the rlib or to the dylib. This is controlled via
cfg
flags. This way, we can directly determine which object code has been linked and loaded when we finally run the program.
This comment has been minimized.
This comment has been minimized.
r? @bjorn3 |
It is worth explicitly noting: The The other five tests, v1-4 and 6, all work without debugflags, and represent pre-existing behavior. I have manually confirmed that those five tests all work as written on the compiler before the changes in this PR. In other words, the test is largely codifying the pre-existing behavior of |
# delimited by single quotes ('') rathern than double quotes ("") so that I can use | ||
# backticks (``) in the notes without it being interpreted as a shell invocation. | ||
# | ||
# `COMMAND && exit 1 && exit 0` is how we run a command and say that we're |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# `COMMAND && exit 1 && exit 0` is how we run a command and say that we're | |
# `COMMAND && exit 1 || exit 0` is how we run a command and say that we're |
I left some comments on the Zulip MCP thread https://rust-lang.zulipchat.com/#narrow/stream/233931-t-compiler.2Fmajor-changes/topic/prefer-dynamic.3Dsubset.20compiler-team.23455 The whole idea is a pretty big hack, that is not much better than the workarounds that exist currently. The main problem is that there's no clean way to refer to indirect dependencies of a crate, because they are identified by hash (the crate name is used to prune the file search space, but it's not an identifier). For crates controlled by cargo (or other build system) the build system can identify the crate by (one of) its file path(s), and pass that path to (I also have comments about the specific implementation in this PR, but I'll leave them for the later time when the general direction is settled.) |
Another idea is to use the fact that So to refer to an indirect dependency we can add a new direct dependency referring to our crate of interest (by path |
Sorry, @petrochenkov , I am not fully understanding this. When you say "to refer to an indirect dependency we can add a new direct dependency referring to our crate of interested", do you mean "refer" as in what internal data structures that Or are you somehow suggesting a different way of invoking the existing |
Maybe, but I don't understand what you mean here.
Certainly not this. By "refer to a crate X" I mean "somehow name that crate from command line". |
☔ The latest upstream changes (presumably #89266) made this pull request unmergeable. Please resolve the merge conflicts. |
-C prefer-dynamic
-C prefer-dynamic
I'm going to revisit my approach here, maybe try to work out something that better leverages things like PR #93901, (assuming that lands). |
see also MCP rust-lang/compiler-team#455
This PR generalizes
-C prefer-dynamic
so that one can specify a specific subset of the crates that should have the preferred-dynamic linkage.Without this PR, one generally has to decide globally whether dynamic linkage will be preferred (via
-C prefer-dynamic=on
, or just-C prefer-dynamic
) or if static linkage will be preferred (via-C prefer-dynamic=off
, or just omitting the option entirely). The fact that this is a global choice means that one can get into tricky scenarios where, because of choices made by your crate dependencies for what output types they generate, your crate hits strange linkage errors. (For one example of this, see #82151.)There are multiple ways to address the above problem, such as changing the crate dependencies to reduce the kinds of output crate types they generate (see e.g. alexkornitzer/hyper@1199048), or choosing
-C prefer-dynamic=no
to stoprustc
from injecting linkage directives, and then adding on additionalRUSTFLAGS=-l<crate_name>
flags to explicitly link to the dynamic libraries you want.This PR suggests a different way to address it: By allowing the user to specify a list of crates,
-C prefer-dynamic=crate1,crate2,...
they believe should be given preferred dynamic linkage.Since
std
comes up as a common example here,-C prefer-dynamic=std
is given special treatment. You can opt into arbitrary lists of crates via-Z prefer-dynamic-subset
. Or you can enable the use of the singleton command line option-C prefer-dynamic=std
via-Z prefer-dynamic-std
.-Z
flags for these cases, is to fast track the stabilization of-C prefer-dynamic=std
, because I think that will provide the biggest impact, and has relatively low risk (in terms of it not having potential for exposing sharp corners of rustc's internal heuristics for determining dynamic linkage). We can let the more general-C prefer-dynamic=crate1,crate2,...
simmer; for example, we may want to make more changes to how chains of rlibs and dylibs are handled, and that may interact with theprefer-dynamic-subset
feature.