When are static symbols guaranteed to show up in the final binary?
#504
Thanks, Ralf!
Motivation
As far as I am aware, the only way to guarantee that a symbol is present without compiler magic is by performing a Unfortunately, Rust's semantics do not guarantee "
Potential solution
To solve this problem, I propose to make used-ness a dynamic property of symbols that is either provided a priori by In the motivating example, the
If the act of executing the read_volatile marks the static as used, that is still incorrect. The TLS callback is called both when spawning a thread and when destroying it, but the read_volatile would only be executed after spawning the first thread, and thus after the first time the callback should have been called. While it may happen to work here because we only care about the thread-exit callback for a destructor, it would not work at all for constructors.
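For readers unfamiliar with the trick under discussion, here is a minimal sketch of a volatile read of a static. The names `CALLBACK_SLOT` and `touch_slot` are hypothetical and do not come from std; the real code in std places a callback pointer in a platform-specific section, which this sketch deliberately leaves out.

```rust
use std::ptr;

// Hypothetical stand-in for a static that a loader would look for.
// The value is arbitrary and only used to make the read observable.
static CALLBACK_SLOT: usize = 0x1234;

/// Performs a volatile read of the static, so the compiler cannot
/// optimize the access away.
pub fn touch_slot() -> usize {
    // SAFETY: the pointer comes from a reference to a live, initialized static.
    unsafe { ptr::read_volatile(&CALLBACK_SLOT as *const usize) }
}
```

The read only happens when `touch_slot` is actually called, which is the timing problem described above: anything the loader does with the symbol at startup precedes it.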
Yes, indeed! But that seems fine to me. I really cannot come up with an opsem that would allow constructors to work with Still, I'd argue that dynamic used-ness should be used, not just because it'd fix
I'm not sure any definition of "final binary" is an opsem question. Insofar as it affects constructor execution, C++ (which does have static constructors as a language feature) couldn't do much about linkers and runtime loaders, and just made it "happens-before the first odr-use of anything in the same translation unit".
Since the goal is manipulating the generated binary, I would think the "correct" way for std to ensure mention, if Also, I don't see why I don't feel like "is a symbol mentioned" is a meaningful question to ask at the AM layer; it's a property of the concrete machine's object format after lowering from the AM. As a low level language we do care about guaranteeing some properties of that lowering, but it might be acceptable if those are a bit less formal than the abstract opsem.
To extend Connor's comment, in theory it should be the compiler's responsibility to ensure any magic symbol mentions needed to enable TLS happen, not std's, as running of TLS dtors is a language feature requiring cooperation with the host. (But in fairness std is privileged.)
This is indeed a point I've made before. But this is something that requires both compiler work and at least some domain knowledge, whereas hacks in std are "easy" and they've "worked" for years, so there's little motivating it.
Eh, I don't agree that it's fundamentally not
Well in this case, LLVM does in fact do the right thing. You can use TLS destructors are slightly different. In C/C++ these are provided by the runtime (e.g. MSVC) in library files. These have object files compiled in such a way that including the function to register a destructor also includes the thing that runs destructors. But again, std can't really control which functions are "connected" to which other functions.
AFAIK there is a way to do this, which us
For TLS dtors that may be the best solution. But what if someone needs some other special magic linker symbol in the future and wants it to be "used" in a way that can be optimized, with tricks similar to
That's really the key point. opsem defines what happens in a single execution, and in all possible executions. The binary is an intermediate artifact that opsem has no bearing on (and in fact opsem applies to ways of running Rust code where there is no binary, such as via Miri). So there's a fundamental mismatch here where in practice the symbol will show up if any execution of the program may do the volatile read -- so even if the read happens half-way through execution, the corresponding callback will have been invoked already at program startup, before there was any way to know whether the read will happen!
All Miri can do is check whether the current execution has done the read in the past. So when we go look up a linker array, we could say that all statics that had volatile reads in the past are definitely included (though this is a really strange coupling of what should be unrelated concepts) -- but we can't do anything about symbols where the volatile read will happen in the future.
If this could be made a compiler primitive, that would indeed make things simpler. But then I worry that this will not be the last time someone wants to play tricks like this, and we can't add a new compiler primitive each time.
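The mismatch can be made concrete with a toy model. This is only a sketch of the idea that an interpreter can see past reads but not future ones; `Execution`, `volatile_read`, `linker_array`, and the symbol name `TLS_CALLBACK` are all hypothetical and are not Miri's actual data structures or API.

```rust
use std::collections::BTreeSet;

/// Toy model of "dynamic used-ness" within a single execution.
#[derive(Default)]
pub struct Execution {
    /// Names of statics that have been volatile-read so far.
    read_so_far: BTreeSet<&'static str>,
}

impl Execution {
    /// Records that a volatile read of the static `name` has happened.
    pub fn volatile_read(&mut self, name: &'static str) {
        self.read_so_far.insert(name);
    }

    /// What a lookup of the linker array could report right now:
    /// only statics whose volatile read is already in the past.
    pub fn linker_array(&self) -> Vec<&'static str> {
        self.read_so_far.iter().copied().collect()
    }
}

/// Looks up the array at "startup", performs the read, then looks again.
pub fn startup_then_read() -> (Vec<&'static str>, Vec<&'static str>) {
    let mut exec = Execution::default();
    let at_startup = exec.linker_array(); // the read has not happened yet
    exec.volatile_read("TLS_CALLBACK");
    let after_read = exec.linker_array();
    (at_startup, after_read)
}
```

In this model the startup lookup comes back empty even though the symbol is read later, whereas a real linker would have included the symbol from the start.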
It probably needed
A more general mechanism would be something like
This came up in rust-lang/miri#450 but is really a t-opsem question, not a Miri question.
Obviously anything marked #[used] is guaranteed to show up (though it seems there are issues that make this not always true, but this seems like a rustc implementation issue to me; the spec is unambiguous).
But @joboet wants this to be guaranteed for more cases. I'll let them do the motivation and summary for that.
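As a point of reference, a minimal sketch of the attribute in question follows. The static `KEEP_ME` and the helper `marker_value` are hypothetical names chosen for illustration; the sketch only shows where the attribute goes and does not by itself demonstrate what any particular linker keeps.

```rust
// Hypothetical marker static. `#[used]` asks the compiler to keep it in the
// emitted object file even if nothing else in the crate references it.
#[used]
static KEEP_ME: u32 = 0xC0FFEE;

/// Ordinary (non-volatile) read, present only so the example has
/// something observable to check.
pub fn marker_value() -> u32 {
    KEEP_ME
}
```

Whether such a static survives into the final linked artifact on every platform is the implementation question mentioned in the parenthetical above.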