Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime code is generated for functions only used in const eval #119214

Open
Kimundi opened this issue Dec 22, 2023 · 27 comments
Open

Runtime code is generated for functions only used in const eval #119214

Kimundi opened this issue Dec 22, 2023 · 27 comments
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Kimundi
Copy link
Member

Kimundi commented Dec 22, 2023

Code

I tried this code:

I expected to see this happen: Both check1 and check2 should compile to functions that just use a compiletime-evaluated struct value

Instead, this happened: check1 disapeared from the assembly, and it looks like all the code needed to runtime-evaluate it got added instead

The weird thing is that the function that regressed is the one that uses a const item by name, rather than the one that just calls a const function inside a non-const function body.

Version it worked on

It most recently worked on: 1.74.1 (stable as of writing this)

Version with regression

from playground:

1.76.0-beta.1 (2023-12-21 0e09125c6c3c2fd70d7d)
1.77.0-nightly (2023-12-21 3d0e6bed600c0175628e)
@Kimundi Kimundi added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Dec 22, 2023
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Dec 22, 2023
@asquared31415
Copy link
Contributor

Note that adding #[no_mangle] to check1 to force it to appear in the binary does cause it to appear as a constant. I think this might be a case of "in isolation, codegen is weird, you have to evaluate it at a whole program level"?

@RossSmyth
Copy link
Contributor

RossSmyth commented Dec 22, 2023

This looks like another instance of #116505, meaning that depending on some heuristics leaf functions are only codegenned in cgu that need them. Since check1 and check2 are not called, they are not seen. This is good for optimizations but quite unfortunate for inspecting asm on Compiler Explorer.

Adding #[no_mangle] will force them to be available, you can also add -C link-dead-code=yes, or turn of the optimization with -Z cross-crate-inline-threshold=never on nightly.

https://godbolt.org/z/4rfGGrbWx

@saethlin saethlin added C-discussion Category: Discussion or questions that doesn't represent real issues. and removed C-bug Category: This is a bug. I-prioritize Issue: Indicates that prioritization has been requested for this issue. regression-untriaged Untriaged performance or correctness regression. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Dec 22, 2023
@saethlin saethlin self-assigned this Dec 23, 2023
@saethlin
Copy link
Member

I think there are multiple things happening here. I'll try to explain what I'm seeing.

I expected to see this happen: Both check1 and check2 should compile to functions that just return a compiletime-evaluated struct value

That's not possible, those functions both take runtime parameters. I'm presuming you mean "use a compiletime-evaluated struct value" instead?

check1 disapeared from the assembly

This is the cross-crate-inlining effect; I've been mulling over ideas to improve the situation, but this is not a bug. It's very unfortunate though. compiler-explorer/compiler-explorer#5782

and it looks like all the code needed to runtime-evaluate it got added instead... The weird thing is that the function that regressed is the one that uses a const item by name, rather than the one that just calls a const function inside a non-const function body.

This I think is a bug. I've made a smaller example here: https://godbolt.org/z/GoEa8jc5f and I'm going to look into it...

@saethlin
Copy link
Member

saethlin commented Dec 23, 2023

The reproducer is this:

pub struct Thing;

impl Thing {
    const fn new(panic: bool) -> Self {
        if panic {
            panic!()
        }
        Thing
    }
}

const T: Thing = Thing::new(false);

pub fn oof() -> Thing {
    T
}

Since automatic cross-crate-inlining landed, the assembly generated for this crate contains Thing::new and a bunch of symbols related to panics. Which looks very goofy. At first I was afraid that I broke something in that PR (#116505) related to reachability, but I'm now quite sure I didn't. On an old compiler, if you just add #[inline] to fn oof, you'll get the goofy assembly in godbolt: https://godbolt.org/z/q4E6s4jvs

The change here is that fn oof is cross-crate-inlinable, which means that reachability analysis must consider the contents of its body to be reachable outside this CGU. This is a dark corner of the whole "automatic #[inline]" or "cross-crate-inlinable" approach that I thought I had dodged in the original PR by only inferring cross-crate-inlinability for leaf functions. Turns out I didn't. In this program, fn oof is a leaf function in MIR because it looks like this:

fn oof() -> Thing {
    let mut _0: Thing;

    bb0: {
        _0 = const _;
        return;
    }
}

But to reachability analysis, this function has now made the initializer of that const reachable. Ergo, to reachability analysis this is not a leaf function.

So cross-crate-inlining turns the surface area of this crate from "a symbol called oof that I can call and nothing else" to "a symbol called oof and also its body and everything reachable from it". So the compiler dutifully recurses through the const fn that initializes this const and instantiates everything that it reaches.

So from one perspective, this is sensible. But from the OP's, this is obviously silly. All this extra panic code was made reachable from compile time evaluation, but we stamped out runtime code for it. So I wonder if that's what we should be tracking here; if reachability determines something is reachable but only at compile time.

@saethlin saethlin added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-heavy Issue: Problems and improvements with respect to binary size of generated code. labels Dec 23, 2023
@Kimundi
Copy link
Member Author

Kimundi commented Dec 23, 2023

I expected to see this happen: Both check1 and check2 should compile to functions that just return a compiletime-evaluated struct value

That's not possible, those functions both take runtime parameters. I'm presuming you mean "use a compiletime-evaluated struct value" instead?

Err yes, that was badly formulated on my part. Fixing it in the description.

check1 disapeared from the assembly

This is the cross-crate-inlining effect; I've been mulling over ideas to improve the situation, but this is not a bug. It's very unfortunate though. compiler-explorer/compiler-explorer#5782

That fully explains it, I was not aware that we actually started cross-crate inlining for non-marked functions now. This bugreport was created based on the assumption that that does not happen for public library functions.

and it looks like all the code needed to runtime-evaluate it got added instead... The weird thing is that the function that regressed is the one that uses a const item by name, rather than the one that just calls a const function inside a non-const function body.

This I think is a bug. I've made a smaller example here: https://godbolt.org/z/GoEa8jc5f and I'm going to look into it...

Thanks! Good to know I discovered something unexpected at least 😄

@Kimundi
Copy link
Member Author

Kimundi commented Dec 23, 2023

So I wonder if that's what we should be tracking here; if reachability determines something is reachable but only at compile time.

That seems like what this ought to boil down to, yeah

@saethlin saethlin changed the title Regression: const functions in const items do not seem to be evaluated at compiletime Runtime code is generated for functions only used in const eval Dec 26, 2023
@RalfJung
Copy link
Member

So I wonder if that's what we should be tracking here; if reachability determines something is reachable but only at compile time.

Interesting, I guess currently this is entirely per-item and does not consider "how the item is used"? Since we distinguish compile-time-MIR and runtime MIR, it would indeed make sense to treat those two rather separately for reachability.

Cc @oli-obk

@saethlin saethlin removed their assignment Jan 11, 2024
@saethlin
Copy link
Member

I started working on this and made some progress, but then was immediately tripped up by the difference between a const which captures a function pointer and a const which calls a function. That's probably resolvable, but I have enough other things on my plate right now that I'm unassigning myself to encourage anyone else interested to work on this.

@RalfJung
Copy link
Member

Having recently looked into the collector a bit, I have no idea how this is happening. We're not recursing into the MIR body of a const, so I can't see how the body of T can affect which things get collected.

Maybe some other pass is adding junk to oof's required_const?

@RalfJung
Copy link
Member

RalfJung commented Mar 11, 2024

There are these calls in reachable to visit_nested_body which are suspicious but I don't know if that affects mono item collection at all:

// Reachable constants will be inlined into other crates
// unconditionally, so we need to make sure that their
// contents are also reachable.
hir::ItemKind::Const(_, _, init) | hir::ItemKind::Static(_, _, init) => {
self.visit_nested_body(init);
}

Does anyone know anyone who understands our mono item collector?^^

@saethlin
Copy link
Member

Per my comment above, I believe this is required to handle consts that capture function pointers. Those function pointers will be used at runtime, so they're runtime code reachable through a const. That pattern is used by the standard library's thread-local system, which is a common source of linker errors when I work on MirUsedCollector.

@RalfJung
Copy link
Member

RalfJung commented Mar 12, 2024

That doesn't sound right. Function pointers are handled by walking the final value of the const, and if we see a function pointer in there we add it to the monomorphization list. (Pointers are special values -- they have provenance -- so we can detect them in the final value of a const pretty easily; this does not require a type-driven traversal.) There's the collect_alloc function for that. So this does not require looking at the MIR of the const either.

@oli-obk

This comment was marked as resolved.

@RalfJung

This comment was marked as resolved.

@oli-obk

This comment was marked as resolved.

@RalfJung

This comment was marked as resolved.

@oli-obk

This comment was marked as resolved.

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 12, 2024
Stop walking the bodies of statics for reachability, and evaluate them instead

cc `@saethlin` `@RalfJung`

cc rust-lang#119214

This reuses the `DefIdVisitor` from `rustc_privacy`, because they basically try to do the same thing. It can probably be extended to constants, too, but let's tackle that separately, it's likely more involved.
@RalfJung
Copy link
Member

RalfJung commented Mar 12, 2024 via email

@RalfJung
Copy link
Member

The reproducer is this:

FWIW this doesn't reproduce on latest nightly any more, the #[inline] has to be added again (making current nightly behave like old versions): https://godbolt.org/z/4vxsT4cT3

@RalfJung
Copy link
Member

RalfJung commented Mar 13, 2024

Turns out reachability is involved, I think. Just very indirectly.

The key difference in the log of the good behavior (without #[inline]) vs the bad behavior (with #[inline]) is

│ ├─┐rustc_monomorphize::collector::push_if_root def_id=DefId(0:6 ~ collect[a406]::{impl#0}::new)
│ │ ├─  0ms DEBUG rustc_monomorphize::collector found root
│ │ ├─┐rustc_monomorphize::collector::create_fn_mono_item instance=Instance { def: Item(DefId(0:6 ~ collect[a406]::{impl#0}::new)), args: [] }, source=no-location (#0)
│ │ │ ├─  0ms DEBUG rustc_monomorphize::collector return=Spanned { node: Fn(Instance { def: Item(DefId(0:6 ~ collect[a406]::{impl#0}::new)), args: [] }), span: no-location (#0) }
│ │ ├─┘
│ ├─┘

IOW, this function suddenly considers Thing::new to be a root:

fn is_root(&self, def_id: LocalDefId) -> bool {
!self.tcx.generics_of(def_id).requires_monomorphization(self.tcx)
&& match self.mode {
MonoItemCollectionMode::Eager => true,
MonoItemCollectionMode::Lazy => {
self.entry_fn.and_then(|(id, _)| id.as_local()) == Some(def_id)
|| self.tcx.is_reachable_non_generic(def_id)
|| self
.tcx
.codegen_fn_attrs(def_id)
.flags
.contains(CodegenFnAttrFlags::RUSTC_STD_INTERNAL_SYMBOL)
}
}
}

So is_reachable_non_generic switches from false to true. That boils down to a membership test on the set computed here, which is created by starting with the reachable_set computed here and then removing some stuff. reachable_set in turn recurses into inline functions here and into the MIR body of constants here.

So... I guess the question is, is reachable_set supposed to return only things that are runtime-reachable (in which case it is currently returning too much), or is it supposed to return everything that reachable in any fashion (in which case the collector using this as its criterion is wrong since there we only want to collect runtime-relevant things).

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 16, 2024
Stop walking the bodies of statics for reachability, and evaluate them instead

cc `@saethlin` `@RalfJung`

cc rust-lang#119214

This reuses the `DefIdVisitor` from `rustc_privacy`, because they basically try to do the same thing.

This PR's changes can probably be extended to constants, too, but let's tackle that separately, it's likely more involved.
@RalfJung
Copy link
Member

RalfJung commented May 8, 2024

So from one perspective, this is sensible. But from the OP's, this is obviously silly. All this extra panic code was made reachable from compile time evaluation, but we stamped out runtime code for it. So I wonder if that's what we should be tracking here; if reachability determines something is reachable but only at compile time.

So, looking at the comments I wrote in #122769...

I think what happens is that since T is reachable (which is correct), it may be the case that T returns a function pointer to any function it mentions, recursively. So just in case T returns a function pointer to Thing::new, we mark Thing::new reachable.

Now, we could probably skip this for top-level const items, since we know we can always evaluate them to a value, and then we can just check the final value for any function pointers and only codegen those. (And the collector already has logic to find these function pointers.)

But for associated consts, this is a lot harder. I think we'd have to keep track (inside reachable computation) of whether the current item is const-reachable or full-reachable, and then if we are const-reachable that does not in general get added to the regular reachable set, until we find a function item mentioned in a way that it does not immediately get called -- those are then added as regular-reachable again. Then there are still examples where we'd generate runtime code for no reason but it'd be a lot less common. Fundamentally we can't fully avoid false positives here as we can't be sure which functions may be referenced via function pointers that actually leak into the final value of the const (and thus potentially into runtime code)

@saethlin
Copy link
Member

saethlin commented May 8, 2024

we'd have to keep track (inside reachable computation) of whether the current item is const-reachable or full-reachable, and then if we are const-reachable that does not in general get added to the regular reachable set, until we find a function item mentioned in a way that it does not immediately get called -- those are then added as regular-reachable again.

Ah! I think this might explain why my attempt to implement const-reachable didn't work.

@RalfJung
Copy link
Member

RalfJung commented May 8, 2024 via email

@RalfJung
Copy link
Member

RalfJung commented May 8, 2024

When a runtime function calls a private const fn, we don't have to consider that function to be const-reachable. So the naive approach outlined above could overestimate how much is const-reachable. The two reachable sets do interact though... I think it's a fixed point of a computation that goes like

  • All public functions and statics are runtime-reachable.
  • All public const fn and const are const-reachable.
  • If a runtime-reachable function that needs cross-crate MIR mentions some function or static, that other item is also runtime-reachable.
  • If a runtime-reachable function that needs cross-crate MIR mentions a const, that item is const-reachable.
  • If a const-reachable item mentions an item, that item is const-reachable.
  • If a const-reachable item mentions a function/static in a way that the function item itself (or a function pointer computed from it) could escape the current item, then that function/static is runtime-reachable.

@saethlin
Copy link
Member

saethlin commented May 9, 2024

Do you still have that branch somewhere?

Yes, though I expect there isn't anything in it worth salvaging: master...saethlin:rust:const-reachability

@RalfJung
Copy link
Member

RalfJung commented May 9, 2024

Okay so that computed two sets as result from the query. I don't know if that's necessary.

Regarding the analysis outlined above, I assume it is pretty rare that a private function is const for no reason, so it's probably okay to simplify this a bit:

  • All public items are runtime-reachable.
  • If a runtime-reachable function that needs cross-crate MIR mentions some item, that other item is also runtime-reachable.
  • If a runtime-reachable item is a const or const fn, it is const-reachable.
  • If a const-reachable item mentions an item, that item is const-reachable.
  • If a const-reachable item mentions a function/static in a way that the function item itself (or a function pointer computed from it) could escape the current item, then that function/static is runtime-reachable.

jieyouxu added a commit to jieyouxu/rust that referenced this issue May 11, 2024
reachable computation: extend explanation of what this does, and why

Follow-up to rust-lang#122769. I had the time to think about this some more, in particular in the context of rust-lang#119214, so I felt it was worth extending these comments some more.

I also gave up on the context of "externally reachable" as it is not called that way anywhere else in the compiler.

Cc `@tmiasko` `@saethlin`
rust-timer added a commit to rust-lang-ci/rust that referenced this issue May 11, 2024
Rollup merge of rust-lang#124904 - RalfJung:reachable, r=tmiasko

reachable computation: extend explanation of what this does, and why

Follow-up to rust-lang#122769. I had the time to think about this some more, in particular in the context of rust-lang#119214, so I felt it was worth extending these comments some more.

I also gave up on the context of "externally reachable" as it is not called that way anywhere else in the compiler.

Cc `@tmiasko` `@saethlin`
flip1995 pushed a commit to flip1995/rust-clippy that referenced this issue May 24, 2024
reachable computation: extend explanation of what this does, and why

Follow-up to rust-lang/rust#122769. I had the time to think about this some more, in particular in the context of rust-lang/rust#119214, so I felt it was worth extending these comments some more.

I also gave up on the context of "externally reachable" as it is not called that way anywhere else in the compiler.

Cc `@tmiasko` `@saethlin`
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 6, 2024
Don't walk the bodies of free constants for reachability.

follow-up to rust-lang#122371

cc rust-lang#119214

This avoids codegening items (e.g. functions) that are only used during const eval, but do not reach their final constant value (e.g. via function pointers).

r? `@tmiasko`
@RalfJung
Copy link
Member

RalfJung commented Jun 11, 2024

On nightly, godbolt doesn't show any assembly any more with the original reproducer, thanks to #122505.

However, this variant still generates assembly:

pub struct Thing;

impl Thing {
    const fn new(panic: bool) -> Self {
        if panic {
            panic!()
        }
        Thing
    }

    const C: Thing = Thing::new(false);
}

#[inline]
pub fn oof() -> Thing {
    Thing::C
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-discussion Category: Discussion or questions that doesn't represent real issues. I-heavy Issue: Problems and improvements with respect to binary size of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants