Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-C inline-threshold has no effect with new LLVM pass manager #89742

Closed
MSxDOS opened this issue Oct 10, 2021 · 13 comments · Fixed by #124712
Closed

-C inline-threshold has no effect with new LLVM pass manager #89742

MSxDOS opened this issue Oct 10, 2021 · 13 comments · Fixed by #124712
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-help-wanted Call for participation: Help is requested to fix this issue. P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@MSxDOS
Copy link

MSxDOS commented Oct 10, 2021

The compiler produces identical binaries whatever value is set.

In this example:

const LEN: usize = 20;

pub fn test(vals1: &[usize; LEN], vals2: &[usize; LEN]) -> bool {
    test_inner(vals1) || test_inner(vals2)
}

fn test_inner(vals: &[usize; LEN]) -> bool {
    vals.iter().any(|v| v % 2 == 0)
}

test_inner is not inlined when compiled with -C opt-level=3 -C inline-threshold=9000 unless the new pass manager is manually disabled via -Z new-llvm-pass-manager=no.

https://rust.godbolt.org/z/TKGoz3h6n

Related: #61088

@MSxDOS MSxDOS added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Oct 10, 2021
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. and removed regression-untriaged Untriaged performance or correctness regression. labels Oct 10, 2021
@MSxDOS
Copy link
Author

MSxDOS commented Oct 10, 2021

cc @nikic

@nikic
Copy link
Contributor

nikic commented Oct 10, 2021

Does this flag have an any actual usage?

I don't think NewPM supports it directly, but we could map it to -Cllvm-args=-inline-threshold=.

@MSxDOS
Copy link
Author

MSxDOS commented Oct 10, 2021

Not sure what you mean. This is a stable compiler flag and it's documented, so it should either work as intended or be removed.

-Cllvm-args=-inline-threshold is not the same though. They produce different binaries with same values passed to them, e.g. with old pass manager and -C opt-level=3, -C inline-threshold=275 produces the same binary as with no flag, matching the docs, but -Cllvm-args=-inline-threshold doesn't.

On a side note, I noticed the new pass manager doing weird things when it comes to function inlining in general in some occasions:

  • It doesn't inline medium sized functions that are called once.
  • It doesn't inline very small private functions from external crates if their public caller is marked #[inline] but does when it's not. The caller itself is also small and always gets inlined.
  • A medium sized function that's called in two places gets inlined in one but not in another, while the old pass manager inlines both.

This is when compiled with 1 codegen unit and any LTO. I wonder if it's related to this issue.

@nikic
Copy link
Contributor

nikic commented Oct 10, 2021

Not sure what you mean. This is a stable compiler flag and it's documented, so it should either work as intended or be removed.

Yes, my question is whether this flag should be removed (read: deprecated and ignored). This seems like rather dubious functionality to expose under a first-class -C flag, and I think we're better off making people use -Cllvm-args if they actually know what they're doing. LLVM doesn't have a single inlining threshold, it has something like 5 different ones that apply depending on the specific case, plus another 15 or so other parameters controlling inlining behavior.

-Cllvm-args=-inline-threshold is not the same though. They produce different binaries with same values passed to them, e.g. with old pass manager and -C opt-level=3, -C inline-threshold=275 produces the same binary as with no flag, matching the docs, but -Cllvm-args=-inline-threshold doesn't.

Probably the closer analogue to the old -C inline-threshold interpretation would be -Cllvm-args=-inlinedefault-threshold=, though it's arguable whether that is really the behavior users expect for this flag.

On a side note, I noticed the new pass manager doing weird things when it comes to function inlining in general in some occasions:

  • It doesn't inline medium sized functions that are called once.
  • It doesn't inline very small private functions from external crates if their public caller is marked #[inline] but does when it's not. The caller itself is also small and always gets inlined.
  • A medium sized function that's called in two places gets inlined in one but not in another, while the old pass manager inlines both.

This is when compiled with 1 codegen unit and any LTO. I wonder if it's related to this issue.

I'd suggest filing a separate issue for that, with examples. I don't think this is really related to -C inline-threshold.

@workingjubilee
Copy link
Member

I don't think a backend-invariant form of reasoning about this compiler flag is possible, least of all with the somewhat arbitrary numeric API offered here, so I suspect if asked, other BE maintainers will be in favor of deprecation, but I am not sure. To that end:
cc @bjorn3 @antoyo

@antoyo
Copy link
Contributor

antoyo commented Oct 11, 2021

I'm not sure there's an equivalent for this flag in gcc. There's -finline-limit, but it seems to do something different.
Thus, unless I'm wrong, it might make sense to deprecate this flag.

@apiraino apiraino added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Oct 14, 2021
@hudson-ayers
Copy link
Contributor

For gcc, several --param options could probably be combined to approximate this (e.g. inline-min-speedup, inline-unit-growth inline-heuristics-hint-percent, several others), but I agree there is no direct pass-through equivalent.

That said, this option going from working to doing nothing is user unfriendly -- at the very least, we should be printing a warning that users with a need to tune inlining should investigate LLVM specific flags.

My team noticed a binary size increase of > 3% just from updating Rust, which was caused because this flag silently stopped doing anything. At first, we assumed that Rust had just gotten worse, and decided not to update to a newer nightly until the regression was fixed (we work in the embedded space, where a 3% increase exhausted our available ROM). After awhile it became clear it was not getting better, so I dug deeper and found this. Using -Cllvm-args=-inline-threshold as an approximation of the old behavior got nearly all of the size back.

@insilications
Copy link

The NewPM needs more love.

@workingjubilee workingjubilee added P-low Low priority E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-help-wanted Call for participation: Help is requested to fix this issue. and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. P-low Low priority labels Jul 6, 2022
@workingjubilee
Copy link
Member

workingjubilee commented Jul 6, 2022

The cited issues regarding binary weight or speed are important, but this only ever controlled a purely advisory instruction, unfortunately, so in a sense these results are "not broken". LLVM always was deciding to multiply the advisory impact of this on its own inlining heuristics by some value, possibly 1, and now LLVM has decided to multiply this directive by 0 instead.

It also seems that the future of rustc is probably one that does not have a natural translation for -Cinline-threshold at all.

The concerns regarding binary weight and speed are important, but there's no need to error on usage of this. We could start issuing a warning, yes, that it currently doesn't seem to do anything. It's not clear what a final future direction for this flag should be, but that warning, too, would be... well, purely advisory. And is probably fairly easy to implement with a bit of mucking about in compiler/rustc_session or somewhere around there.

T-lang might want to discuss this at their next convenience.

@workingjubilee workingjubilee added I-lang-nominated Nominated for discussion during a lang team meeting. and removed regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. labels Jul 6, 2022
@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/lang meeting. We feel that anything other than inline(always) makes no guarantees at all, and any kind of hints like this command-line option are outside lang's purview; this is entirely a question of compiler optimization.

@joshtriplett joshtriplett added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed T-lang Relevant to the language team, which will review and decide on the PR/issue. I-lang-nominated Nominated for discussion during a lang team meeting. labels Jul 12, 2022
@CastilloDel
Copy link
Contributor

This is marked as help wanted, but it doesn't seem clear what the path to solve it is. Is there already a decision on whether to add a warning or deprecate the option?

@nikic
Copy link
Contributor

nikic commented Oct 3, 2022

I think those are effectively the same thing, just a question of the precise wording for the warning. Support for the legacy pass manager has been removed recently, so -C inline-threshold is now completely non-functional, under all circumstances. The message at this point could probably be something along the lines of:

warning: -C inline-threshold is no longer supported

possibly with a

use backend-specific options such as -C llvm-args=-inline-threshold=N instead

@bjorn3
Copy link
Member

bjorn3 commented Oct 3, 2022

Would it make sense to translate -Cinline-threshold= to -C llvm-args=-inline-threshold=?

@bors bors closed this as completed in faa28be Jun 24, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jun 24, 2024
Rollup merge of rust-lang#124712 - Enselic:deprecate-inline-threshold, r=pnkfelix

Deprecate no-op codegen option `-Cinline-threshold=...`

This deprecates `-Cinline-threshold` since using it has no effect. This has been the case since the new LLVM pass manager started being used, more than 2 years ago.

Recommend using `-Cllvm-args=--inline-threshold=...` instead.

Closes rust-lang#89742 which is E-help-wanted.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-help-wanted Call for participation: Help is requested to fix this issue. P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.