Increase the precision of the exportable MIR check #112511

saethlin · 2023-06-10T23:45:05Z

Before this PR, MIR inlining only inlines across crates or within a crate if the callee is a constructor, #[inline], or generic. This misses some inlining opportunities within a crate; trivial helper functions are not always given #[inline] (and maybe that's a good thing).

The reason we were so conservative before is that the MIR inliner needs to not drag MIR which mentions items that we did not generate MIR for, or items that we have given internal linkage, into bodies that we must generate MIR for. If we get this heuristic wrong, we see ICEs (for promoted consts without MIR) or linker errors (for private functions and statics now mentioned in exported MIR). So this PR adds more fine-grained analysis to detect callee bodies that contain MIR we must not export.

Perhaps even better, this PR removes any prohibition on inlining not-exported bodies into other not-exported bodies.

saethlin · 2023-06-10T23:45:22Z

@bors try @rust-timer queue

bors · 2023-06-10T23:45:31Z

⌛ Trying commit 06f0db5a78133b5791bff0a1aaf4d40adedf3577 with merge 7b33b2e034f6752ed454835d2b617f567fda33b6...

bors · 2023-06-11T00:02:59Z

💔 Test failed - checks-actions

saethlin · 2023-06-11T03:32:37Z

@bors try @rust-timer queue

bors · 2023-06-11T03:32:45Z

⌛ Trying commit 0314140fbbb7d2ca96f10b98e46fe31a4b90c87d with merge 74dba061b486ec4270e8845fbaa787be765c8c7b...

bors · 2023-06-11T04:46:33Z

💔 Test failed - checks-actions

saethlin · 2023-06-11T17:25:10Z

@bors try @rust-timer queue

bors · 2023-06-11T17:25:19Z

⌛ Trying commit 622821a with merge a279a33...

…=<try> Increase the precision of the exportable MIR check r? `@ghost`

bors · 2023-06-11T18:42:36Z

☀️ Try build successful - checks-actions
Build commit: a279a33 (a279a3308232c507303c35e9c8dae736c588c8c0)

saethlin · 2023-06-20T22:04:39Z

I think all the regressions are because this PR enables MIR inlining in all debug builds. All the regressions already produce the same error or ICE with cargo +nightly test --release.

Which now come to think of it is what I should have asked crater to do, but I was afraid of the run taking too long. That train has sailed.

saethlin · 2023-06-21T21:45:02Z

@bors try @rust-timer queue

bors · 2023-06-21T21:45:11Z

⌛ Trying commit 433b1bb with merge aca4559513bf32802e55bdc4f193dff6cdb1e19b...

bors · 2023-06-21T23:19:48Z

☀️ Try build successful - checks-actions
Build commit: aca4559513bf32802e55bdc4f193dff6cdb1e19b (aca4559513bf32802e55bdc4f193dff6cdb1e19b)

bors · 2023-06-21T23:19:48Z

☀️ Try build successful - checks-actions
Build commit: aca4559513bf32802e55bdc4f193dff6cdb1e19b (aca4559513bf32802e55bdc4f193dff6cdb1e19b)

rust-timer · 2023-06-22T01:26:24Z

Finished benchmarking commit (aca4559513bf32802e55bdc4f193dff6cdb1e19b): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.2%	[0.6%, 1.6%]	3
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-1.7%, -0.2%]	29
Improvements ✅ (secondary)	-0.5%	[-0.9%, -0.2%]	34
All ❌✅ (primary)	-0.4%	[-1.7%, 1.6%]	32

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.6%	[2.6%, 2.6%]	1
Regressions ❌ (secondary)	0.7%	[0.7%, 0.7%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.6%	[2.6%, 2.6%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.5%	[1.5%, 1.5%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.4%	[-1.5%, -1.3%]	2
Improvements ✅ (secondary)	-3.9%	[-3.9%, -3.9%]	1
All ❌✅ (primary)	-0.4%	[-1.5%, 1.5%]	3

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 1.0%]	33
Regressions ❌ (secondary)	0.3%	[0.0%, 0.6%]	11
Improvements ✅ (primary)	-0.2%	[-0.8%, -0.0%]	24
Improvements ✅ (secondary)	-0.2%	[-0.7%, -0.0%]	4
All ❌✅ (primary)	0.0%	[-0.8%, 1.0%]	57

Bootstrap: 659.628s -> 665.008s (0.82%)

rustbot · 2023-06-22T02:16:53Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

cjgillot · 2023-06-22T15:18:14Z

IIUC, the goal of this PR is to encore promoted_mir, in case we inline a body which references a promoted constant. Instead of relying on a global inlined_internal_defs, can this be performed by walking required_consts in the metadata encoder?

tmiasko · 2023-06-22T17:06:36Z

Can you describe why would it ever be necessary to encode additional promoteds and give an example? The tricky_mir.rs looks like a test case for this situation, but no inlining happens in get_promoted_const. It also seems suspect since the code that identifies references to internal items doesn't special case promoteds. If references to promoteds slip through the check, then the implementation must be incorrect.

Also, please take a look at MirUsedCollector. For example, the implementation here is missing references introduced through ThreadLocalRef and SymStatic. Handling of constants also seems incomplete in comparison to what happens in collect_const_value and collect_miri. If the implementation here can be simpler for some reason, then it deserves an explanatory comment why.

bors · 2023-06-28T01:03:58Z

☔ The latest upstream changes (presumably #113105) made this pull request unmergeable. Please resolve the merge conflicts.

saethlin · 2023-07-05T01:20:03Z

@tmiasko I think the problem you're trying to point me at is with the implementation of reachable_set here: https://github.com/rust-lang/rust/blob/f20afcc455bbcc5c0f7679450fb35fd0c9668880/compiler/rustc_passes/src/reachable.rs

Because MirUsedCollector calls is_reachable_non_generic:

rust/compiler/rustc_monomorphize/src/collector.rs

Line 949 in f20afcc

if tcx.is_reachable_non_generic(def_id)

which is in turn based on reachable_set:

rust/compiler/rustc_codegen_ssa/src/back/symbol_export.rs

Lines 60 to 61 in f20afcc

    
           let mut reachable_non_generics: DefIdMap<_> = tcx 
        
               .reachable_set(())

So I'm trying to reimplement reachable_set based on optimized MIR, and while I'm at it, I'm trying to remove this thing that looks to me like a hack (based on the fact that it has a FIXME) but might also not be (based on the fact that as far as I can tell, the suggested fix doesn't work):

rust/compiler/rustc_passes/src/reachable.rs

Lines 396 to 409 in f20afcc

    
           { 
        
               // Some methods from non-exported (completely private) trait impls still have to be 
        
               // reachable if they are called from inlinable code. Generally, it's not known until 
        
               // monomorphization if a specific trait impl item can be reachable or not. So, we 
        
               // conservatively mark all of them as reachable. 
        
               // FIXME: One possible strategy for pruning the reachable set is to avoid marking impl 
        
               // items of non-exported traits (or maybe all local traits?) unless their respective 
        
               // trait items are used from inlinable code through method call syntax or UFCS, or their 
        
               // trait is a lang item. 
        
               let crate_items = tcx.hir_crate_items(()); 
        
               for id in crate_items.items() { 
        
                   check_item(tcx, id, &mut reachable_context.worklist, effective_visibilities); 
        
               }

I'm not asking for help, just offering up my state so people know this is still in-progress, and in case something I'm saying seems quite wrong. You and others here know much more about the compiler than I do.

tmiasko · 2023-07-05T09:43:28Z

The approach suggested in the fixme regarding reachability for trait impl methods seems plausible in principle. Vtable methods are also reachable from a code that constructs vtable even if respective methods are never called. Hard to say whether this list is exhaustive.

JohnCSimon · 2023-10-01T03:05:08Z

@saethlin

ping from triage - can you post your status on this PR? There hasn't been an update in a few months. Thanks!

saethlin · 2023-10-01T03:26:34Z

Yes.

I'm closing this PR because the work that's described in these PR comments is actually in a different branch of mine. If I make progress on this whole concept, it will be by landing that branch first, then re-pursuing this idea.

saethlin added A-mir-opt Area: MIR optimizations A-mir-opt-inlining Area: MIR inlining labels Jun 10, 2023

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 10, 2023