Resurrect #70477: "Use the niche optimization if other variant are small enough" #75866

erikdesjardins · 2020-08-24T03:13:11Z

erikdesjardins · 2020-08-24T03:48:16Z

This needs a perf run, since results in the original PR were negative.

src/test/ui/print_type_sizes/padding.stdout

ogoffart · 2020-08-24T16:36:34Z

Last time I looked at this issue, the benchmarks showed performance regressions.
As explained here: #70477 (comment), my hypothesis is that this change caused a runtime regression because the MIR generated for a match on an enum with a niche was not as optimal as it could be.
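
For readers unfamiliar with the term: a niche is a bit pattern that a type can never hold, which the compiler may reuse to encode other enum variants without a separate tag. The following is only a minimal sketch, not code from this PR. It uses standard library types whose `Option` layout is documented, and the helper name `sizes` is made up for illustration.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical helper: returns (size of T, size of Option<T>) in bytes.
fn sizes<T>() -> (usize, usize) {
    (size_of::<T>(), size_of::<Option<T>>())
}

fn main() {
    // `NonZeroU32` never holds 0, so `None` can be stored as the bit pattern 0.
    assert_eq!(sizes::<NonZeroU32>(), (4, 4));

    // References are never null, so `None` can be stored as the null pointer.
    let (reference, opt_reference) = sizes::<&u8>();
    assert_eq!(reference, opt_reference);

    // `u32` uses every bit pattern, so `Option<u32>` needs room for a tag.
    let (plain, opt_plain) = sizes::<u32>();
    assert!(opt_plain > plain);

    println!("niche layouts behave as expected");
}
```

The size win is free for types like these; the concern in this thread is the cost of reading the variant back out when matching.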

bors · 2020-08-30T18:02:51Z

☔ The latest upstream changes (presumably #74862) made this pull request unmergeable. Please resolve the merge conflicts.

Dylan-DPC-zz · 2020-09-25T19:48:57Z

let's do a perf run first:

@bors try @rust-timer queue

rust-timer · 2020-09-25T19:48:58Z

Awaiting bors try build completion

bors · 2020-09-25T19:49:11Z

⌛ Trying commit ef282bb1cd28a6a4ce0b8dda66dc4a3250930a77 with merge d85bea00451112d262ee8d42afe2819e1153be8e...

bors · 2020-09-25T20:34:03Z

☀️ Try build successful - checks-actions, checks-azure
Build commit: d85bea00451112d262ee8d42afe2819e1153be8e (d85bea00451112d262ee8d42afe2819e1153be8e)

rust-timer · 2020-09-25T20:34:05Z

Queued d85bea00451112d262ee8d42afe2819e1153be8e with parent 10ef7f9, future comparison URL.

rust-timer · 2020-09-25T23:45:10Z

Finished benchmarking try commit (d85bea00451112d262ee8d42afe2819e1153be8e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

erikdesjardins · 2020-09-27T00:45:57Z

Perf is about the same as the original PR, regressions up to 4-5% on a lot of benchmarks.

erikdesjardins · 2020-09-27T00:47:20Z

Pushed another commit that disables the optimization and only reorders fields (tests will fail, but the build should pass). Let's do another perf run to see if the reordering is the problem.

Dylan-DPC-zz · 2020-10-01T00:20:50Z

@bors try @rust-timer queue

rust-timer · 2020-10-01T00:20:51Z

Awaiting bors try build completion

bors · 2020-10-01T00:21:08Z

⌛ Trying commit d5ef4b6 with merge 4cdfe129f8f8acf6532f8252997ece91030dd135...

bors · 2020-10-01T01:04:07Z

☀️ Try build successful - checks-actions, checks-azure
Build commit: 4cdfe129f8f8acf6532f8252997ece91030dd135 (4cdfe129f8f8acf6532f8252997ece91030dd135)

rust-timer · 2020-10-01T01:04:09Z

Queued 4cdfe129f8f8acf6532f8252997ece91030dd135 with parent ef663a8, future comparison URL.

rust-timer · 2020-10-01T06:04:22Z

Finished benchmarking try commit (4cdfe129f8f8acf6532f8252997ece91030dd135): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

ogoffart · 2020-10-02T21:33:22Z

The results seem to indicate that the problem is not caused by the reordering of the fields and the resulting change in padding.
This seems to confirm what I wrote in #70477 (comment): the problem is that a match is slower when the niche optimization is used. It could be slower for LLVM to compile, or the resulting code might be slower too, since there are more branches.

Either way, we could verify this hypothesis by disabling the niche optimization completely (or only for types bigger than 8 bytes or so) and seeing whether that improves compile-time performance.

I guess the fix would be to improve the code generation for the match.
I foresee quite a big improvement (even without the niche optimization) from doing what I tried to explain in #70477 (comment).

In the perform_test function, we should not create a TerminatorKind::SwitchInt over Rvalue::Discriminant, because getting the discriminant requires a few branches and subtractions before the next switch.

rust/compiler/rustc_mir_build/src/build/matches/test.rs

Lines 202 to 213 in be38081

    
           self.cfg.push_assign(block, source_info, discr, Rvalue::Discriminant(place)); 
        
           assert_eq!(values.len() + 1, targets.len()); 
        
           self.cfg.terminate( 
        
               block, 
        
               source_info, 
        
               TerminatorKind::SwitchInt { 
        
                   discr: Operand::Move(discr), 
        
                   switch_ty: discr_ty, 
        
                   values: From::from(values), 
        
                   targets, 
        
               }, 
        
           );

Instead, there should probably be a new TerminatorKind::Switch that maps to TestKind::Switch and keeps the information about the variants, so that codegen has more information to produce optimal code.

Currently, codegen generates something that looks like this (pseudocode):

// This condition produces quite a few instructions that LLVM then needs to try to optimize.
let discriminant = if (niche in [begin...end]) { niche - offset } else { data_variant };
match (discriminant) {
   data_variant => ...,
   other_variant1 => ...,
   other_variant2 => ...,
}

Instead, codegen should do this:

match (niche) {
   other_variant1+offset => ...,
   other_variant2+offset => ...,
   _ => ...,   // (for the data_variant)
   // begin..end  =>  ...    // alternative if the switch is not exhaustive
}
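
To make the two shapes above concrete, here is a small runnable model. It is only a sketch under assumptions, not compiler code: it assumes a one-byte layout for a hypothetical `enum E { Data(bool), A, B }` in which 0 and 1 are the `bool` payload and the niche values 2 and 3 encode `A` and `B`. The names `Arm`, `decode_then_match`, and `match_on_niche` are invented for illustration.

```rust
// Assumed layout: raw byte 0/1 => Data(false/true), 2 => A, 3 => B.
const NICHE_START: u8 = 2; // first niche value (assumption)
const NICHE_END: u8 = 3; // last niche value (assumption)
const DATA_VARIANT: u8 = 0; // variant index of `Data`
const FIRST_NICHE_VARIANT: u8 = 1; // variant index of `A`

#[derive(Debug, PartialEq, Clone, Copy)]
enum Arm {
    Data,
    A,
    B,
}

/// Current shape: recover the variant index first, then switch on it.
fn decode_then_match(raw: u8) -> Arm {
    // Range check plus subtraction before the switch even starts.
    let discriminant = if (NICHE_START..=NICHE_END).contains(&raw) {
        raw - NICHE_START + FIRST_NICHE_VARIANT
    } else {
        DATA_VARIANT
    };
    match discriminant {
        0 => Arm::Data,
        1 => Arm::A,
        2 => Arm::B,
        _ => unreachable!("only three variants exist"),
    }
}

/// Proposed shape: switch on the raw niche value directly.
fn match_on_niche(raw: u8) -> Arm {
    match raw {
        2 => Arm::A,   // variant index 1, encoded as NICHE_START
        3 => Arm::B,   // variant index 2, encoded as NICHE_START + 1
        _ => Arm::Data, // every other value belongs to the data variant
    }
}

fn main() {
    // Both strategies must agree on every valid raw value.
    for raw in 0u8..=3 {
        assert_eq!(decode_then_match(raw), match_on_niche(raw));
    }
    println!("both strategies agree");
}
```

The second function does a single switch with no preceding range check or subtraction, which is the shape the proposal would like codegen to emit directly. Whether LLVM already folds the first form into the second for a given enum is exactly what the perf runs are probing.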

ogoffart · 2020-10-13T13:09:35Z

#77816 ran a performance test with the niche optimization disabled, showing small performance improvements. So I would conclude that the problem is indeed that the code generated for niche enums is not optimal, and we should improve it before enabling the niche optimization in more cases.

Dylan-DPC-zz · 2020-12-08T10:31:32Z

@erikdesjardins any updates? (on the CI failure)

erikdesjardins · 2020-12-10T02:44:25Z

This can't be merged in its current state anyways, it's too much of a perf regression. It's likely that a codegen change (as described in #75866 (comment)) will be necessary in order to land this. I won't have time to look into that personally in the near future, so I'll close this for now.

rust-highfive assigned eddyb Aug 24, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 24, 2020

pickfire reviewed Aug 24, 2020

src/test/ui/print_type_sizes/padding.stdout

erikdesjardins force-pushed the niche branch from 708e544 to ef282bb Compare August 30, 2020 18:59

jyn514 added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-layout Area: Memory layout of types labels Sep 8, 2020

crlf0710 added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 25, 2020

ogoffart and others added 2 commits September 26, 2020 20:02

Use the niche optimisation if other enum variants are small enough

ad758f8

* Put the largest niche first * Add test from issue 63866 * Add test for enum of sized/unsized * Prefer fields that are already earlier in the struct

TEMP FOR PERF: remove the optimization, leaving only the field reordering

d5ef4b6

erikdesjardins force-pushed the niche branch from ef282bb to ad758f8 Compare September 27, 2020 00:45

ogoffart mentioned this pull request Oct 11, 2020

[DO NOT MERGE] Experiment: disable the niche optimization for most enums #77816

Closed

camelid added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 30, 2020

camelid added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 20, 2020

Dylan-DPC-zz added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 8, 2020

erikdesjardins closed this Dec 10, 2020

ogoffart mentioned this pull request Dec 13, 2021

RFC: Alignment niches for references types. rust-lang/rfcs#3204

Open

erikdesjardins mentioned this pull request Feb 12, 2022

No niche optimization for enum {One(Enum), Two(Enum, Enum)} #93739

Closed

ogoffart mentioned this pull request Feb 18, 2022

Use niche-filling optimization even when multiple variants have data. #94075

Merged

erikdesjardins deleted the niche branch September 9, 2022 18:00
