Example iterator fold vs. loop emit different code #99656

ghost · 2022-07-24T04:08:14Z

This naive factorial (n!) implementation compiles to different code depending on whether it is written as a fold over a range or as a for loop over the same range. Notably, the iterator fold emits many more instructions than the basic loop. This is unexpected, since iterators are supposed to be a zero-cost abstraction.

https://rust.godbolt.org/z/3cMMh67rE

type T = usize;

pub fn factorial1(n: u8) -> T {
    let mut y = 1;
    for k in 1..=n {
        y *= T::from(k);
    }
    y
}

pub fn factorial2(n: u8) -> T {
    (1..=n)
        .map(T::from)
        .fold(1, std::ops::Mul::mul)
}
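As a sanity check that the two versions differ only in generated code and not in results, here is a minimal sketch that compares them over a small input range. The helper `agree_up_to` is a hypothetical name introduced for illustration, and the range is kept small on the assumption that larger inputs would overflow `usize`.

```rust
type T = usize;

pub fn factorial1(n: u8) -> T {
    let mut y = 1;
    for k in 1..=n {
        y *= T::from(k);
    }
    y
}

pub fn factorial2(n: u8) -> T {
    (1..=n)
        .map(T::from)
        .fold(1, std::ops::Mul::mul)
}

/// Hypothetical helper: returns true if both implementations produce the
/// same value for every n in 0..=max. Keep `max` small (12! still fits in
/// a 32-bit usize) to stay clear of multiplication overflow.
pub fn agree_up_to(max: u8) -> bool {
    (0..=max).all(|n| factorial1(n) == factorial2(n))
}
```

Any difference seen on Compiler Explorer is therefore about how the optimizer treats the two forms, not about what they compute.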

@rustbot label I-heavy I-slow


mqudsi · 2022-07-25T03:45:28Z

On your wasm target, this was nicely optimized in 1.33.0; 1.34.0 introduced a change, and 1.56.0 made it much worse.

On x86, 1.34 actually improved the results considerably over 1.33.0. 1.42.0 has nice and compact asm but 1.43.0 messed that up.

We're probably dealing with multiple issues. I'd be surprised if at least one of them hasn't already been reported, but I'm not familiar enough with the backlog to be able to tell one way or the other.

@rustbot label +regression-from-stable-to-stable +A-codegen +T-compiler

apiraino · 2022-07-27T13:33:12Z

This probably also needs some bisection to start narrowing down where these codegen regressions come from. I'll flag that by adding:

@rustbot label E-needs-bisection

paolobarbolini · 2022-07-27T17:18:43Z

> Notably, the iterator fold emits many more instructions than the basic loop

Could this simply be a case of factorial2 getting unrolled, and even vectorized if the configuration allows for it, while factorial1 isn't?

See https://rust.godbolt.org/z/vE51sGWY5

ghost · 2022-07-27T22:07:31Z

That sure does explain what is occurring! Yet, not why?

(i.e. Wouldn't we want them both vectorised for opt-level=3?)

paolobarbolini · 2022-07-27T22:13:49Z

If I use a while loop it works, so I guess something is preventing LLVM from optimizing the for loop
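For readers who cannot open the link, a while-loop variant might look roughly like the sketch below. This is an assumption about the shape of such a rewrite, not a copy of the linked code; the name `factorial_while` is hypothetical, and whether it optimizes as described depends on the compiler version and flags.

```rust
type T = usize;

/// Hypothetical while-loop variant of the factorial from the issue.
/// The counter is widened to usize so that `k += 1` cannot overflow
/// even when n == 255.
pub fn factorial_while(n: u8) -> T {
    let end = T::from(n);
    let mut y: T = 1;
    let mut k: T = 1;
    while k <= end {
        // Like the original, this multiplication overflows for large n.
        y *= k;
        k += 1;
    }
    y
}
```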

https://rust.godbolt.org/z/sha96Ga3a

apiraino · 2022-10-05T09:42:58Z

I'll nominate this for the T-compiler meeting to get an opinion on how to proceed (e.g. to understand how impactful this could be given its unclear origin, and possibly to find an owner if it turns out to be worth investigating soon).

@rustbot label I-compiler-nominated

the8472 · 2022-10-06T15:01:29Z

> so I guess something is preventing LLVM from optimizing the for loop

Inclusive ranges optimize poorly. Switch to an exclusive range and the for-loop does the same unrolling as the iterator.
So this is likely a duplicate of #45222.
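A minimal sketch of the exclusive-range rewrite suggested here, assuming the bound is widened to `usize` before adding one so that `n == 255` does not overflow `u8`. The name `factorial_exclusive` is hypothetical, and the claim about matching unrolling is the commenter's; it is worth confirming on Compiler Explorer for the toolchain in use.

```rust
type T = usize;

/// Hypothetical exclusive-range variant: iterates over 1..n+1 instead of
/// 1..=n. The range is built in usize, so the `+ 1` cannot overflow for
/// any u8 input.
pub fn factorial_exclusive(n: u8) -> T {
    let mut y: T = 1;
    for k in 1..T::from(n) + 1 {
        // Like the original, this multiplication overflows for large n.
        y *= k;
    }
    y
}
```

The loop visits the same values of `k` as the inclusive version, so the results are the same.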

apiraino · 2022-10-06T16:09:32Z

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium

apiraino · 2022-10-13T09:18:01Z

Discussed in T-compiler meeting (notes)

@rustbot label -p-high +p-medium

rustbot added I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Jul 24, 2022

rustbot added the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Jul 27, 2022

rustbot added the I-compiler-nominated Nominated for discussion during a compiler team meeting. label Oct 5, 2022

rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Oct 6, 2022

rustbot added the P-high High priority label Oct 13, 2022

apiraino removed the P-medium Medium priority label Oct 13, 2022

rustbot added the P-medium Medium priority label Oct 13, 2022

apiraino removed P-high High priority I-compiler-nominated Nominated for discussion during a compiler team meeting. labels Oct 13, 2022
