Reduced performance when using question mark operator instead of `try!` #37939

birkenfeld · 2016-11-22T20:49:42Z

This was reported on the users forum, and I don't want it to get lost. Basically, replacing try! with ? resulted in a ~20% performance loss in benchmarks:

ratel-rust/ratel-core#48 (comment)

I've reproduced, but not further investigated, these findings. Is that expected right now? It's not a good argument for adopting the question mark :)


arthurprs · 2016-11-24T17:32:38Z

I didn't read the implementation until now; I thought they expanded to the exact same thing as try!()

sfackler · 2016-11-24T18:37:36Z

Might just need to stick a couple of #[inline]s on Carrier methods.

StefanoD · 2016-11-26T12:39:07Z

I'm curious why this issue doesn't even get a label like I-Slow.

birkenfeld · 2016-12-03T22:14:52Z

@sfackler I tried this (#[inline] on the trait method definitions and the impl methods), but didn't seem to get any different bench numbers.

ghost · 2016-12-08T17:39:03Z

I've found out why this is slow. Suppose you have an expression res? of type Result<U, V>.
In librustc/hir/lowering.rs it will be desugared to:

match Carrier::translate(res) {
    Ok(val) => val,
    Err(err) => return Carrier::from_error(From::from(err)),
}

The real culprit is Carrier::translate. It contains a match expression itself, which doesn't get optimized away, even though it's implemented for Result<U, V> as the identity function (in src/libcore/ops.rs):

impl<U, V> Carrier for Result<U, V> {
    type Success = U;
    type Error = V;

    fn from_success(u: U) -> Result<U, V> {
        Ok(u)
    }

    fn from_error(e: V) -> Result<U, V> {
        Err(e)
    }

    fn translate<T>(self) -> T
        where T: Carrier<Success=U, Error=V>
    {
        match self {
            Ok(u) => T::from_success(u),
            Err(e) => T::from_error(e),
        }
    }
}
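
To experiment with this lowering on a current compiler, the trait can be re-created locally. The following is a minimal sketch, not the libcore source: `Carrier` is a local stand-in copied from the impl quoted above (the unstable trait no longer exists), and `via_carrier` is a hypothetical hand-written version of `Ok(res?)` for one concrete type.

```rust
// Local stand-in for the old unstable `Carrier` trait (assumption: shape
// taken from the impl quoted in this comment).
pub trait Carrier {
    type Success;
    type Error;

    fn from_success(s: Self::Success) -> Self;
    fn from_error(e: Self::Error) -> Self;
    fn translate<T>(self) -> T
    where
        T: Carrier<Success = Self::Success, Error = Self::Error>;
}

impl<U, V> Carrier for Result<U, V> {
    type Success = U;
    type Error = V;

    fn from_success(u: U) -> Result<U, V> {
        Ok(u)
    }

    fn from_error(e: V) -> Result<U, V> {
        Err(e)
    }

    // Takes the Result apart and rebuilds it in the target carrier type.
    fn translate<T>(self) -> T
    where
        T: Carrier<Success = U, Error = V>,
    {
        match self {
            Ok(u) => T::from_success(u),
            Err(e) => T::from_error(e),
        }
    }
}

// Hypothetical hand-desugared `Ok(res?)`: note the second match stacked on
// top of the one inside `translate`.
pub fn via_carrier(res: Result<i32, String>) -> Result<i32, String> {
    let val = match Carrier::translate(res) {
        Ok(val) => val,
        Err(err) => return Carrier::from_error(From::from(err)),
    };
    Ok(val)
}
```

Calling `via_carrier` with an `Ok` or an `Err` hands back an equal value, so the whole thing is semantically the identity; the open question in this issue is only whether the optimizer sees that.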

In lowering.rs we have this piece of desugaring process:

                    // Carrier::translate(<expr>)
                    let discr = {
                        // expand <expr>
                        let sub_expr = self.lower_expr(sub_expr);

                        let path = &["ops", "Carrier", "translate"];
                        let path = P(self.expr_std_path(unstable_span, path, ThinVec::new()));
                        P(self.expr_call(e.span, path, hir_vec![sub_expr]))
                    };

Suppose we remove translation and implement the piece simply like this:

                    // <expr>, with the Carrier::translate call removed
                    let discr = {
                        // expand <expr>
                        P(self.lower_expr(sub_expr))
                    };

If we do that, there will be no slowdown in ratel-core due to question mark operators (I benchmarked it).
Of course, this is not a real solution: the actual problem is that rustc and LLVM don't optimize away the redundant match expression.

I created a simple example that demonstrates the issue: https://is.gd/Q7PGzm
Compile into ASM in Release mode. Function identity compiles to:

identity:
	.cfi_startproc
	xorl	%eax, %eax
	cmpq	$0, (%rsi)
	movq	8(%rsi), %rcx
	setne	%al
	movq	%rax, (%rdi)
	movq	%rcx, 8(%rdi)
	movq	%rdi, %rax
	retq

You can see there are unnecessary cmpq and setne instructions. Ideally (in order to be fast), the function should simply return its input argument using a few move instructions, that's all.
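
To make the assembly easier to read, here is a rough Rust model of what those instructions do. It is only a sketch: `Raw` is a hypothetical struct whose layout (tag at offset 0, payload at offset 8) is read off the `(%rsi)` and `8(%rsi)` operands above, not the compiler's actual representation of the type behind the link.

```rust
// Hypothetical two-word layout inferred from the asm: discriminant, then payload.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Raw {
    pub tag: u64,
    pub payload: u64,
}

// Mirrors the emitted code: the discriminant is recomputed with a compare
// (cmpq + setne) instead of being copied, then the payload is moved across.
pub fn identity_as_compiled(input: &Raw) -> Raw {
    Raw {
        tag: (input.tag != 0) as u64,
        payload: input.payload,
    }
}

// What one would hope for: a plain copy of both words.
pub fn identity_ideal(input: &Raw) -> Raw {
    *input
}
```

For the only valid tag values, 0 and 1, both functions return the same thing, which is why the compare is redundant.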

I'm a total beginner at rustc, so no idea how to proceed further. Should we perhaps detect identity functions within a MIR optimization pass?

bluss · 2016-12-11T18:21:49Z

@arielb1 I've implemented a test version of the "variant copy prop" by hacking into CopyPropagation. However @eddyb has unfortunately 😉 said that it's much better to make a more general version of that that can propagate copies of aggregates (somehow).

I just switched a couple inner loop ?s back to try!(...) to work around rust-lang/rust#37939

@nikomatsakis

Lower `?` to `Try` instead of `Carrier`

The easy parts of rust-lang/rfcs#1859, whose FCP completed without further comments. Just the trait and the lowering -- neither the error message improvements nor the insta-stable impl for Option nor exhaustive docs.

Based on a [github search](https://github.com/search?l=rust&p=1&q=question_mark_carrier&type=Code&utf8=%E2%9C%93), this will break the following:

- https://github.com/pfpacket/rust-9p/blob/00206e34c680198a0ac7c2f066cc2954187d4fac/src/serialize.rs#L38
- https://github.com/peterdelevoryas/bufparse/blob/b1325898f4fc2c67658049196c12da82548af350/src/result.rs#L50

The other results appear to be files from libcore or its tests. I could also leave Carrier around after stage0 and `impl<T:Carrier> Try for T` if that would be better.

r? @nikomatsakis

Edit: Oh, and it might accidentally improve perf, based on rust-lang#37939 (comment), since `Try::into_result` for `Result` is an obvious no-op, unlike `Carrier::translate`.
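
The "obvious no-op" point can be sketched with a local trait. This is an illustration under assumptions rather than the code from that PR: `OldTry` is a hypothetical stand-in modeled on rust-lang/rfcs#1859, only `into_result` is named in the description above, and the other method names and the `via_try` helper are assumed here.

```rust
// Hypothetical stand-in for the first `Try` design; not today's std::ops::Try.
pub trait OldTry {
    type Ok;
    type Error;

    fn into_result(self) -> Result<Self::Ok, Self::Error>;
    fn from_error(v: Self::Error) -> Self;
    fn from_ok(v: Self::Ok) -> Self;
}

impl<T, E> OldTry for Result<T, E> {
    type Ok = T;
    type Error = E;

    // No match at all: the value is handed back untouched, unlike
    // `Carrier::translate`, which takes it apart and rebuilds it.
    fn into_result(self) -> Result<T, E> {
        self
    }

    fn from_error(v: E) -> Self {
        Err(v)
    }

    fn from_ok(v: T) -> Self {
        Ok(v)
    }
}

// Hand-written equivalent of `Ok(res?)` under this lowering: one match only.
pub fn via_try(res: Result<i32, String>) -> Result<i32, String> {
    let val = match OldTry::into_result(res) {
        Ok(val) => val,
        Err(err) => return OldTry::from_error(From::from(err)),
    };
    OldTry::from_ok(val)
}
```

With this shape the only remaining match is the one the lowering itself introduces, which is the same match `try!` expands to.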


kdy1 · 2017-09-28T13:14:49Z

I think this issue is fixed.

The generated asm is the same for try! and ?.

https://play.rust-lang.org/?gist=625d88df305ace951a088e1cee2ec13a

Edit: Typo

StefanoD · 2017-09-28T13:22:43Z

Even if this is fixed: Is there a unit test which prevents regression?

kennytm · 2017-09-28T13:48:17Z

We could have a codegen test for this.

Someone tag this as E-needstest please...

bluss · 2017-09-28T18:09:08Z

It looks like they both have the same bad code generation now, needs looking into.

bluss · 2017-09-28T19:45:05Z

Example where it is compared with the identity function. Returning Ok(x?) is the same as the identity for the result. playground link

But it seems this example is not a good condensation of the issue, because I can't find any previous Rust version (in rust.godbolt) that has the desired identity function code gen even for the try!() macro.

It doesn't show the difference between try and ?, but it shows something we can fix to improve them both.

Edited: Update to another example code link (configurable Result type).

arielb1 · 2017-09-28T20:09:42Z

@bluss

I think https://reviews.llvm.org/D37216 should fix this, but it's a little stuck in the LLVM review queue.

mati865 · 2018-02-01T09:02:56Z

@arielb1 it was reverted llvm-mirror/llvm@c87c1c0.

kornelski · 2018-04-24T19:50:44Z

Still not fixed (tested nightly on playpen)

https://play.rust-lang.org/?version=nightly&mode=release&edition=2015&gist=b3858efebcadbffcbe2155e3f2750d07

mehcode · 2018-08-31T12:58:26Z

This seems to be fixed (in stable). Tried the play link above.

; Function Attrs: noinline norecurse nounwind readnone uwtable
define { i64, i64 } @try_op(i64, i64) unnamed_addr #2 {
  %3 = tail call { i64, i64 } @try_macro(i64 %0, i64 %1) #2
  ret { i64, i64 } %3
}

try_op:
	jmp	try_macro

try_macro:
	xorl	%eax, %eax
	testq	%rdi, %rdi
	setne	%al
	movq	%rsi, %rdx
	retq

bluss · 2018-12-01T15:31:24Z

You can still provoke it to make a difference and introduce conditionals or more copies for the ? operator than for the try!() macro if you change the payload types for the Result, for example using (i32, i32) or String.

type T = (i32, i32);
type E = T;
type R = Result<T, E>;

#[no_mangle]
pub fn try_op(a: R) -> R {
    Ok(a?)
}

#[no_mangle]
pub fn try_macro(a: R) -> R {
    Ok(try!(a))
}
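
On the 2018 edition and later, `try` is a reserved word, so `try!(a)` would have to be spelled `r#try!(a)`. To compare on a newer toolchain without the deprecated macro, its expansion can be written out by hand. A minimal sketch, assuming the usual match-and-return expansion of `try!`; `try_macro_expanded` is a hypothetical name:

```rust
type T = (i32, i32);
type E = T;
type R = Result<T, E>;

// `?` version, as in the snippet above.
pub fn try_op(a: R) -> R {
    Ok(a?)
}

// What `Ok(try!(a))` amounts to once the macro is expanded by hand.
pub fn try_macro_expanded(a: R) -> R {
    let val = match a {
        Ok(val) => val,
        Err(err) => return Err(From::from(err)),
    };
    Ok(val)
}
```

Both functions return their argument unchanged, so any difference in the generated code comes purely from how the two forms are lowered and optimized.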

mati865 · 2019-04-03T12:19:15Z

Looks like it regressed: https://godbolt.org/z/awKD_U

mati865 · 2022-08-17T14:57:54Z

We are fast again (at least for the examples given above) since Rust 1.52, which contains the LLVM 12 upgrade!

Once somebody adds a test (or confirms that a similar test already exists), this issue could be closed.

Kobzol · 2022-08-17T17:23:02Z

There's this codegen test, but I'm not sure if it's enough to check for this.

StefanoD · 2022-08-17T18:09:46Z

According to this blog post, Rust still has performance problems with the question mark operator to this date (Rust 1.62.1), resulting in a 4% performance loss.

scottmcm · 2022-08-17T18:15:08Z

> but I'm not sure if it's enough to check for this.

Definitely not -- you'll notice it's both -Zunsound-mir-opts and FIXME: broken.

But there's good news! LLVM 15 merged a few days ago, bringing the fix mentioned in #85133 (comment).

With that, both of these are nops now:

#![feature(try_blocks)]

pub fn result_nop_match(x: Result<i32, u32>) -> Result<i32, u32> {
    match x {
        Ok(x) => Ok(x),
        Err(x) => Err(x),
    }
}

pub fn result_nop_traits(x: Result<i32, u32>) -> Result<i32, u32> {
    try {
        x?
    }
}

https://rust.godbolt.org/z/71dYnrMf6

example::result_nop_match:
        mov     rax, rdi
        ret

example::result_nop_traits:
        mov     rax, rdi
        ret

For comparison, on 1.63 even the match version needs a bunch of code: https://rust.godbolt.org/z/oarec37sG

example::result_nop_match:
        xor     ecx, ecx
        test    edi, edi
        setne   cl
        movabs  rax, -4294967296
        and     rax, rdi
        or      rax, rcx
        ret
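
The `try` block in `result_nop_traits` needs a nightly feature gate. For trying this on a stable toolchain, the same shape can be written with `Ok(x?)`, as in the earlier examples in this thread. This is a sketch with a hypothetical function name; whether it compiles down to the same two instructions on a given release is an assumption to check in Compiler Explorer, not something shown here.

```rust
// Stable-Rust counterpart of `result_nop_traits`: unwrap with `?`, rewrap with `Ok`.
pub fn result_nop_question(x: Result<i32, u32>) -> Result<i32, u32> {
    Ok(x?)
}

// The explicit match from above, repeated so the two can be compared side by side.
pub fn result_nop_match(x: Result<i32, u32>) -> Result<i32, u32> {
    match x {
        Ok(x) => Ok(x),
        Err(x) => Err(x),
    }
}
```

Both are the identity on `Result<i32, u32>`, so they make a convenient pair to paste into a release-mode build when checking a particular compiler version.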

EDIT: I opened #100693 to have a codegen test here.

JohnTitor · 2022-12-21T10:45:01Z

Triage: #100693 added the test for this case (thank you @scottmcm!), closing as fixed.

maciejhirsz mentioned this issue Nov 28, 2016

use questionmark operator instead of try! ratel-rust/ratel-core#48

Closed

dotdash added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Dec 8, 2016

bluss mentioned this issue Dec 11, 2016

suggest ? over try! rust-lang/rust-clippy#1361

Closed

This was referenced Dec 12, 2016

WIP Switch x? desugaring to use QuestionMark trait and Try enum #38301

Closed

Identity Result match mapping should optimize to an identity function #38349

Open

petrochenkov mentioned this issue Feb 23, 2017

extend ? to operate over other types rust-lang/rfcs#1859

Merged

scottlamb added a commit to scottlamb/moonfire-nvr that referenced this issue Feb 27, 2017

improve build_index performance by 5-10%

15609dd

I just switched a couple inner loop ?s back to try!(...) to work around rust-lang/rust#37939

scottmcm mentioned this issue May 28, 2017

Lower ? to Try instead of Carrier #42275

Merged

philipc mentioned this issue Jun 13, 2017

Run rustfmt 0.8.6 gimli-rs/gimli#205

Merged

yrashk mentioned this issue Jul 4, 2017

Problem: instructions handling macros are obscure PumpkinDB/PumpkinDB#321

Merged

BusyJay mentioned this issue Jul 17, 2017

*: Add grpc interfaces used by transaction debugger tikv/tikv#2012

Merged

1 task

Mark-Simulacrum added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jul 26, 2017

Mark-Simulacrum added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Sep 29, 2017

matthewkmayer mentioned this issue Jan 3, 2018

Ensure error responses from Ceph are parsed properly rusoto/rusoto#892

Merged

dtolnay mentioned this issue Mar 20, 2018

Support for Flattening serde-rs/serde#1179

Merged

8 tasks

dtolnay mentioned this issue Apr 24, 2018

Replace try!() with ? racer-rust/racer#843

Merged

kennytm added the WG-llvm Working group: LLVM backend code generation label Apr 24, 2018

kennytm added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Aug 31, 2018

nikic added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Dec 1, 2018

This was referenced May 6, 2019

convert custom try macro to ? #60581

Merged

t! macro can be replaced by '?' operator #60580

Closed

ghost mentioned this issue May 21, 2019

Deprecate try! macro in favor of ? #61000

Closed

mominul mentioned this issue Jul 14, 2019

The essence of lexer #59706

Merged

matklad mentioned this issue Jul 15, 2019

Deprecate try! macro #62672

Merged

ordian mentioned this issue Aug 5, 2019

Change the return type of step_inner function. openethereum/parity-ethereum#10940

Merged

38 mentioned this issue Oct 17, 2019

[BUG] Investigate the performance overhead of result types plotters-rs/plotters#58

Closed

ghost mentioned this issue Nov 10, 2019

Optimize out nop-matches #66234

Open

matklad mentioned this issue Jul 5, 2021

refactor(types): change account id to newtype and deprecate ValidAccountId near/near-sdk-rs#448

Merged

This was referenced Aug 17, 2022

Add codegen tests for identity matching results #100692

Closed

Result<u32, u32> uses less efficient ABI than Result<i32, u32> #100698

Open

toyboot4e mentioned this issue Sep 27, 2022

Remove ? operators from proc-macro generated code FyroxEngine/Fyrox#369

Merged

jonathanpwang mentioned this issue Nov 20, 2022

[WIP] Faster create_proof halo2-ce/halo2#12

Closed

JohnTitor closed this as completed Dec 21, 2022

Lucretiel mentioned this issue Jan 21, 2023

Use ? instead of try! internally and in generated code serde-rs/serde#2366

Open

c410-f3r mentioned this issue Sep 3, 2023

match-then-remake Result doesn't optimize away for some payload widths #101210

Open
